Semantic role labeling with pretrained language models for known and unknown predicates

Daniil Larionov, Elena Chistova, Artem Shelmanov, Ivan Smirnov

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    12 Citations (Scopus)

    Abstract

    We build the first full pipeline for semantic role labeling of Russian texts. The pipeline implements predicate identification, argument extraction, argument classification (labeling), and global scoring via integer linear programming. We train supervised neural network models for argument classification using FrameBank, a semantically annotated corpus of Russian. However, we note that this resource provides annotations for only a very limited set of predicates. We address this annotation scarcity by introducing two models that rely on different sets of features: one for “known” predicates that are present in the training set and one for “unknown” predicates that are not. We show that the model for “unknown” predicates can alleviate the lack of annotation by using pretrained embeddings. We perform experiments with various types of embeddings: the shallow word2vec and FastText embeddings, as well as ELMo and BERT embeddings generated by deep pretrained language models. We show that embeddings generated by deep pretrained language models are superior to classical shallow embeddings for argument classification of both “known” and “unknown” predicates.

    Original language: English
    Title of host publication: International Conference on Recent Advances in Natural Language Processing in a Deep Learning World, RANLP 2019 - Proceedings
    Editors: Galia Angelova, Ruslan Mitkov, Ivelina Nikolova, Irina Temnikova
    Publisher: Incoma Ltd
    Pages: 619-628
    Number of pages: 10
    ISBN (Electronic): 9789544520557
    Publication status: Published - 2019
    Event: 12th International Conference on Recent Advances in Natural Language Processing, RANLP 2019 - Varna, Bulgaria
    Duration: 2 Sep 2019 to 4 Sep 2019

    Publication series

    Name: International Conference Recent Advances in Natural Language Processing, RANLP
    Volume: 2019-September
    ISSN (Print): 1313-8502

    Conference

    Conference: 12th International Conference on Recent Advances in Natural Language Processing, RANLP 2019
    Country/Territory: Bulgaria
    City: Varna
    Period: 2/09/19 to 4/09/19
