КЛАССИФИКАЦИЯ РИТОРИЧЕСКИХ ОТНОШЕНИЙ ДЛЯ ДИСКУРСИВНОГО АНАЛИЗА ТЕКСТОВ НА РУССКОМ ЯЗЫКЕ

Translated title of the contribution: Classification models for RST discourse parsing of texts in Russian

E. V. Chistova, A. O. Shelmanov, M. V. Kobozeva, D. B. Pisarevskaya, I. V. Smirnov, S. Yu Toldova

    Research output: Contribution to journal › Conference article › peer-review

    6 Citations (Scopus)

    Abstract

    The paper considers the task of automatic discourse parsing of texts in Russian. Discourse parsing is a well-known approach to capturing text semantics across sentence boundaries. Discourse annotation has proved useful for various tasks, including summarization, sentiment analysis, and question answering. The recent release of the manually annotated Ru-RSTreebank corpus made it possible to apply supervised machine learning techniques to build such parsers for the Russian language. The corpus provides discourse annotation in a widely adopted formalisation, Rhetorical Structure Theory (RST). In this work, we develop feature sets for rhetorical relation classification in Russian-language texts, investigate the importance of various types of features, and report the results of the first experimental evaluation of machine learning models trained on the Ru-RSTreebank corpus. We consider various machine learning methods, including gradient boosting, a neural network, and an ensemble of several models combined by soft voting.

    Original language: Russian
    Pages (from-to): 163-176
    Number of pages: 14
    Journal: Komp'juternaja Lingvistika i Intellektual'nye Tehnologii
    Volume: 2019-May
    Issue number: 18
    Publication status: Published - 2019
    Event: 2019 Annual International Conference on Computational Linguistics and Intellectual Technologies, Dialogue 2019 - Moscow, Russian Federation
    Duration: 29 May 2019 to 1 Jun 2019
