Combining lexical substitutes in neural word sense induction

Nikolay Arefyev, Boris Sheludko, Alexander Panchenko

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    3 Citations (Scopus)

    Abstract

    Word Sense Induction (WSI) is the task of grouping occurrences of an ambiguous word according to their meaning. In this work, we improve the approach to WSI proposed by Amrami and Goldberg (2018), which clusters lexical substitutes that neural language models generate for an ambiguous word in a particular context. Specifically, we propose methods for combining information from the left and right context with similarity to the ambiguous word, which yield more accurate substitutes than the original approach. Our simple yet efficient improvement establishes a new state of the art on WSI datasets for two languages. We also show improvements over the original approach on a lexical substitution dataset.
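
The abstract does not state how the three signals are combined, so the following is only an illustrative sketch under stated assumptions, not the authors' formula. It assumes a product-of-experts style combination: candidate substitutes are scored by multiplying a probability conditioned on the left context, a probability conditioned on the right context, and optionally a similarity to the ambiguous word raised to a power `beta`. The function name `combine_substitutes`, its parameters, and the toy numbers are all hypothetical.

```python
import math


def combine_substitutes(p_left, p_right, sim_to_target=None, beta=1.0, top_k=3):
    """Rank candidate substitutes by combining three signals in log space.

    p_left / p_right: dicts mapping word -> positive probability given the
    left / right context (hypothetical language model outputs).
    sim_to_target: optional dict mapping word -> positive similarity to the
    ambiguous word; beta controls how strongly it is weighted.
    This is an assumed combination rule, not the one from the paper.
    """
    scores = {}
    # Only words proposed from both directions can be scored.
    for word in p_left.keys() & p_right.keys():
        log_score = math.log(p_left[word]) + math.log(p_right[word])
        if sim_to_target is not None:
            # Words without a similarity value get a small default (assumption).
            log_score += beta * math.log(sim_to_target.get(word, 1e-6))
        scores[word] = log_score

    if not scores:
        return []

    # Renormalize the combined scores into a distribution (stable softmax).
    max_log = max(scores.values())
    exp_scores = {w: math.exp(s - max_log) for w, s in scores.items()}
    total = sum(exp_scores.values())
    probs = {w: v / total for w, v in exp_scores.items()}
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Toy numbers for the target word "bank" in "He sat on the bank of the river".
p_left = {"shore": 0.2, "edge": 0.3, "money": 0.1, "chair": 0.4}
p_right = {"shore": 0.4, "edge": 0.3, "money": 0.05, "chair": 0.05, "bottom": 0.2}
similarity = {"shore": 0.8, "edge": 0.3, "money": 0.9, "chair": 0.1}

context_only = combine_substitutes(p_left, p_right)
with_similarity = combine_substitutes(p_left, p_right, sim_to_target=similarity)
```

With these toy values, the context-only ranking puts "edge" first, while adding the similarity term moves "shore" to the top. In a WSI pipeline of the kind the abstract describes, the resulting substitute lists for each occurrence would then be clustered to induce senses.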

    Original language: English
    Title of host publication: International Conference on Recent Advances in Natural Language Processing in a Deep Learning World, RANLP 2019 - Proceedings
    Editors: Galia Angelova, Ruslan Mitkov, Ivelina Nikolova, Irina Temnikova
    Publisher: Incoma Ltd
    Pages: 62-70
    Number of pages: 9
    ISBN (Electronic): 9789544520557
    Publication status: Published - 2019
    Event: 12th International Conference on Recent Advances in Natural Language Processing, RANLP 2019 - Varna, Bulgaria
    Duration: 2 Sep 2019 → 4 Sep 2019

    Publication series

    Name: International Conference Recent Advances in Natural Language Processing, RANLP
    Volume: 2019-September
    ISSN (Print): 1313-8502

    Conference

    Conference: 12th International Conference on Recent Advances in Natural Language Processing, RANLP 2019
    Country/Territory: Bulgaria
    City: Varna
    Period: 2/09/19 → 4/09/19
