Fighting with the sparsity of synonymy dictionaries for automatic synset induction

Dmitry Ustalov, Mikhail Chernoskutov, Chris Biemann, Alexander Panchenko

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Citations (Scopus)

Abstract

Graph-based synset induction methods, such as MaxMax and Watset, induce synsets by performing a global clustering of a synonymy graph. However, such methods are sensitive to the structure of the input synonymy graph: sparseness of the input dictionary can substantially reduce the quality of the extracted synsets. In this paper, we propose two different approaches designed to alleviate the incompleteness of the input dictionaries. The first one performs a pre-processing of the graph by adding missing edges, while the second one performs a post-processing by merging similar synset clusters. We evaluate these approaches on two datasets for the Russian language and discuss their impact on the performance of synset induction methods. Finally, we perform an extensive error analysis of each approach and discuss prominent alternative methods for coping with the problem of sparsity of the synonymy dictionaries.
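
As a rough illustration of the two strategies named in the abstract, the following Python sketch adds edges between words whose embedding vectors are close (pre-processing) and merges clusters that share most of their words (post-processing). The function names, the cosine and Jaccard criteria, and both thresholds are assumptions made for this example; the abstract does not state which criteria the paper uses.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def add_missing_edges(edges, vectors, threshold=0.9):
    """Pre-processing sketch: add an edge between every pair of words
    whose vectors have cosine similarity >= threshold (assumed criterion)."""
    expanded = {frozenset(e) for e in edges}
    for u, v in combinations(sorted(vectors), 2):
        if cosine(vectors[u], vectors[v]) >= threshold:
            expanded.add(frozenset((u, v)))
    return expanded


def merge_similar_synsets(synsets, min_jaccard=0.5):
    """Post-processing sketch: greedily merge clusters whose Jaccard
    overlap is >= min_jaccard (assumed criterion) until none qualify."""
    merged = [set(s) for s in synsets]
    changed = True
    while changed:
        changed = False
        for i, j in combinations(range(len(merged)), 2):
            union = merged[i] | merged[j]
            if union and len(merged[i] & merged[j]) / len(union) >= min_jaccard:
                merged[i] = union
                del merged[j]  # j > i, so index i stays valid
                changed = True
                break  # restart the scan after every merge
    return merged


# Toy data, invented for this example only.
vectors = {"car": [1.0, 0.0], "auto": [0.99, 0.1], "tree": [0.0, 1.0]}
edges = [("car", "automobile")]
expanded = add_missing_edges(edges, vectors)

clusters = [{"car", "auto"}, {"auto", "car", "automobile"}, {"tree"}]
merged = merge_similar_synsets(clusters)
```

In this toy run, the pre-processing step links "car" and "auto", and the post-processing step folds the first two clusters into one while leaving the "tree" cluster alone.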

Original language: English
Title of host publication: Analysis of Images, Social Networks and Texts - 6th International Conference, AIST 2017, Revised Selected Papers
Editors: Andrey V. Savchenko, Dmitry I. Ignatov, Sergei O. Kuznetsov, Irina A. Lomazova, Victor Lempitsky, Michael Khachay, Natalia Loukachevitch, Amedeo Napoli, Wil M. van der Aalst, Alexander Panchenko, Panos M. Pardalos, Stanley Wasserman
Publisher: Springer Verlag
Pages: 94-105
Number of pages: 12
ISBN (Print): 9783319730127
Publication status: Published - 2018
Externally published: Yes
Event: 6th International Conference on Analysis of Images, Social Networks and Texts, AIST 2017 - Moscow, Russian Federation
Duration: 27 Jul 2017 – 29 Jul 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10716 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 6th International Conference on Analysis of Images, Social Networks and Texts, AIST 2017
Country/Territory: Russian Federation
City: Moscow
Period: 27/07/17 – 29/07/17

Keywords

  • Lexical semantics
  • Sense embeddings
  • Synonyms
  • Synset induction
  • Word embeddings
  • Word sense induction
