TAXI at SemEval-2016 task 13: A taxonomy induction method based on lexico-syntactic patterns, substrings and focused crawling

Alexander Panchenko, Stefano Faralli, Eugen Ruppert, Steffen Remus, Hubert Naets, Cédrick Fairon, Simone Paolo Ponzetto, Chris Biemann

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

54 Citations (SciVal)

Abstract

We present a system for taxonomy construction that took first place in all subtasks of the SemEval 2016 challenge on Taxonomy Extraction Evaluation. Our simple yet effective approach harvests hypernyms with substring inclusion and Hearst-style lexico-syntactic patterns from domain-specific texts obtained via language-model-based focused crawling. Extracted taxonomies are evaluated on English, Dutch, French and Italian for three domains each (Food, Environment and Science). Evaluations against a gold standard and by human judgment show that our method outperforms more complex and knowledge-rich approaches on most domains and languages. Furthermore, only a small amount of manual labour is needed to adapt the method to a new domain or language.
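
The two extraction signals named in the abstract, substring inclusion and Hearst-style patterns, can be illustrated with a minimal Python sketch. This is not the TAXI implementation: the function names `substring_hypernyms` and `hearst_such_as`, the word-level suffix rule, and the single "such as" pattern restricted to one-word terms are all assumptions made for illustration.

```python
import re


def substring_hypernyms(terms):
    """Return (hyponym, hypernym) pairs from substring inclusion.

    Assumption: a term is treated as a hyponym of another vocabulary term
    when the latter is a proper word-level suffix of it, for example
    "computer science" -> "science".
    """
    vocab = set(terms)
    pairs = set()
    for term in vocab:
        words = term.split()
        # Try every proper suffix made of whole words.
        for i in range(1, len(words)):
            head = " ".join(words[i:])
            if head in vocab:
                pairs.add((term, head))
    return pairs


# Assumption: one Hearst-style pattern, "X such as A, B and C",
# limited to single-word terms to keep the sketch short.
SUCH_AS = re.compile(
    r"(\w+) such as "
    r"(\w+(?:, (?!and\b|or\b)\w+)*(?:,? (?:and|or) \w+)?)"
)
SEPARATOR = re.compile(r",?\s+(?:and|or)\s+|,\s*")


def hearst_such_as(sentence):
    """Return (hyponym, hypernym) pairs matched by the "such as" pattern."""
    pairs = set()
    for match in SUCH_AS.finditer(sentence.lower()):
        hypernym = match.group(1)
        for hyponym in SEPARATOR.split(match.group(2)):
            if hyponym:
                pairs.add((hyponym, hypernym))
    return pairs


if __name__ == "__main__":
    print(sorted(substring_hypernyms(["science", "computer science", "seafood", "food"])))
    print(sorted(hearst_such_as("Fruits such as apples, pears and plums are sweet.")))
```

The pattern-based pairs keep the surface forms found in the text (here the plural "fruits"), so a real pipeline would need normalisation such as lemmatisation before merging them with substring-based pairs.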

Original language: English
Title of host publication: SemEval 2016 - 10th International Workshop on Semantic Evaluation, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 1320-1327
Number of pages: 8
ISBN (Electronic): 9781941643952
Publication status: Published - 2016
Externally published: Yes
Event: 10th International Workshop on Semantic Evaluation, SemEval 2016 - San Diego, United States
Duration: 16 Jun 2016 to 17 Jun 2016

Publication series

Name: SemEval 2016 - 10th International Workshop on Semantic Evaluation, Proceedings

Conference

Conference: 10th International Workshop on Semantic Evaluation, SemEval 2016
Country/Territory: United States
City: San Diego
Period: 16/06/16 to 17/06/16
