Towards a universal wordnet by learning from combined evidence

Authors:
Gerard de Melo;Gerhard Weikum
Affiliations:
Max-Planck-Institut für Informatik, Saarbrücken, Germany;Max-Planck-Institut für Informatik, Saarbrücken, Germany
Venue:
Proceedings of the 18th ACM conference on Information and knowledge management
Year:
2009

Abstract

Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification.