Ambiguity of entity mentions and concept references is a challenge to mining text beyond surface-level keywords. We describe an effective method for disambiguating surface forms and resolving them to Wikipedia entities and concepts. Our method employs an extensive set of features mined from Wikipedia and other large data sources, and combines these features using a machine learning approach trained on automatically generated data. On a manually labeled evaluation set of over 1,000 news articles, our resolution model achieves 85% precision and 87.8% recall, significantly outperforming three baselines based on traditional context similarity or sense commonness measures. Our method can be applied to other languages and scales well to new entities and concepts.
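To make the two baseline signals concrete, the sketch below ranks Wikipedia candidates for a surface form by a weighted mix of sense commonness (how often the form links to each candidate in anchor text) and context similarity (cosine overlap between the mention's context and the candidate's article text). The anchor counts, context profiles, and fixed mixing weight are hypothetical illustrations; the paper's actual model combines a much larger feature set with a learned combination rather than a hand-set weight.

```python
# Minimal sketch of candidate ranking for surface-form resolution.
# The anchor statistics and context profiles below are hypothetical stand-ins
# for features that would be mined from a Wikipedia dump.
import math
from collections import Counter

# Hypothetical anchor-text statistics: surface form -> {candidate entity: link count}.
ANCHOR_COUNTS = {
    "clinton": {"Bill_Clinton": 900, "Hillary_Clinton": 700, "Clinton,_Iowa": 40},
}

# Hypothetical context profiles for candidates (bag-of-words from their articles).
ENTITY_CONTEXTS = {
    "Bill_Clinton": Counter(["president", "arkansas", "1990s", "impeachment"]),
    "Hillary_Clinton": Counter(["senator", "secretary", "state", "president"]),
    "Clinton,_Iowa": Counter(["county", "river", "iowa", "city"]),
}

def commonness(surface, entity):
    """Sense commonness: fraction of anchors with this surface form linking to the entity."""
    counts = ANCHOR_COUNTS.get(surface.lower(), {})
    total = sum(counts.values())
    return counts.get(entity, 0) / total if total else 0.0

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_candidates(surface, context_words, weight=0.5):
    """Score candidates by a weighted mix of commonness and context similarity.
    The fixed weight is illustrative; the described method learns the combination."""
    context = Counter(context_words)
    scored = []
    for entity in ANCHOR_COUNTS.get(surface.lower(), {}):
        score = (weight * commonness(surface, entity)
                 + (1 - weight) * cosine(context, ENTITY_CONTEXTS.get(entity, Counter())))
        scored.append((entity, score))
    return sorted(scored, key=lambda x: x[1], reverse=True)

if __name__ == "__main__":
    mention_context = "the former secretary of state and senator spoke today".split()
    print(rank_candidates("Clinton", mention_context))
```

Run on the sample context, the commonness prior alone would favor Bill_Clinton, while the context-similarity term pulls the ranking toward Hillary_Clinton; the learned combination in the described method is what adjudicates between such conflicting signals.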