Machine Learning
A family of additive online algorithms for category ranking
The Journal of Machine Learning Research
Entity-based cross-document coreferencing using the Vector Space Model
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Information Extraction Tools: Deciphering Human Language
IT Professional
Identifying anaphoric and non-anaphoric noun phrases to improve coreference resolution
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Person resolution in person search results: WebHawk
Proceedings of the 14th ACM international conference on Information and knowledge management
Unsupervised personal name disambiguation
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Online Passive-Aggressive Algorithms
The Journal of Machine Learning Research
On updates that constrain the features' connections during learning
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Open information extraction from the web
Communications of the ACM - Surviving the data deluge
Disambiguating authors in academic publications using random forests
Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries
Who is who and what is what: experiments in cross-document co-reference
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
The SemEval-2007 WePS evaluation: establishing a benchmark for the web people search task
SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
Named entity disambiguation by leveraging wikipedia semantic knowledge
Proceedings of the 18th ACM conference on Information and knowledge management
Profile based cross-document coreference using kernelized fuzzy relational clustering
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Efficient name disambiguation for large-scale databases
PKDD'06 Proceedings of the 10th European conference on Principle and Practice of Knowledge Discovery in Databases
Hi-index | 0.00 |
Cross Document Coreference (CDC) is the task of constructing the coreference chain for mentions of a person across a set of documents. This work offers a holistic view of using document-level categories, sub-document level context and extracted entities and relations for the CDC task. We train a categorization component with an efficient flat algorithm using thousands of ODP categories and over a million web documents. We propose to use ranked categories as coreference information, particularly suitable for web documents that are widely different in style and content. An ensemble composite coreference function, amenable to inactive features, combines these three levels of evidence for disambiguation. A thorough feature importance study is conducted to analyze how these three components contribute to the coreference results. The overall solution is evaluated using the WePS benchmark data and demonstrate superior performance.