Standard entity clustering systems commonly rely on mention (string) matching, syntactic features, and linguistic resources like English WordNet. When co-referent text mentions appear in different languages, these techniques cannot be easily applied. Consequently, we develop new methods for clustering text mentions across documents and languages simultaneously, producing cross-lingual entity clusters. Our approach extends standard clustering algorithms with cross-lingual mention and context similarity measures. Crucially, we do not assume a pre-existing entity list (knowledge base), so entity characteristics are unknown. On an Arabic-English corpus that contains seven different text genres, our best model yields a 24.3% F1 gain over the baseline.
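To make the idea of combining mention and context similarity concrete, here is a minimal sketch. It is not the model described above. It assumes that mentions have already been romanized into a common script and that context words have been mapped into a shared vocabulary, and it swaps in simple stand-ins: character-bigram Jaccard overlap for mention similarity, bag-of-words cosine for context similarity, and threshold-based single-link clustering. The function names, the weight `alpha`, and the `threshold` value are all hypothetical. Because clusters are formed by thresholding pairwise similarity, no entity list or cluster count is needed in advance.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=2):
    """Set of character n-grams of a lowercased, boundary-padded string."""
    padded = f"#{text.lower()}#"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def mention_sim(a, b):
    """Jaccard overlap of character bigrams (assumes romanized mentions)."""
    x, y = char_ngrams(a), char_ngrams(b)
    return len(x & y) / len(x | y)


def context_sim(a, b):
    """Cosine similarity of bag-of-words contexts (assumes a shared vocabulary)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def similarity(m1, m2, alpha=0.5):
    """Weighted mix of mention and context similarity; alpha is a hypothetical weight."""
    return (alpha * mention_sim(m1["name"], m2["name"])
            + (1 - alpha) * context_sim(m1["context"], m2["context"]))


def cluster(mentions, threshold=0.4):
    """Single-link clustering: link every pair whose similarity meets the threshold."""
    parent = list(range(len(mentions)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(mentions)):
        for j in range(i + 1, len(mentions)):
            if similarity(mentions[i], mentions[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(mentions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Each mention is a dictionary with a `name` string and a `context` list of words; `cluster` returns lists of mention indices, one list per inferred entity. The pairwise loop is quadratic in the number of mentions, so this form is only suitable for small illustrative inputs.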