Named Entity Recognition (NER) is an important part of many natural language processing tasks. Current approaches often employ machine learning techniques and require supervised data. However, many languages lack such resources. This paper presents an (almost) unsupervised learning algorithm for automatic discovery of Named Entities (NEs) in a resource-free language, given a bilingual corpus in which it is weakly temporally aligned with a resource-rich language. NEs have similar time distributions across such corpora, and often some of the tokens in a multi-word NE are transliterated. We develop an algorithm that exploits both observations iteratively. The algorithm makes use of a new, frequency-based metric for time distributions and a resource-free discriminative approach to transliteration. Seeded with a small number of transliteration pairs, our algorithm discovers multi-word NEs and takes advantage of a dictionary (if one exists) to account for translated or partially translated NEs. We evaluate the algorithm on an English-Russian corpus and show a high level of NE discovery in Russian.
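The observation that NEs have similar time distributions across weakly aligned corpora can be illustrated with a small sketch. The abstract does not specify the frequency-based metric, so the code below substitutes plain cosine similarity over normalized per-period counts as a stand-in. The function names (`normalize`, `temporal_similarity`, `rank_candidates`) and the toy counts are hypothetical and are not taken from the paper.

```python
from math import sqrt


def normalize(counts):
    """Turn raw per-period occurrence counts into relative frequencies."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


def temporal_similarity(counts_a, counts_b):
    """Cosine similarity of two time distributions.

    Assumption: this is a stand-in metric for illustration only,
    not the frequency-based metric the paper introduces.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both series must cover the same time periods")
    a = normalize(counts_a)
    b = normalize(counts_b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(source_counts, candidates):
    """Rank target-language candidates by temporal similarity to a source NE.

    `candidates` maps a candidate string to its per-period counts.
    """
    scored = [
        (name, temporal_similarity(source_counts, counts))
        for name, counts in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: counts per time period for one source NE and two candidates.
source = [0, 10, 2, 0]
candidates = {"cand_a": [0, 5, 1, 0], "cand_b": [3, 0, 0, 3]}
ranking = rank_candidates(source, candidates)
```

In this toy example the candidate whose mentions peak in the same periods as the source NE ranks first. In the setting the abstract describes, such a temporal score would be combined with a transliteration model and refined iteratively rather than used on its own.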