Recent studies have shown the potential benefits of leveraging resources for resource-rich languages to build tools for similar but resource-poor languages. We examine what constitutes "similarity" by comparing traditional phylogenetic language groups, which are motivated largely by genetic relationships, with language groupings formed by clustering methods using typological features only. Using data from the World Atlas of Language Structures (WALS), our preliminary experiments show that typologically based clusters look quite different from genetic groups, but perform as well as or better than genetic groups when used to predict feature values of member languages.
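One way to picture the evaluation described above is a leave-one-out check: hide a feature value for one language, guess it from the majority value among the other members of its group, and compare the accuracy of two alternative groupings. The sketch below is only an illustration under stated assumptions. The languages, feature names, values, and both groupings are hypothetical toy data rather than WALS entries, and majority voting is an assumed prediction rule, not necessarily the one used in the experiments.

```python
from collections import Counter

# Hypothetical toy data: language -> {feature: value}. These are NOT real WALS values.
FEATURES = {
    "A": {"word_order": "SOV", "adposition": "post"},
    "B": {"word_order": "SOV", "adposition": "post"},
    "C": {"word_order": "SVO", "adposition": "pre"},
    "D": {"word_order": "SVO", "adposition": "pre"},
}

# Two hypothetical ways of grouping the same four languages.
GROUPING_X = {"A": "g1", "B": "g1", "C": "g2", "D": "g2"}
GROUPING_Y = {"A": "g1", "C": "g1", "B": "g2", "D": "g2"}


def predict(lang, feature, grouping, data):
    """Return the majority value of `feature` among the other members of lang's group."""
    group = grouping[lang]
    votes = Counter(
        data[other][feature]
        for other, g in grouping.items()
        if g == group and other != lang and feature in data[other]
    )
    # None when the language has no group mates with a value for this feature.
    return votes.most_common(1)[0][0] if votes else None


def accuracy(grouping, data):
    """Leave-one-out accuracy over every (language, feature) pair."""
    hits = total = 0
    for lang, feats in data.items():
        for feature, gold in feats.items():
            total += 1
            hits += predict(lang, feature, grouping, data) == gold
    return hits / total if total else 0.0


if __name__ == "__main__":
    print("grouping X:", accuracy(GROUPING_X, FEATURES))
    print("grouping Y:", accuracy(GROUPING_Y, FEATURES))
```

On this toy data the grouping that puts languages with matching values together scores higher, which is the kind of comparison the abstract reports between genetic groups and typological clusters.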