We present an unsupervised system that exploits linguistic knowledge resources, namely English and German lexical databases and the World Wide Web, to identify English inclusions in German text. We describe experiments with this system and the corpus that was developed for this task. We report our system's classification results and compare them to the performance of a trained machine learner in a series of in-domain and cross-domain experiments.
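The lexicon-lookup core of such a system can be sketched as follows. This is a minimal illustration, not the authors' implementation: the tiny inline word sets stand in for the English and German lexical databases, and the Web-frequency fallback that the abstract mentions is only indicated by a comment.

```python
# Hypothetical stand-ins for the English and German lexical databases
# used by the system (illustrative only, not the actual resources).
ENGLISH_LEXICON = {"computer", "meeting", "update", "software"}
GERMAN_LEXICON = {"der", "die", "das", "und", "wir", "haben", "ein"}

def classify_token(token: str) -> str:
    """Label a token as 'english', 'german', or 'unknown' by lexicon lookup."""
    lower = token.lower()
    in_en = lower in ENGLISH_LEXICON
    in_de = lower in GERMAN_LEXICON
    if in_en and not in_de:
        return "english"
    if in_de and not in_en:
        return "german"
    # Ambiguous or out-of-vocabulary tokens would be resolved by the
    # Web-based step in the actual system; here we just mark them.
    return "unknown"

def find_english_inclusions(sentence: str) -> list[str]:
    """Return the tokens of a German sentence classified as English."""
    return [t for t in sentence.split() if classify_token(t) == "english"]
```

For example, `find_english_inclusions("wir haben ein Meeting und ein Update")` returns `["Meeting", "Update"]` under these toy lexicons.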