We built a computer science text corpus and search engine called X-Tec. Over 678 hours, we automatically collected 2.98 million sentences (68.9 million words) from carefully chosen English computer science documents on the Web. We also built an interactive sample sentence query system and an automatic expression diagnostic system for graduate students. X-Tec can also be used for knowledge search and word co-occurrence frequency retrieval.