In this paper, we systematically assess the value of using web-scale N-gram data in state-of-the-art supervised NLP classifiers. We compare classifiers that include or exclude features for the counts of various N-grams, where the counts are obtained from a web-scale auxiliary corpus. We show that including N-gram count features can advance the state-of-the-art accuracy on standard data sets for adjective ordering, spelling correction, noun compound bracketing, and verb part-of-speech disambiguation. More importantly, when operating on new domains, or when labeled training data is not plentiful, we show that using web-scale N-gram features is essential for achieving robust performance.
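To make the idea of N-gram count features concrete, here is a minimal sketch for the adjective-ordering task. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the exact feature form, so the log(count + 1) transform, the `TOY_COUNTS` table, and the function names `log_count` and `ordering_features` are all hypothetical. In practice the counts would come from a web-scale auxiliary corpus, and the resulting features would be passed to a supervised classifier alongside its other features.

```python
import math

# Hypothetical toy counts standing in for a web-scale N-gram corpus.
# These numbers are invented for illustration only.
TOY_COUNTS = {
    ("big", "red"): 12000,
    ("red", "big"): 300,
}


def log_count(ngram, counts):
    """Return log(count + 1), so an unseen N-gram maps to 0.0."""
    return math.log(counts.get(tuple(ngram), 0) + 1)


def ordering_features(adj1, adj2, counts):
    """Build count features for deciding whether adj1 should precede adj2.

    Assumption: the log transform and the ratio feature are one plausible
    encoding of N-gram counts, not necessarily the one used in the paper.
    """
    forward = log_count((adj1, adj2), counts)
    backward = log_count((adj2, adj1), counts)
    return {
        "log_count_forward": forward,
        "log_count_backward": backward,
        "log_ratio": forward - backward,
    }


features = ordering_features("big", "red", TOY_COUNTS)
unseen = ordering_features("green", "tiny", TOY_COUNTS)
```

A positive `log_ratio` suggests the order adj1 adj2 is more frequent in the count source. For a pair that never occurs in the counts, all three features are 0.0, so a classifier would have to rely on its other features for that pair.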