Using the web to obtain frequencies for unseen bigrams
Computational Linguistics - Special issue on web as corpus
This paper shows that the web can be employed to obtain frequencies for bigrams that are unseen in a given corpus. We describe a method for retrieving counts for adjective-noun, noun-noun, and verb-object bigrams from the web by querying a search engine. We evaluate this method by demonstrating that web frequencies correlate with frequencies obtained from a carefully edited, balanced corpus. We also perform a task-based evaluation, showing that web frequencies can reliably predict human plausibility judgments.
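The comparison of web and corpus frequencies described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's actual procedure: `count_fn` is a hypothetical stand-in for whatever search-engine interface returns a hit count for a query string, the exact-phrase query format is an assumption, and the choice of Pearson correlation on log-transformed counts (with add-one smoothing to tolerate zero counts) is an illustrative choice that the abstract does not specify.

```python
import math
from typing import Callable, Dict, Iterable, Sequence, Tuple

Bigram = Tuple[str, str]


def web_counts(bigrams: Iterable[Bigram],
               count_fn: Callable[[str], int]) -> Dict[Bigram, int]:
    """Look up a count for each bigram via a caller-supplied function.

    count_fn is hypothetical: it should take a query string and return
    the number of hits reported by some search engine.
    """
    # Assumption: the bigram is sent as a quoted exact-phrase query.
    return {bg: count_fn(f'"{bg[0]} {bg[1]}"') for bg in bigrams}


def log_pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between log10(x + 1) and log10(y + 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    lx = [math.log10(x + 1) for x in xs]
    ly = [math.log10(y + 1) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    var_x = sum((a - mean_x) ** 2 for a in lx)
    var_y = sum((b - mean_y) ** 2 for b in ly)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Given a real `count_fn`, one would call `web_counts` on the bigrams of interest and pass the resulting counts, together with the corresponding corpus counts in the same order, to `log_pearson`. A value near 1 would indicate that web counts track corpus counts closely.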