Word clustering is important for automatic thesaurus construction, text classification, and word sense disambiguation. Several recent studies have used the web as a corpus. This paper proposes an unsupervised word clustering algorithm based on a word similarity measure computed from web counts. Each pair of words is submitted as a query to a search engine, and the resulting counts form a co-occurrence matrix. Calculating the similarity of each word pair from this matrix yields a word co-occurrence graph. A recent graph clustering algorithm, Newman clustering, is then applied to identify word clusters efficiently. The method is evaluated on two sets of word groups derived from a web directory and WordNet.
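
The pipeline described above (web counts, pairwise similarity, co-occurrence graph, modularity-based clustering) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the hit counts are invented rather than retrieved from a search engine, the Jaccard coefficient stands in for whichever similarity measure the paper uses, the edge threshold is arbitrary, and the clustering routine is a simple greedy modularity agglomeration in the spirit of Newman's method rather than the authors' implementation.

```python
from itertools import combinations

# Hypothetical hit counts (invented for illustration, not real search results).
single_hits = {"printer": 1000, "scanner": 800, "monitor": 900,
               "apple": 1200, "banana": 700, "cherry": 600}
pair_hits = {
    ("printer", "scanner"): 400, ("printer", "monitor"): 300,
    ("scanner", "monitor"): 280, ("apple", "banana"): 350,
    ("apple", "cherry"): 300, ("banana", "cherry"): 260,
    ("monitor", "apple"): 150,
}

def jaccard(w1, w2):
    """Assumed similarity measure: co-occurrence count over union count."""
    co = pair_hits.get((w1, w2), pair_hits.get((w2, w1), 0))
    union = single_hits[w1] + single_hits[w2] - co
    return co / union if union else 0.0

def build_graph(words, threshold):
    """Connect two words when their similarity reaches the threshold."""
    adj = {w: set() for w in words}
    for u, v in combinations(words, 2):
        if jaccard(u, v) >= threshold:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def modularity(adj, communities):
    """Q = sum over communities of (internal edge-end fraction - degree fraction squared)."""
    two_m = sum(len(nb) for nb in adj.values())  # twice the number of edges
    if two_m == 0:
        return 0.0
    q = 0.0
    for com in communities:
        # Each internal edge is counted once from each endpoint.
        internal_ends = sum(1 for u in com for v in adj[u] if v in com)
        degree = sum(len(adj[u]) for u in com)
        q += internal_ends / two_m - (degree / two_m) ** 2
    return q

def greedy_modularity_clusters(adj):
    """Merge the pair of communities that most increases Q until no merge helps."""
    communities = [frozenset([w]) for w in adj]
    best_q = modularity(adj, communities)
    while len(communities) > 1:
        candidate = None
        for a, b in combinations(communities, 2):
            merged = [c for c in communities if c not in (a, b)] + [a | b]
            q = modularity(adj, merged)
            if q > best_q + 1e-12:
                best_q, candidate = q, merged
        if candidate is None:  # every possible merge would lower Q
            break
        communities = candidate
    return communities, best_q

graph = build_graph(list(single_hits), threshold=0.05)
clusters, q = greedy_modularity_clusters(graph)
for cluster in clusters:
    print(sorted(cluster))
print(round(q, 3))
```

This version recomputes modularity from scratch for every candidate merge, which is only practical for small word sets; Newman's algorithm obtains the same greedy choice far more cheaply by tracking the change in modularity incrementally for each pair of connected communities.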