Document classification based on web search hit counts

Authors:
Masaya Kaneko;Shusuke Okamoto;Masaki Kohana;You Inayoshi
Affiliations:
Seikei University, Tokyo, Japan;Seikei University, Tokyo, Japan;Seikei University, Tokyo, Japan;Seikei University, Tokyo, Japan
Venue:
Proceedings of the 14th International Conference on Information Integration and Web-based Applications & Services
Year:
2012

Abstract

This paper describes a web mining method for automatically classifying research documents. Each document vector is built from the web hit counts returned by AND searches on pairs of words. The target documents are then classified using the result of k-means clustering, with the distance between vectors computed from cosine similarity.
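
The pipeline in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how word pairs are chosen or how hit counts are obtained, so everything concrete here is an assumption: the `HIT_COUNTS` table holds made-up numbers in place of real search-engine queries, each document is reduced to a single keyword, `AXIS_WORDS` is a hypothetical set of reference words, and the centroid update is a plain mean rather than anything the paper confirms.

```python
import math

# Hypothetical hit counts for AND-searches on two words. A real system would
# query a search engine; these numbers are invented for illustration only.
HIT_COUNTS = {
    ("clustering", "algorithm"): 900, ("clustering", "protein"): 40,
    ("sorting", "algorithm"): 800, ("sorting", "protein"): 10,
    ("enzyme", "algorithm"): 30, ("enzyme", "protein"): 950,
    ("genome", "algorithm"): 60, ("genome", "protein"): 700,
}

# Assumed reference words; each one becomes a dimension of the document vector.
AXIS_WORDS = ["algorithm", "protein"]


def document_vector(keyword):
    """Build a vector whose i-th entry is the hit count of `keyword AND axis_i`."""
    return [float(HIT_COUNTS.get((keyword, axis), 0)) for axis in AXIS_WORDS]


def cosine_distance(u, v):
    """Return 1 - cosine similarity; zero vectors are treated as maximally distant."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def kmeans(vectors, k, iterations=20):
    """k-means with cosine distance; centroids start at the first k vectors."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to the nearest centroid.
        labels = [
            min(range(k), key=lambda c: cosine_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


keywords = ["clustering", "enzyme", "sorting", "genome"]
vectors = [document_vector(w) for w in keywords]
labels = kmeans(vectors, k=2)
print(dict(zip(keywords, labels)))
```

Because cosine distance ignores vector length, two keywords with very different overall hit counts still fall into the same cluster when their counts are distributed similarly across the reference words.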