Query refinement based on topical term clustering

Authors:
Hiromi Wakaki;Tomonari Masada;Atsuhiro Takasu;Jun Adachi
Affiliations:
The University of Tokyo, Tokyo, Japan;The National Institute of Informatics, Tokyo, Japan;The National Institute of Informatics, Tokyo, Japan;The National Institute of Informatics, Tokyo, Japan
Venue:
Large Scale Semantic Access to Content (Text, Image, Video, and Sound)
Year:
2007

Abstract

We propose a method for supporting query refinement using topical term clusters. Because a document set retrieved with an ambiguous query may include divergent topics, we first propose a new term weighting method that extracts terms strongly related to a specific topic. Our formulation of term weighting is based on term co-occurrence statistics. We then generate term clusters from the extracted terms and rerank the documents in the search results by using each term cluster as a query. The clustering procedure is intended to isolate each topic as a set of related terms. In our experiments, we evaluated the term weighting method by checking 1) whether each of the top-ranked document sets corresponds to one topic, and 2) whether some of the top-ranked document sets cover all the topics included in the synthesized document set. The results show that our method outperforms the existing term weighting methods MI, KLD, chi-square, and RSV.
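
The pipeline described in the abstract (co-occurrence statistics, then term clusters, then reranking with each cluster as a query) can be illustrated with a minimal Python sketch. The abstract does not give the paper's weighting formula or clustering algorithm, so everything here is an assumption made for illustration: raw document-level co-occurrence counts stand in for the term weighting, connected components over a `min_cooc` threshold stand in for the clustering, and a simple matched-term count stands in for the ranking function. The names `cooccurrence`, `cluster_terms`, `rerank`, and `min_cooc` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs):
    """For each unordered term pair, count the documents containing both terms."""
    pair_df = Counter()
    for doc in docs:
        for a, b in combinations(sorted(set(doc)), 2):
            pair_df[(a, b)] += 1
    return pair_df


def cluster_terms(terms, pair_df, min_cooc=2):
    """Group terms into connected components of the co-occurrence graph.

    Assumption: two terms are linked when they co-occur in at least
    min_cooc documents. This is a stand-in, not the paper's clustering.
    """
    parent = {t: t for t in terms}

    def find(t):
        # Union-find lookup with path halving.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for (a, b), n in pair_df.items():
        if n >= min_cooc and a in parent and b in parent:
            parent[find(a)] = find(b)

    groups = {}
    for t in terms:
        groups.setdefault(find(t), []).append(t)
    # Sort terms within clusters, and clusters by first term, for stable output.
    return sorted(sorted(g) for g in groups.values())


def rerank(docs, cluster):
    """Rank document indices by how many tokens match the term cluster."""
    cluster = set(cluster)
    scores = [(sum(1 for t in doc if t in cluster), i) for i, doc in enumerate(docs)]
    return [i for _, i in sorted(scores, key=lambda x: (-x[0], x[1]))]


# Toy result set for the ambiguous query "jaguar" (invented data).
DOCS = [
    ["jaguar", "car", "engine"],
    ["jaguar", "cat", "jungle"],
    ["jaguar", "engine", "car", "dealer"],
    ["jaguar", "jungle", "cat", "prey"],
]
# The query term itself is excluded because it co-occurs with every term.
CLUSTERS = cluster_terms(["car", "engine", "cat", "jungle"], cooccurrence(DOCS))
RANKINGS = [rerank(DOCS, c) for c in CLUSTERS]
```

On this toy data the two clusters separate the automobile terms from the animal terms, and each reranking moves the documents of one topic to the top. A real system would replace the raw counts with a proper term weight and would need a clustering step that tolerates noisy co-occurrences.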