Organizing Web search results into labeled categories is a difficult but very useful task. The idea is to group the many results returned for a user query into well-labeled categories so that users can browse them more easily. Clustering-based methods have previously been applied to this search-result organization problem, but extracting human-readable descriptions for the resulting clusters has proved difficult. An alternative is to first generate a set of labels from the search results and then assign documents to the relevant labels to form labeled categories. In this approach, the major task is generating the labels. In this paper, we propose a novel label generation method. First, we extract phrases from the search results as label candidates and use a binary classifier as our learning model to classify each candidate as either a useful or a meaningless label. The candidates classified as useful then form the final results. Because our method is applied to search results retrieved from a domain-specific corpus rather than a general corpus, the labels have some special features that can be used for classification. Experimental results show that the accuracy of our system is nearly 10% higher than that of label selection using the mutual information criterion, an unsupervised method for this problem.
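To make the label-then-assign idea concrete, the following is a minimal sketch of an unsupervised, mutual-information-style label selector of the kind the abstract uses as its baseline. It is not the authors' classifier, and the abstract does not specify how the mutual information criterion was computed. The choice of adjacent-word bigrams as candidates, pointwise mutual information as the score, the `min_count` and `top_k` parameters, and all function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def count_ngrams(snippets):
    """Count unigrams and adjacent-word bigrams over all snippets."""
    unigrams, bigrams = Counter(), Counter()
    for snippet in snippets:
        tokens = tokenize(snippet)
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def pmi(bigram, unigrams, bigrams):
    """Pointwise mutual information of a bigram, in bits (assumed scoring)."""
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    p_xy = bigrams[bigram] / n_bi
    p_x = unigrams[bigram[0]] / n_uni
    p_y = unigrams[bigram[1]] / n_uni
    return math.log2(p_xy / (p_x * p_y))


def select_labels(snippets, min_count=2, top_k=3):
    """Keep bigrams seen at least min_count times and rank them by PMI."""
    unigrams, bigrams = count_ngrams(snippets)
    scored = [
        (pmi(bigram, unigrams, bigrams), " ".join(bigram))
        for bigram, count in bigrams.items()
        if count >= min_count
    ]
    # Highest PMI first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [label for _, label in scored[:top_k]]


def assign_documents(snippets, labels):
    """Map each label to the indices of snippets whose text contains it.

    Matching is a plain substring test on the normalized text, so it can
    also match inside longer words; that is acceptable for a sketch.
    """
    normalized = [" ".join(tokenize(s)) for s in snippets]
    return {
        label: [i for i, text in enumerate(normalized) if label in text]
        for label in labels
    }
```

Calling `select_labels` on a list of result snippets and passing its output to `assign_documents` yields labeled categories. In the supervised setting the abstract describes, the PMI ranking step would be replaced by a binary classifier trained to separate useful candidates from meaningless ones.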