Agglomerative clustering of a search engine query log
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Clustering user queries of a search engine
Proceedings of the 10th international conference on World Wide Web
Subject categorization of query terms for exploring Web users' search interests
Journal of the American Society for Information Science and Technology
Enriching web taxonomies through subject categorization of query terms from search engine logs
Decision Support Systems - Web retrieval and mining
Towards Automatic Generation of Query Taxonomy: A Hierarchical Query Clustering Approach
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Query taxonomy generation for web search
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Extracting semantic relations from query logs
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Graphs from Search Engine Queries
SOFSEM '07 Proceedings of the 33rd conference on Current Trends in Theory and Practice of Computer Science
Query recommendation using query logs in search engines
EDBT'04 Proceedings of the 2004 international conference on Current Trends in Database Technology
Applications of web query mining
ECIR'05 Proceedings of the 27th European conference on Advances in Information Retrieval Research
Investigating the Semantic Gap through Query Log Analysis
ISWC '09 Proceedings of the 8th International Semantic Web Conference
Mining large query induced graphs towards a hierarchical query folksonomy
SPIRE'10 Proceedings of the 17th international conference on String processing and information retrieval
Mining large distributed log data in near real time
SLAML '11 Managing Large-scale Systems via the Analysis of System Logs and the Application of Machine Learning Techniques
Mining query log graphs towards a query folksonomy
Concurrency and Computation: Practice & Experience
In this paper we propose a method for the analysis of very large graphs obtained from query logs, using query coverage inspection. The goal is to extract semantic relations between queries and their terms. We take a new approach to successfully and efficiently cluster these large graphs by analyzing clique overlap and a priori induced cliques. The clustering quality is evaluated with an extension of the modularity score. Results obtained with real data show that the identified clusters can be used to infer properties of the queries and interesting semantic relations between them and their terms. The quality of the semantic relations is evaluated both using a tf-idf based score and data from the Open Directory Project. The proposed approach is also able to identify and filter out multitopical URLs, a feature that is interesting in itself.
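As a point of reference for the evaluation step, the following is a minimal sketch of the standard Newman-Girvan modularity score for an undirected, unweighted graph. It is not the extension of modularity used in the paper, whose details are not given in the abstract. The function name `modularity`, the toy edge list, and the cluster labels are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Standard Newman-Girvan modularity (not the paper's extended score).

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to a cluster label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")

    intra = defaultdict(int)   # edges with both endpoints in the same cluster
    degree = defaultdict(int)  # sum of node degrees per cluster

    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1

    # Q = sum over clusters of (fraction of edges inside the cluster)
    #     minus (expected fraction under random rewiring with equal degrees)
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
single = {node: 0 for node in clusters}

q_split = modularity(edges, clusters)
q_single = modularity(edges, single)
```

Splitting the toy graph into its two triangles scores higher than placing every node in one cluster, which is the behaviour a clustering quality measure of this kind is meant to capture.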