A probabilistic mixture model for mining and analyzing product search log

Authors:
Huizhong Duan;ChengXiang Zhai;Jinxing Cheng;Abhishek Gattani
Affiliations:
University of Illinois at Urbana-Champaign, Urbana, IL, USA;University of Illinois at Urbana-Champaign, Urbana, IL, USA;WalmartLabs, San Bruno, CA, USA;WalmartLabs, San Bruno, CA, USA
Venue:
Proceedings of the 22nd ACM international conference on information & knowledge management
Year:
2013

Abstract

The boom in e-commerce in recent years has led to the generation of large amounts of product search log data. Product search logs are a unique new kind of data, containing valuable information and knowledge about user preferences over product attributes that is often hard to obtain from other sources. While regular search logs (e.g., Web search logs) contain click-throughs for unstructured text documents (e.g., web pages), product search logs contain click-throughs for structured entities defined by a set of attributes and their values. For instance, a laptop can be defined by its size, color, CPU, RAM, etc. Such structure in product entities offers opportunities to mine and discover detailed, useful knowledge about user preferences at the attribute level, but it also raises significant challenges for mining, because there are no attribute-level observations. In this paper, we propose a novel probabilistic mixture model for attribute-level analysis of product search logs. The model is based on a generative process in which queries are generated by a mixture of unigram language models, one for each attribute-value pair of a clicked entity. The model can be efficiently estimated using the Expectation-Maximization (EM) algorithm. The estimated parameters, including the attribute-value language models and attribute-value preference models, can be used directly to improve product search accuracy, or aggregated to reveal knowledge for understanding user intent and supporting business intelligence. Evaluation of the proposed model on a commercial product search log shows that the model is effective for mining and analyzing product search logs to discover various kinds of useful knowledge.
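
The generative process and EM estimation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact parameterization, so the sketch assumes one mixing weight per (entity, attribute-value pair), no background model, and no smoothing. The function name `em_attribute_mixture`, the log layout, and the four-record toy log are all hypothetical and exist only for illustration.

```python
from collections import defaultdict


def em_attribute_mixture(log, n_iter=50):
    """EM for a mixture of unigram models tied to attribute-value pairs.

    log: list of (query_tokens, entity_id, attribute_value_pairs).
    Assumed simplification: each query word is drawn by first picking one
    attribute-value pair a of the clicked entity e with probability
    pref[e][a], then drawing the word from the unigram model theta[a].
    """
    vocab = {w for words, _, _ in log for w in words}
    entity_pairs = {e: tuple(pairs) for _, e, pairs in log}
    all_pairs = {a for pairs in entity_pairs.values() for a in pairs}

    # Uniform initialization; the entity structure itself breaks symmetry.
    theta = {a: {w: 1.0 / len(vocab) for w in vocab} for a in all_pairs}
    pref = {e: {a: 1.0 / len(pairs) for a in pairs}
            for e, pairs in entity_pairs.items()}

    for _ in range(n_iter):
        word_counts = {a: defaultdict(float) for a in all_pairs}
        pref_counts = {e: defaultdict(float) for e in entity_pairs}

        # E-step: posterior over which attribute-value pair produced each word.
        for words, e, _ in log:
            for w in words:
                scores = {a: pref[e][a] * theta[a][w] for a in entity_pairs[e]}
                z = sum(scores.values())
                if z == 0.0:
                    continue
                for a, s in scores.items():
                    r = s / z
                    word_counts[a][w] += r
                    pref_counts[e][a] += r

        # M-step: renormalize the expected counts.
        for a in all_pairs:
            total = sum(word_counts[a].values())
            if total > 0.0:
                theta[a] = {w: word_counts[a][w] / total for w in vocab}
        for e in entity_pairs:
            total = sum(pref_counts[e].values())
            if total > 0.0:
                pref[e] = {a: pref_counts[e][a] / total
                           for a in entity_pairs[e]}

    return theta, pref


# Made-up toy log: (query tokens, clicked entity id, its attribute-value pairs).
toy_log = [
    (["red", "dress"], "e1", [("color", "red"), ("type", "dress")]),
    (["red", "shoes"], "e2", [("color", "red"), ("type", "shoes")]),
    (["blue", "dress"], "e3", [("color", "blue"), ("type", "dress")]),
    (["blue", "shoes"], "e4", [("color", "blue"), ("type", "shoes")]),
]
theta, pref = em_attribute_mixture(toy_log)
for pair in sorted(theta):
    print(pair, max(theta[pair], key=theta[pair].get))
```

No query in the toy log is labeled with the attribute that produced each word, yet the language model for `("color", "red")` concentrates on the word "red", because that pair is the only one shared by the two entities clicked for queries containing it. This mirrors the abstract's point that attribute-level knowledge has to be inferred without attribute-level observations. A realistic setting would likely need smoothing and a background model, which this sketch omits.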