Text mining for product attribute extraction

Authors:
Rayid Ghani;Katharina Probst;Yan Liu;Marko Krema;Andrew Fano
Affiliations:
Accenture Technology Labs, Chicago, IL;Accenture Technology Labs, Chicago, IL;Carnegie Mellon University, Pittsburgh, PA;Accenture Technology Labs, Chicago, IL;Accenture Technology Labs, Chicago, IL
Venue:
ACM SIGKDD Explorations Newsletter
Year:
2006


Abstract

We describe our work on extracting attribute-value pairs from textual product descriptions. The goal is to augment databases of products by representing each product as a set of attribute-value pairs. Such a representation is beneficial for tasks where treating the product as a set of attribute-value pairs is more useful than treating it as an atomic entity. Examples of such applications include demand forecasting, assortment optimization, product recommendations, and assortment comparison across retailers and manufacturers. We deal with both implicit and explicit attributes and formulate both kinds of extraction as classification problems. Using single-view and multi-view semi-supervised learning algorithms, we are able to exploit the large amounts of unlabeled data present in this domain while reducing the need for initial labeled data, which is expensive to obtain. We present promising results on apparel and sporting goods products and show that our system can accurately extract attribute-value pairs from product descriptions. We describe a variety of applications that are built on top of the results obtained by the attribute extraction system.
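
To make the representation concrete, here is a minimal sketch of products stored as sets of attribute-value pairs, and of how that form supports assortment comparison. It is an illustration only, not the authors' system: the catalogs, product IDs, attribute names, and the helper functions `attribute_profile` and `jaccard` are all hypothetical, and the pairs are written by hand rather than extracted from text.

```python
from collections import Counter

# Hypothetical extractor output: each product is a set of (attribute, value) pairs.
# These catalogs and attribute names are invented for illustration.
catalog_a = {
    "A-101": {("material", "cotton"), ("color", "blue"), ("sleeve", "short")},
    "A-102": {("material", "polyester"), ("color", "black"), ("sleeve", "long")},
}
catalog_b = {
    "B-7": {("material", "cotton"), ("color", "red"), ("sleeve", "short")},
}


def attribute_profile(catalog, attribute):
    """Count how many products in a catalog carry each value of one attribute."""
    counts = Counter()
    for pairs in catalog.values():
        for attr, value in pairs:
            if attr == attribute:
                counts[value] += 1
    return counts


def jaccard(pairs_a, pairs_b):
    """Similarity of two products as the overlap of their attribute-value sets."""
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 0.0


if __name__ == "__main__":
    # Assortment view: distribution of materials in one retailer's catalog.
    print(attribute_profile(catalog_a, "material"))
    # Product view: how similar two products from different retailers are.
    print(jaccard(catalog_a["A-101"], catalog_b["B-7"]))
```

Treating a product as an atomic identifier would make both comparisons impossible, since the two retailers share no product IDs; the attribute-value view lets them be compared on shared properties instead.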