Product matching is a challenging variant of entity resolution that identifies representations and offers referring to the same product. It is difficult because of the broad spectrum of products, the many similar but distinct products, frequently missing or wrong attribute values, and the textual nature of product titles and descriptions. We propose tailored approaches for product matching that preprocess product offers to extract and clean new attributes usable for matching. In particular, we propose a new approach that extracts so-called product codes and uses them to identify products and to distinguish them from similar product variations. We evaluate the effectiveness of the proposed approaches on challenging real-life datasets of product offers from online shops. We also show that the UPC information in product offers is often error-prone and can lead to poor match decisions.
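The idea of extracting product codes from offer titles can be illustrated with a minimal sketch. It is not the extraction method evaluated here. It assumes, purely for illustration, that a product code is a token of at least four characters containing both letters and digits; the function names `extract_codes` and `same_product` are hypothetical.

```python
import re

# Assumption (not from the paper): a candidate product code is a token of
# at least 4 alphanumeric characters that mixes letters and digits.
# Hyphens inside a token are tolerated and removed during normalization.
TOKEN = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def extract_codes(title):
    """Return the set of normalized candidate product codes in a title."""
    codes = set()
    for token in TOKEN.findall(title):
        compact = token.replace("-", "").upper()
        has_letter = re.search(r"[A-Z]", compact) is not None
        has_digit = re.search(r"\d", compact) is not None
        if len(compact) >= 4 and has_letter and has_digit:
            codes.add(compact)
    return codes


def same_product(title_a, title_b):
    """Treat two offers as a match candidate if they share a code."""
    return bool(extract_codes(title_a) & extract_codes(title_b))


offer_1 = "Canon PowerShot SX230 HS 12.1 MP Digital Camera"
offer_2 = "Canon SX-230 camera, black"
offer_3 = "Canon PowerShot SX220 HS Digital Camera"

print(extract_codes(offer_1))
print(same_product(offer_1, offer_2))
print(same_product(offer_1, offer_3))
```

Under these assumptions, the first two offers share the normalized code and are grouped together, while the third offer, a similar but different model, is kept apart. A real system would need further cleaning and verification of the extracted codes, since a pattern this simple also picks up tokens that are not product codes.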