Association pattern mining for product specification integration

  • Authors:
  • Jyh-Jong Tsay;Chin-Wen Tsay;Ping-Hong Chen

  • Affiliations:
  • National Chung Cheng University, Department of Computer Science and Information Engineering, Chiayi, Taiwan, ROC.;National Chung Cheng University, Department of Computer Science and Information Engineering, Chiayi, Taiwan, ROC.;National Chung Cheng University, Department of Computer Science and Information Engineering, Chiayi, Taiwan, ROC.

  • Venue:
  • FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 2
  • Year:
  • 2009

Abstract

As more and more online stores and shopping sites become available on the Web, integrating the product and shopping information provided by different sources has become increasingly important and has attracted attention in recent research on information integration. One of the fundamental problems is to integrate specifications for products of the same type from different vendors so that they are described in a homogeneous and uniform way. Specifications for products of the same type can look quite different from one vendor to another, and integrating them by hand is a tedious and error-prone task. In this paper, we formulate product specification integration as a text categorization problem, and propose an association pattern mining approach that automatically generates pattern rules for each attribute. Association patterns are mined from n-grams generated from product specifications. However, mining association patterns from n-grams can be very time-consuming, because every substring of a frequent string is also frequent. We propose substring pruning strategies that are specific to text data to improve the running time. Experiments show that our approach is very time-efficient and achieves classification accuracy higher than 0.95 on data sets collected for digital cameras, notebook PCs, and LCDs.
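The abstract does not spell out the pruning strategies, so the following is only an illustrative sketch of the property it relies on, not the authors' algorithm. It mines frequent word n-grams level by level and skips any candidate n-gram whose (n-1)-gram prefix or suffix is not already frequent, which is valid because every substring of a frequent string must itself be frequent. The function name `frequent_ngrams`, the parameters `min_support` and `max_n`, whitespace tokenization, and the sample specification strings are all assumptions made for the example.

```python
from collections import Counter


def frequent_ngrams(docs, min_support, max_n=3):
    """Level-wise mining of frequent word n-grams.

    Support is the number of documents that contain the n-gram.
    Hypothetical helper for illustration; not taken from the paper.
    """
    tokenized = [doc.lower().split() for doc in docs]
    frequent = {}      # n-gram (tuple of words) -> support
    prev_level = None  # frequent (n-1)-grams from the previous pass

    for n in range(1, max_n + 1):
        counts = Counter()
        for tokens in tokenized:
            seen = set()  # count each n-gram at most once per document
            for i in range(len(tokens) - n + 1):
                gram = tuple(tokens[i:i + n])
                # Substring pruning: an n-gram can only be frequent if
                # both of its (n-1)-gram substrings are frequent.
                if prev_level is not None and (
                    gram[:-1] not in prev_level
                    or gram[1:] not in prev_level
                ):
                    continue
                seen.add(gram)
            counts.update(seen)

        level = {g: c for g, c in counts.items() if c >= min_support}
        if not level:
            break  # no frequent n-grams, so no longer ones can exist
        frequent.update(level)
        prev_level = level

    return frequent


# Made-up specification fragments, purely for demonstration.
specs = ["10 mega pixels", "12 mega pixels", "3x optical zoom"]
patterns = frequent_ngrams(specs, min_support=2)
for gram in sorted(patterns, key=lambda g: (len(g), g)):
    print(" ".join(gram), patterns[gram])
```

In this toy run, the bigram "10 mega" is never counted because "10" is infrequent, which is the kind of saving substring pruning is meant to provide. Turning the surviving patterns into per-attribute classification rules is a separate step that this sketch does not cover.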