A Machine Learning Algorithm for Analyzing String Patterns Helps to Discover Simple and Interpretable Business Rules from Purchase History

  • Authors:
  • Yukinobu Hamuro;Hideki Kawata;Naoki Katoh;Katsutoshi Yada

  • Venue:
  • Progress in Discovery Science, Final Report of the Japanese Discovery Science Project
  • Year:
  • 2002

Abstract

This paper presents a new application of BONSAI, a machine learning algorithm proposed by Shimozono et al. in 1994 and originally developed to analyze string patterns for knowledge discovery from amino acid sequences. We use it to discover knowledge from purchase history that can help in creating effective marketing strategies. To adapt BONSAI to this purpose, we translate each customer's purchase history into a character string in which each symbol represents a brand purchased by that customer. We extend BONSAI in four respects: 1) whereas the original BONSAI generates a decision tree over regular patterns limited to substrings, we extend the patterns to subsequences; 2) we generate rules that contain not only regular patterns but also numerical attributes such as age, number of visits, and profit; 3) we extend the regular expressions so that they can test whether a certain pattern occurs in a latter part of the whole string; 4) we implement majority voting based on 1-D and 2-D region rules on top of decision trees. Applying the extended BONSAI to real customer purchase histories from a drugstore chain in Japan, we succeeded in generating interesting business rules that practitioners had not yet recognized.
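
The string encoding and the substring-versus-subsequence distinction described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the brand names, the symbol table, the function names, and the reading of "latter part" as a suffix starting at a fixed fraction of the string are all assumptions made for illustration only.

```python
def encode_history(purchases, brand_to_symbol):
    """Turn an ordered list of purchased brands into a string, one symbol per purchase."""
    return "".join(brand_to_symbol[brand] for brand in purchases)


def is_substring(pattern, text):
    """True if the pattern's symbols appear contiguously in text."""
    return pattern in text


def is_subsequence(pattern, text):
    """True if the pattern's symbols appear in order in text, gaps allowed."""
    remaining = iter(text)
    # Each membership test consumes the iterator up to the matched symbol,
    # so later symbols must be found after earlier ones.
    return all(symbol in remaining for symbol in pattern)


def occurs_in_latter_part(pattern, text, start_fraction=0.5):
    """True if pattern is a subsequence of the suffix of text.

    Assumption: "latter part" is modeled as the suffix beginning at
    start_fraction of the string length; the paper may define it differently.
    """
    start = int(len(text) * start_fraction)
    return is_subsequence(pattern, text[start:])


# Hypothetical brands and symbols, for illustration only.
brand_to_symbol = {"BrandA": "a", "BrandB": "b", "BrandC": "c"}
history = ["BrandA", "BrandC", "BrandB", "BrandA", "BrandB"]
encoded = encode_history(history, brand_to_symbol)

print(encoded)
print(is_substring("aab", encoded), is_subsequence("aab", encoded))
print(occurs_in_latter_part("ab", encoded), occurs_in_latter_part("c", encoded))
```

In this sketch the pattern `aab` never occurs contiguously in the encoded history but does occur as a subsequence, which is the kind of purchase ordering that substring-only patterns would miss.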