A microeconomic data mining problem: customer-oriented catalog segmentation

Authors:
Martin Ester; Rong Ge; Wen Jin; Zengjian Hu
Affiliations:
Simon Fraser University, Burnaby, B.C., Canada; Simon Fraser University, Burnaby, B.C., Canada; Simon Fraser University, Burnaby, B.C., Canada; Simon Fraser University, Burnaby, B.C., Canada
Venue:
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2004

References (10)

Segmentation problems
STOC '98 Proceedings of the thirtieth annual ACM symposium on Theory of computing

A threshold of ln n for approximating set cover
Journal of the ACM (JACM)

A data mining framework for optimal product selection in retail supermarket data: the generalized PROFSET model
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining

Computers and Intractability: A Guide to the Theory of NP-Completeness

A Microeconomic View of Data Mining
Data Mining and Knowledge Discovery

Profit Mining: From Patterns to Actions
EDBT '02 Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology

Value Added Association Rules
PAKDD '02 Proceedings of the 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining

Item selection by "hub-authority" profit ranking
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining

Theoretical frameworks for data mining
ACM SIGKDD Explorations Newsletter

MPIS: Maximal-Profit Item Selection with Cross-Selling Considerations
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining

Cited by (9)

DADA: a data cube for dominant relationship analysis
Proceedings of the 2006 ACM SIGMOD international conference on Management of data

Customer-oriented catalog segmentation: effective solution approaches
Decision Support Systems

Constraint-driven clustering
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining

On domination game analysis for microeconomic data mining
ACM Transactions on Knowledge Discovery from Data (TKDD)

Credit scoring algorithm based on link analysis ranking with support vector machine
Expert Systems with Applications: An International Journal

Catalog segmentation with double constraints in business
Pattern Recognition Letters

Designing customer-oriented catalogs in e-CRM using an effective self-adaptive genetic algorithm
Expert Systems with Applications: An International Journal

DualRank: a dual-phase algorithm for optimal profit mining in retailing market
ASIAN'05 Proceedings of the 10th Asian Computing Science conference on Advances in computer science: data management on the web

Survey: Some results of Christos Papadimitriou on internet structure, network routing, and web information
Computer Science Review


Abstract

The microeconomic framework for data mining [7] assumes that an enterprise chooses the decision that maximizes overall utility across all customers, where each customer's contribution is a function of the data available on that customer. In Catalog Segmentation, the enterprise wants to design k product catalogs of size r that maximize the total number of catalog products purchased. In many applications, however, a customer who has been attracted to an enterprise will purchase further products beyond those contained in the catalog. In this paper we therefore investigate an alternative problem formulation, which we call Customer-Oriented Catalog Segmentation, in which overall utility is measured by the number of customers who have at least a specified minimum interest t in the catalogs. We formally introduce the Customer-Oriented Catalog Segmentation problem and discuss its complexity. We then investigate two paradigms for designing efficient approximate algorithms for the problem: greedy (deterministic) algorithms and randomized algorithms. Since greedy algorithms may become trapped in a local optimum and randomized algorithms depend crucially on a reasonable initial solution, we explore a combination of the two paradigms. Our experimental evaluation on synthetic and real data demonstrates that the new algorithms yield catalogs of significantly higher utility than classical Catalog Segmentation algorithms.
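The difference between the two objectives named in the abstract can be illustrated with a small sketch. It is not the paper's algorithm. It assumes that each customer is represented by the set of products they are interested in, that each customer receives the single catalog that suits them best, and that "minimum interest t" means a catalog contains at least t of the customer's products. All function names and the toy data are hypothetical, and the exhaustive search covers only the k = 1 case on tiny inputs.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Sequence, Tuple

Basket = FrozenSet[str]   # products one customer is interested in (assumed representation)
Catalog = FrozenSet[str]  # products contained in one catalog


def classical_utility(customers: Sequence[Basket], catalogs: Sequence[Catalog]) -> int:
    """Classical objective: total catalog products purchased, where each
    customer is assumed to receive the catalog with the largest overlap.
    Requires at least one catalog."""
    return sum(max(len(c & cat) for cat in catalogs) for c in customers)


def customer_oriented_utility(
    customers: Sequence[Basket], catalogs: Sequence[Catalog], t: int
) -> int:
    """Customer-oriented objective: number of customers for whom some
    catalog contains at least t of their products (assumed reading of t)."""
    return sum(1 for c in customers if any(len(c & cat) >= t for cat in catalogs))


def best_single_catalog(
    customers: Sequence[Basket], products: Iterable[str], r: int, t: int
) -> Tuple[int, Catalog]:
    """Exhaustive search over all catalogs of size r for k = 1.
    Exponential in r, so usable only to illustrate the objective."""
    candidates = (frozenset(combo) for combo in combinations(sorted(products), r))
    best = max(candidates, key=lambda cat: customer_oriented_utility(customers, [cat], t))
    return customer_oriented_utility(customers, [best], t), best


# Toy data: two customers share three products, two customers want one product each.
customers = [frozenset("abc"), frozenset("abc"), frozenset("d"), frozenset("e")]
products = "abcde"

popular = frozenset("abc")  # maximizes catalog products purchased
broad = frozenset("ade")    # reaches every customer with at least one product

print(classical_utility(customers, [popular]), customer_oriented_utility(customers, [popular], t=1))
print(classical_utility(customers, [broad]), customer_oriented_utility(customers, [broad], t=1))

best_value, best_catalog = best_single_catalog(customers, products, r=3, t=1)
print(best_value, sorted(best_catalog))
```

On this toy data, the catalog that sells the most catalog products reaches only two of the four customers, while a catalog chosen for coverage reaches all four. That gap is what the customer-oriented formulation is meant to capture.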