Uniqueness mining

Authors:
Rohit Paravastu;Hanuma Kumar;Vikram Pudi
Affiliations:
IIIT-H, Hyderabad, India;IIIT-H, Hyderabad, India;IIIT-H, Hyderabad, India
Venue:
DASFAA'08 Proceedings of the 13th international conference on Database systems for advanced applications
Year:
2008


Abstract

In this paper we consider the problem of extracting the special properties of any given record in a dataset. We are interested in determining what makes a given record unique, or different from the majority of the records in the dataset. In the real world, records typically represent objects or people, and it is often worthwhile to know what special properties each object or person has so that we can make the best use of them. This problem has not been considered earlier in the research literature. We approach it using ideas from clustering, attribute-oriented induction (AOI), and frequent itemset mining. Most of the time-consuming work is done in a preprocessing stage, and the online computation of the uniqueness of a given record is instantaneous.
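
The split between an offline preprocessing stage and a fast online lookup can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give its details, and the clustering and AOI steps are omitted here. The sketch only assumes that a property counts as "special" when few records share it. The function names `preprocess` and `unique_properties`, the `max_support` threshold, and the sample records are all hypothetical.

```python
from collections import Counter


def preprocess(records):
    """Offline step (assumed): count how many records carry each
    (attribute, value) pair."""
    counts = Counter()
    for record in records:
        counts.update(record.items())
    return counts, len(records)


def unique_properties(record, counts, total, max_support=0.25):
    """Online step (assumed): return the record's attribute values that at
    most `max_support` of the dataset shares, rarest first."""
    rare = []
    for attr, value in record.items():
        support = counts[(attr, value)] / total
        if support <= max_support:
            rare.append((attr, value, support))
    return sorted(rare, key=lambda item: item[2])


# Hypothetical toy dataset of resume-like records.
records = [
    {"degree": "BTech", "language": "Java", "city": "Hyderabad"},
    {"degree": "BTech", "language": "Java", "city": "Hyderabad"},
    {"degree": "BTech", "language": "Java", "city": "Chennai"},
    {"degree": "PhD", "language": "Haskell", "city": "Hyderabad"},
]

counts, total = preprocess(records)
result = unique_properties(records[3], counts, total)
print(result)
```

Once the counts exist, the online step is a handful of dictionary lookups per record, which is one plausible reading of why the online computation can be described as instantaneous.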