On "one of the few" objects

Authors:
You Wu;Pankaj K. Agarwal;Chengkai Li;Jun Yang;Cong Yu
Affiliations:
Duke University, Durham, NC, USA;Duke University, Durham, NC, USA;The University of Texas at Arlington, Arlington, TX, USA;Duke University, Durham, NC, USA;Google Research, New York City, NY, USA
Venue:
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2012

Abstract

Objects with multiple numeric attributes can be compared within any "subspace" (subset of attributes). In applications such as computational journalism, users are interested in claims of the form: Karl Malone is one of only two players in NBA history with at least 25,000 points, 12,000 rebounds, and 5,000 assists in his career. One challenge in identifying such "one-of-the-k" claims (k = 2 above) is ensuring their "interestingness". A small k alone is not a good indicator of interestingness, because one can often make such claims for many objects simply by increasing the dimensionality of the subspace considered. We propose a uniqueness-based interestingness measure for one-of-the-few claims that is intuitive for non-technical users, and we design algorithms for finding all interesting claims (across all subspaces) from a dataset. Sometimes, users are interested primarily in the objects appearing in these claims. Building on our notion of interesting claims, we propose a scheme for ranking objects and an algorithm for computing the top-ranked objects. Using real-world datasets, we evaluate the efficiency of our algorithms as well as the advantage of our object-ranking scheme over popular methods such as Kemeny optimal rank aggregation and weighted-sum ranking.
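
To make the "one-of-the-k" notion concrete, the following is a minimal brute-force sketch in Python. It is not the paper's algorithm and does not implement the uniqueness-based interestingness measure, which the abstract does not define. It assumes that larger attribute values are better and that k counts the objects (including the target) whose values are at least the target's on every attribute of the subspace. The function names `one_of_the_k` and `claims_for` and the toy data are hypothetical, invented purely for illustration.

```python
from itertools import combinations


def one_of_the_k(data, name, subspace):
    """Return k for the claim "`name` is one of the k objects with at least
    its own values on every attribute in `subspace`" (assumed definition)."""
    target = data[name]
    return sum(
        all(row[attr] >= target[attr] for attr in subspace)
        for row in data.values()
    )


def claims_for(data, name, attributes):
    """Map every non-empty subspace of `attributes` to the k obtained for `name`."""
    result = {}
    for size in range(1, len(attributes) + 1):
        for subspace in combinations(attributes, size):
            result[subspace] = one_of_the_k(data, name, subspace)
    return result


# Hypothetical toy data: made-up numbers, not real statistics.
players = {
    "A": {"points": 30, "rebounds": 10, "assists": 5},
    "B": {"points": 25, "rebounds": 12, "assists": 4},
    "C": {"points": 20, "rebounds": 8, "assists": 9},
    "D": {"points": 28, "rebounds": 11, "assists": 6},
}

claims = claims_for(players, "D", ["points", "rebounds", "assists"])
for subspace, k in claims.items():
    print(subspace, "-> one of the", k)
```

In this toy dataset, object "D" is one of two on each single attribute but one of one on every pair of attributes. Adding attributes can only keep k the same or shrink it, which is why a small k alone says little about how interesting a claim is. The sketch enumerates all 2^d - 1 subspaces, so it is only practical for a handful of attributes.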