Extract interesting skyline points in high dimension

Authors:
Gabriel Pui Cheong Fung;Wei Lu;Jing Yang;Xiaoyong Du;Xiaofang Zhou
Affiliations:
School of ITEE, The University of Queensland, Australia;Key Labs of Data Engineering and Knowledge Engineering, Ministry of Education, China;School of Information, Renmin University of China, China;Key Labs of Data Engineering and Knowledge Engineering, Ministry of Education, China;School of ITEE, The University of Queensland, Australia
Venue:
DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part II
Year:
2010

Abstract

When the dimensionality of a dataset increases even slightly, the number of skyline points increases dramatically, because it is unlikely for a point to perform equally well in all dimensions. When the dimensionality is very high, almost all points are skyline points. Automatically extracting the interesting skyline points in high-dimensional space is therefore necessary. In our experience, when deciding whether a point is interesting, we seldom base the decision only on pairwise comparison of two points (as in skyline identification); we also study how well a point performs in each dimension. For example, in a scholarship assignment problem, the students selected for scholarships should not be those who merely outperform some other students in those students' weakest subjects (as happens with the skyline). We should instead select students whose performance in some subjects is better than that of a reasonable number of students. In the extreme case, even if a student is outstanding in just one subject, we may still award her a scholarship if she can demonstrate that she is extraordinary in that area. In this paper, we formalize this idea and propose a novel concept called the k-dominate p-core skyline ($C^k_p$). $C^k_p$ is a subset of the skyline. To identify $C^k_p$ efficiently, we propose an effective tree structure called the Linked Multiple B’-tree (LMB). With LMB, we can identify $C^k_p$ within a few seconds in a dataset of 100,000 points with 15 dimensions.
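
The plain skyline that the abstract starts from can be illustrated with a minimal sketch. It is not the paper's $C^k_p$ definition or the LMB structure, neither of which is specified in the abstract. It only shows the standard pairwise dominance test and how the skyline grows as dimensions are added. The names `dominates`, `skyline`, and `skyline_size_by_dimension` are hypothetical, and the sketch assumes that larger values are better in every dimension.

```python
import random
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere.

    Assumption: larger values are better in every dimension.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def skyline_size_by_dimension(n: int, max_dim: int, seed: int = 0) -> Dict[int, int]:
    """Skyline size when only the first d coordinates of random points are used."""
    rng = random.Random(seed)
    data = [tuple(rng.random() for _ in range(max_dim)) for _ in range(n)]
    return {d: len(skyline([p[:d] for p in data])) for d in range(1, max_dim + 1)}


# Hypothetical exam scores (subject A, subject B) for five students.
students = [(90, 60), (70, 80), (60, 60), (85, 85), (50, 95)]
student_skyline = skyline(students)

# Skyline sizes for 200 uniformly random points as dimensions are added.
sizes = skyline_size_by_dimension(n=200, max_dim=8)
```

In the student example, `(50, 95)` stays in the skyline only because of its score in the second subject, which is the kind of point whose interestingness the abstract proposes to judge per dimension rather than by pairwise comparison alone.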