Identifying the most influential data objects with reverse top-k queries

Authors:
Akrivi Vlachou;Christos Doulkeridis;Kjetil Nørvåg;Yannis Kotidis
Affiliations:
Norwegian University of Science and Technology (NTNU), Trondheim, Norway;Norwegian University of Science and Technology (NTNU), Trondheim, Norway;Norwegian University of Science and Technology (NTNU), Trondheim, Norway;Athens University of Economics and Business (AUEB), Athens, Greece
Venue:
Proceedings of the VLDB Endowment
Year:
2010


Abstract

Top-k queries are widely applied for retrieving a ranked set of the k most interesting objects based on individual user preferences. As an example, in online marketplaces, customers (users) typically seek a ranked set of products (objects) that satisfy their needs. Reversing top-k queries leads to a query type that instead returns the set of customers that find a product appealing (that is, the product belongs to the top-k result set of their preferences). In this paper, we address the challenging problem of processing queries that identify the top-m most influential products to customers, where influence is defined as the cardinality of the reverse top-k result set. This definition of influence is useful for market analysis, since it is directly related to the number of customers that value a particular product and, consequently, to its visibility and impact in the market. Existing techniques require processing a reverse top-k query for each object in the database, which is prohibitively expensive even for databases of moderate size. In contrast, we propose two algorithms, SB and BB, for identifying the most influential objects: SB restricts the candidate set of objects that need to be examined, while BB is a branch-and-bound algorithm that retrieves the result incrementally. Furthermore, we propose meaningful variations of the query for most influential objects that are supported by our algorithms. Our experiments demonstrate the efficiency of our algorithms for both synthetic and real-life datasets.
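
To make the influence definition concrete, the following is a minimal brute-force sketch in Python. It is not SB or BB, whose details the abstract does not give. It rests on assumptions that the abstract does not state: each customer's preference is a weight vector, a product's score is the weighted sum of its attributes, lower scores rank higher, and ties are broken by product index. The function names and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def score(w: Vector, p: Vector) -> float:
    """Weighted-sum score of product p for preference w (assumed linear scoring)."""
    return sum(wi * pi for wi, pi in zip(w, p))


def top_k(products: List[Vector], w: Vector, k: int) -> List[int]:
    """Indices of the k best products for w (assumption: lowest score is best)."""
    order = sorted(range(len(products)), key=lambda i: (score(w, products[i]), i))
    return order[:k]


def influence_scores(products: List[Vector], weights: List[Vector], k: int) -> List[int]:
    """Influence of each product: size of its reverse top-k result set,
    i.e. the number of customers whose top-k contains the product."""
    counts = [0] * len(products)
    for w in weights:
        for i in top_k(products, w, k):
            counts[i] += 1
    return counts


def most_influential(products: List[Vector], weights: List[Vector],
                     k: int, m: int) -> List[Tuple[int, int]]:
    """The m products with the highest influence, as (index, influence) pairs."""
    counts = influence_scores(products, weights, k)
    order = sorted(range(len(products)), key=lambda i: (-counts[i], i))
    return [(i, counts[i]) for i in order[:m]]


# Toy data (hypothetical): two attributes per product, lower values are better.
products = [(1.0, 9.0), (4.0, 4.0), (9.0, 1.0), (8.0, 8.0)]
# One weight vector per customer.
weights = [(0.9, 0.1), (0.4, 0.6), (0.1, 0.9), (0.8, 0.2)]

print(most_influential(products, weights, k=2, m=2))
```

This sketch evaluates every customer against every product, so its cost grows with the product of the two set sizes. That exhaustive behaviour is the kind of expense the abstract says SB and BB are designed to avoid.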