Top-k typicality queries and efficient query answering methods on large databases
The VLDB Journal — The International Journal on Very Large Data Bases
Finding typical instances is an effective approach to understanding and analyzing large data sets. In this paper, we apply the idea of typicality analysis from psychology and cognitive science to database query answering, and study the novel problem of answering top-k typicality queries. We model typicality in large data sets systematically. To answer questions like "Who are the top-k most typical NBA players?", we develop the measure of simple typicality. To answer questions like "Who are the top-k most typical guards distinguishing guards from other players?", we propose the notion of discriminative typicality. Computing the exact answer to a top-k typicality query requires quadratic time, which is often too costly for online query answering on large databases. We therefore develop a series of approximation methods for various situations. (1) The randomized tournament algorithm has linear complexity, though it does not provide a theoretical guarantee on the quality of the answers. (2) The direct local typicality approximation using VP-trees provides an approximation quality guarantee. (3) A VP-tree can be exploited to index a large set of objects; typicality queries can then be answered efficiently, with quality guarantees, by a tournament method based on a Local Typicality Tree data structure. An extensive performance study using two real data sets and a series of synthetic data sets clearly shows that top-k typicality queries are meaningful and that our methods are practical.
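The contrast between the exact quadratic computation and the randomized tournament can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not spell out: simple typicality is approximated here by a Gaussian kernel density estimate over pairwise distances, the bandwidth `h` is chosen by hand, and the tournament is a simplified reading of the description above (random groups, local winners advance). All function and parameter names (`simple_typicality`, `exact_top_k`, `tournament_winner`, `group_size`) are hypothetical and are not taken from the paper.

```python
import math
import random


def gaussian_kernel(d, h):
    """Gaussian kernel evaluated at distance d with bandwidth h (assumed kernel)."""
    return math.exp(-(d * d) / (2.0 * h * h)) / (h * math.sqrt(2.0 * math.pi))


def simple_typicality(o, objects, dist, h):
    """Kernel density estimate at o with respect to `objects`.

    Assumption: typicality is modeled as a density over pairwise distances.
    """
    return sum(gaussian_kernel(dist(o, x), h) for x in objects) / len(objects)


def exact_top_k(objects, k, dist, h):
    """Exact answer: scores every object against every other, so O(n^2) distances."""
    scored = [(simple_typicality(o, objects, dist, h), o) for o in objects]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [o for _, o in scored[:k]]


def tournament_winner(objects, group_size, dist, h, rng):
    """One run of a simplified randomized tournament (hypothetical sketch).

    Objects are shuffled into groups of at most `group_size`; the object that is
    most typical within its own group advances to the next round. With a fixed
    group size the total work is linear in the number of objects, but the winner
    carries no quality guarantee.
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    survivors = list(objects)
    while len(survivors) > 1:
        rng.shuffle(survivors)
        next_round = []
        for i in range(0, len(survivors), group_size):
            group = survivors[i:i + group_size]
            best = max(group, key=lambda o: simple_typicality(o, group, dist, h))
            next_round.append(best)
        survivors = next_round
    return survivors[0]


def abs_dist(a, b):
    """Distance between two 1-D points, used only for the toy example."""
    return abs(a - b)


if __name__ == "__main__":
    data = [0.0, 4.0, 5.0, 6.0, 10.0]
    print(exact_top_k(data, 1, abs_dist, h=1.0))
    print(tournament_winner(data, 2, abs_dist, 1.0, random.Random(0)))
```

In practice such a tournament would be repeated several times and the winners compared, since a single run depends on the random grouping. When `group_size` is at least the number of objects, the tournament reduces to the exact top-1 computation.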