We propose a novel approach to performing efficient similarity search and classification in high dimensional data. In this framework, the database elements are vectors in a Euclidean space. Given a query vector in the same space, the goal is to find elements of the database that are similar to the query. In our approach, a small number of independent "voters" rank the database elements based on similarity to the query. These rankings are then combined by a highly efficient aggregation algorithm. Our methodology leads both to techniques for computing approximate nearest neighbors and to a conceptually rich alternative to nearest neighbors.

One instantiation of our methodology is as follows. Each voter projects all the vectors (database elements and the query) on a random line (different for each voter), and ranks the database elements based on the proximity of the projections to the projection of the query. The aggregation rule picks the database element that has the best median rank. This combination has several appealing features. On the theoretical side, we prove that with high probability, it produces a result that is a (1 + ε) factor approximation to the Euclidean nearest neighbor. On the practical side, it turns out to be extremely efficient, often exploring no more than 5% of the data to obtain very high-quality results. This method is also database-friendly, in that it accesses data primarily in a pre-defined order without random accesses, and, unlike other methods for approximate nearest neighbors, requires almost no extra storage. Also, we extend our approach to deal with the k nearest neighbors.

We conduct two sets of experiments to evaluate the efficacy of our methods. Our experiments include two scenarios where nearest neighbors are typically employed: similarity search and classification problems. In both cases, we study the performance of our methods with respect to several evaluation criteria, and conclude that they are uniformly excellent, both in terms of quality of results and in terms of efficiency.
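
The instantiation described above (voters that project onto random lines, combined by best median rank) can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the efficient aggregation algorithm the abstract refers to: it builds every voter's full ranking in memory rather than reading pre-sorted lists with early termination, so it shows none of the efficiency or storage properties claimed above. The function names, the default of 15 voters, and the use of Gaussian samples to draw random directions are all hypothetical choices made for this example.

```python
import random
import statistics


def random_unit_vector(dim, rng):
    """Draw a random direction by normalizing a Gaussian sample (an assumption)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def median_rank_nn(database, query, num_voters=15, seed=0):
    """Return the index of the database vector with the best median rank.

    Each voter projects the query and all database vectors onto its own
    random line and ranks the database vectors by how close their
    projections fall to the query's projection.
    """
    rng = random.Random(seed)
    dim = len(query)
    ranks = [[] for _ in database]  # ranks[i] collects one rank per voter

    for _ in range(num_voters):
        line = random_unit_vector(dim, rng)
        q_proj = dot(query, line)
        # Distance of each projected database vector from the projected query.
        gaps = [abs(dot(v, line) - q_proj) for v in database]
        order = sorted(range(len(database)), key=gaps.__getitem__)
        for rank, idx in enumerate(order):
            ranks[idx].append(rank)

    # Aggregation rule: pick the element whose median rank is smallest.
    medians = [statistics.median(r) for r in ranks]
    return min(range(len(database)), key=medians.__getitem__)


if __name__ == "__main__":
    db = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [5.0, -1.0, 2.0], [10.0, 10.0, 10.0]]
    # The query coincides with db[2], so every voter ranks db[2] first.
    print(median_rank_nn(db, [5.0, -1.0, 2.0]))
```

Brute-force ranking keeps the sketch short and makes the aggregation rule easy to inspect. A database-friendly variant would instead store each voter's projections in sorted order and walk outward from the query's position, which is the kind of sequential access pattern the abstract describes.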