The effect multiple query representations on information retrieval system performance
SIGIR '93 Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval
Fast subsequence matching in time-series databases
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Efficient retrieval for browsing large image databases
CIKM '96 Proceedings of the fifth international conference on Information and knowledge management
Efficiently supporting ad hoc queries in large datasets of time sequences
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
On the analysis of indexing schemes
PODS '97 Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Dimensionality reduction for similarity searching in dynamic databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Fast time-series searching with scaling and shifting
PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Dimensionality reduction and similarity computation by inner product approximations
Proceedings of the ninth international conference on Information and knowledge management
Proceedings of the ninth international conference on Information and knowledge management
A comparison of DFT and DWT based similarity search in time-series databases
Proceedings of the ninth international conference on Information and knowledge management
Locally adaptive dimensionality reduction for indexing large time series databases
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
R-trees: a dynamic index structure for spatial searching
SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
Efficient Similarity Search In Sequence Databases
FODO '93 Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms
Variable Length Queries for Time Series Data
Proceedings of the 17th International Conference on Data Engineering
Fast Time Sequence Indexing for Arbitrary Lp Norms
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Local Dimensionality Reduction: A New Approach to Indexing High Dimensional Spaces
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Interactive Learning with a "Society of Models"
CVPR '96 Proceedings of the 1996 Conference on Computer Vision and Pattern Recognition (CVPR '96)
Supporting Content-Based Searches on Time Series via Approximation
SSDBM '00 Proceedings of the 12th International Conference on Scientific and Statistical Database Management
On Similarity-Based Queries for Time Series Data
ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Efficient Time Series Matching by Wavelets
ICDE '99 Proceedings of the 15th International Conference on Data Engineering
IEEE Transactions on Knowledge and Data Engineering
A review on time series data mining
Engineering Applications of Artificial Intelligence
Effective monitoring by efficient fingerprint matching using a forest of NAQ-trees
Journal of Intelligent Information Systems
Hi-index | 0.00 |
The problem of similarity search (query-by-content) has attracted much research interest. It is a difficult problem because of the inherently high dimensionality of the data. The most promising solutions involve performing dimensionality reduction on the data, then indexing the reduced data with a multidimensional index structure. Many dimensionality reduction techniques have been proposed, including Singular Value Decomposition (SVD), the Discrete Fourier Transform (DFT), the Discrete Wavelet Transform (DWT) and Piecewise Polynomial Approximation. In this work, we introduce a novel framework for using ensembles of two or more representations for more efficient indexing. The basic idea is that instead of committing to a single representation for an entire dataset, different representations are chosen for indexing different parts of the database. The representations are chosen based upon a local view of the database. For example, sections of the data that can achieve a high fidelity representation with wavelets are indexed as wavelets, but highly spectral sections of the data are indexed using the Fourier transform. At query time, it is necessary to search several small heterogeneous indices, rather than one large homogeneous index. As we will theoretically and empirically demonstrate this results in much faster query response times.