Similarity searching in protein sequence databases is a standard technique for biologists dealing with a newly sequenced protein. Exhaustive search in such databases is prohibitive because the databases are large and pairwise comparisons are slow. Heuristic techniques, such as FASTA and BLAST, are useful because they are fast and accurate, though it has been shown that exhaustive search is more accurate. There are therefore times when one would like to perform an exhaustive search. We propose an efficient method, called SparseMap, for preprocessing a database of proteins to support efficient similarity searches using expensive but sensitive distance functions, such as those based on Smith-Waterman similarity. Our method is based on a low-dimensional Euclidean embedding approach. We compare our method with other embedding approaches and show that it is faster and produces embeddings that preserve more biological information about the proteins, such as pairwise distances and biological clusters.
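The abstract does not spell out how SparseMap builds its embedding, so the following is only a minimal sketch of the general idea of embedding-based search, not the SparseMap algorithm itself. It assumes a generic reference-set (Lipschitz-style) embedding, in which coordinate i of an object is its distance to the closest member of reference set i, and it uses plain edit distance as a cheap stand-in for an expensive distance such as one derived from Smith-Waterman similarity. All function names and parameters (edit_distance, choose_reference_sets, embed, nearest, num_candidates) are hypothetical and chosen for illustration.

```python
import math
import random


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance; an assumed stand-in for an expensive sequence distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def choose_reference_sets(database, num_sets, set_size, seed=0):
    """Pick random subsets of the database to act as reference sets (an assumption)."""
    rng = random.Random(seed)
    return [rng.sample(database, set_size) for _ in range(num_sets)]


def embed(obj, reference_sets, dist):
    """Map an object to a vector: one coordinate per reference set,
    equal to the distance from the object to the closest set member."""
    return [min(dist(obj, r) for r in ref_set) for ref_set in reference_sets]


def euclidean(u, v):
    """Euclidean distance between two embedded vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def nearest(query, database, vectors, reference_sets, dist, num_candidates):
    """Filter-and-refine search: rank by cheap embedded distance,
    then apply the expensive distance only to the top candidates."""
    q_vec = embed(query, reference_sets, dist)
    ranked = sorted(range(len(database)),
                    key=lambda i: euclidean(q_vec, vectors[i]))
    candidates = ranked[:num_candidates]
    return min((database[i] for i in candidates), key=lambda s: dist(query, s))
```

In this sketch the database is embedded once, ahead of time, and each query costs one embedding plus num_candidates evaluations of the expensive distance. A smaller num_candidates makes the search cheaper but approximate, since the true nearest neighbor can fall outside the candidate list.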