In this paper, we present CoMRI (Compressed Multi-Resolution Index), our system for fast sequence similarity search in DNA sequence databases. We employ the Virtual Bounding Rectangle (VBR) concept to build a compressed, grid-style index structure. An advantage of the grid format over trees is that subsequence location information is given by the order of the corresponding VBR in the VBR list. By taking advantage of VBRs, our index structure fits easily into a reasonable amount of memory. Together with a new, optimized multi-resolution search algorithm, this improves query speed significantly. Extensive performance evaluations on Human Chromosome sequence data show that VBRs save 80% to 93% of index storage compared to MBRs (Minimum Bounding Rectangles), and that the new search algorithm prunes almost all unnecessary VBRs, which ensures efficient disk I/O and CPU usage. According to the results of our experiments, CoMRI is at least 100 times faster than MRS, another grid index structure introduced very recently.
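The point about location being implied by list order can be illustrated with a minimal sketch. It is not CoMRI's implementation: it builds plain bounding rectangles rather than VBRs (the abstract does not describe the VBR encoding), and the nucleotide frequency-vector feature, the window length `w`, the `box_capacity` parameter, and all function names are assumptions made for illustration only.

```python
from typing import List, Tuple

ALPHABET = "ACGT"  # assumed DNA alphabet

Vector = Tuple[int, ...]
Box = Tuple[Vector, Vector]  # (low corner, high corner)


def frequency_vector(window: str) -> Vector:
    """Count occurrences of each nucleotide in a window (assumed feature)."""
    return tuple(window.count(c) for c in ALPHABET)


def build_grid_index(seq: str, w: int, box_capacity: int) -> List[Box]:
    """Build one bounding rectangle per run of `box_capacity` consecutive
    sliding windows of length `w`. The rectangles are stored in sequence
    order, so no explicit offsets need to be kept alongside them."""
    vectors = [frequency_vector(seq[i:i + w]) for i in range(len(seq) - w + 1)]
    boxes: List[Box] = []
    for start in range(0, len(vectors), box_capacity):
        group = vectors[start:start + box_capacity]
        low = tuple(min(col) for col in zip(*group))
        high = tuple(max(col) for col in zip(*group))
        boxes.append((low, high))
    return boxes


def window_offsets(box_index: int, box_capacity: int, n_windows: int) -> range:
    """Recover the window start offsets covered by a rectangle purely from
    its position in the list."""
    start = box_index * box_capacity
    return range(start, min(start + box_capacity, n_windows))


# Usage: 7 windows of length 4, grouped 3 per rectangle, give 3 rectangles.
seq = "ACGTACGTAA"
boxes = build_grid_index(seq, w=4, box_capacity=3)
last_offsets = list(window_offsets(2, box_capacity=3, n_windows=len(seq) - 4 + 1))
```

Because the rectangles are laid out in sequence order, the list index alone identifies which subsequences a rectangle covers, whereas a tree index would typically store that location in its leaf entries.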