Data & Knowledge Engineering
Finding tuples in a database that match a particular subsequence (with gaps) is an important problem for a range of applications. Subsequence search is equivalent to searching for regular expressions of the type .*q1.*q2.* … .*ql.*, where the subsequence is q1q2…ql. Executing these queries efficiently requires index structures that are both fast and able to scale to large problem sizes. This paper presents two index structures for such queries, one based on a trie and one based on a bitmap. Both indices are disk-resident and can therefore be used by large databases with limited memory. They also apply to dynamic databases, where tuples can be added or deleted. Both indices are implemented and validated against a naive approach. The results show that the proposed indices are efficient, with low I/O and time overhead.
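To make the query semantics concrete, the following is a minimal sketch of the equivalence between subsequence matching and the regular expression form given above, together with a naive scan of the kind the indices are compared against. It is not the paper's trie or bitmap index. The function names (`is_subsequence`, `subsequence_regex`, `naive_search`) and the representation of tuples as Python strings are assumptions made for illustration.

```python
import re


def is_subsequence(query: str, text: str) -> bool:
    """Return True if the symbols of `query` occur in `text` in order, gaps allowed."""
    remaining = iter(text)
    # `ch in remaining` advances the iterator past the first match,
    # so each query symbol must be found after the previous one.
    return all(ch in remaining for ch in query)


def subsequence_regex(query: str) -> re.Pattern:
    """Build the pattern .*q1.*q2.* ... .*ql.* for the subsequence q1q2...ql."""
    escaped = [re.escape(ch) for ch in query]
    return re.compile(".*" + ".*".join(escaped) + ".*", re.DOTALL)


def naive_search(tuples, query):
    """Hypothetical baseline: scan every tuple and keep those containing the subsequence."""
    return [t for t in tuples if is_subsequence(query, t)]


if __name__ == "__main__":
    data = ["abcde", "xaybzc", "cba"]
    print(naive_search(data, "abc"))
    pattern = subsequence_regex("abc")
    print([t for t in data if pattern.fullmatch(t)])
```

The naive scan touches every tuple on every query, which is the cost an index of the kind described in the abstract aims to avoid.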