New indices for text: PAT Trees and PAT arrays
Information retrieval
Fast text searching for regular expressions or automaton searching on tries
Journal of the ACM (JACM)
PATRICIA—Practical Algorithm To Retrieve Information Coded in Alphanumeric
Journal of the ACM (JACM)
Introduction to the Theory of Computation: Preliminary Edition
Introduction to the Theory of Computation: Preliminary Edition
Querying Semistructured Heterogeneous Information
DOOD '95 Proceedings of the Fourth International Conference on Deductive and Object-Oriented Databases
Optimizing Regular Path Expressions Using Graph Schemas
ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
Object Exchange Across Heterogeneous Information Sources
ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
ICDT '97 Proceedings of the 6th International Conference on Database Theory
DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
A Faster Algorithm for Approximate String Matching
CPM '96 Proceedings of the 7th Annual Symposium on Combinatorial Pattern Matching
Data Mining and Knowledge Discovery
Information on the Web, such as HTML documents with images, video, and sound, is a collection of heterogeneous data. HTML documents are semistructured in nature. The term semistructured data describes data whose structure is less rigid or regular than that of data found in standard database systems. This study presents a novel means of using a Patricia tree [14] to index semistructured data. To use the index, a query is translated into a regular expression, and that regular expression is then evaluated over the Patricia tree. The main advantage of this approach is that it supports queries on content and structure simultaneously, while also providing fast query times for long paths and regular expressions.
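The idea of evaluating a path query against a tree index can be illustrated with a minimal sketch. It is not the method of the study, whose encoding is not given here. It assumes an uncompressed trie keyed by element labels rather than a true Patricia tree, and it assumes each document is stored as a label path whose last item is the text content, so that structure and content share one key. The names `TrieNode`, `insert`, and `search`, and the pattern syntax (`*` for any single label, `**` for any sequence of labels), are hypothetical.

```python
class TrieNode:
    """One node of an uncompressed trie keyed by path labels (assumption:
    a real Patricia tree would also collapse single-child chains)."""

    def __init__(self):
        self.children = {}  # label -> TrieNode
        self.values = []    # document ids whose path ends at this node


def insert(root, path, value):
    """Store value under the given sequence of labels."""
    node = root
    for label in path:
        node = node.children.setdefault(label, TrieNode())
    node.values.append(value)


def search(node, pattern, prefix=()):
    """Yield (path, value) pairs whose full label path matches pattern.

    Pattern items (hypothetical syntax): a literal label,
    "*" for exactly one arbitrary label, "**" for zero or more labels.
    """
    if not pattern:
        for value in node.values:
            yield prefix, value
        return
    head, rest = pattern[0], pattern[1:]
    if head == "**":
        # Case 1: "**" matches zero labels, so continue with the rest here.
        yield from search(node, rest, prefix)
        # Case 2: "**" consumes one label and stays active in the child.
        for label, child in node.children.items():
            yield from search(child, pattern, prefix + (label,))
    elif head == "*":
        for label, child in node.children.items():
            yield from search(child, rest, prefix + (label,))
    else:
        child = node.children.get(head)
        if child is not None:
            yield from search(child, rest, prefix + (head,))


# Illustrative data: the last label of each path is the text content.
root = TrieNode()
insert(root, ("html", "body", "h1", "Patricia"), "doc1")
insert(root, ("html", "body", "div", "h1", "Patricia"), "doc2")
insert(root, ("html", "body", "h1", "Trie"), "doc3")

# Structure and content in one query: any h1, at any depth, containing "Patricia".
print(sorted(v for _, v in search(root, ("**", "h1", "Patricia"))))
```

Because the traversal follows only the branches that the pattern allows, a long literal path touches one node per label, while wildcards widen the search only where they occur. A full regular expression would need an automaton run in step with the traversal, which this sketch does not attempt.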