We present a simple query language for XML that supports hierarchical, Boolean-connected query patterns. The interpretation of a query is founded on cost-based query transformations: the total cost of a sequence of transformations measures the similarity between the query and the data and is used to rank the results. We introduce two polynomial-time algorithms that efficiently find the best n answers to the query. The first algorithm finds all approximate results, sorts them by increasing cost, and prunes the result list after the nth entry. The second algorithm uses a structural summary of the database (the schema) to estimate the best k transformed queries, which in turn are executed against the database. We compare both approaches and show that the schema-based evaluation outperforms the pruning approach for small values of n. The pruning strategy is the better choice if n is close to the total number of approximate results for the query.
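
The contrast between the two strategies can be illustrated with a minimal Python sketch. It is not the paper's implementation. The data representation (pairs of an identifier and a cost), the function names best_n_by_pruning and best_n_by_schema, the execute callback, and the assumption that every result of a transformed query inherits that query's cost are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical representation: (identifier, total transformation cost).
Scored = Tuple[str, float]


def best_n_by_pruning(approx_results: Iterable[Scored], n: int) -> List[Scored]:
    """Pruning strategy: materialize all approximate results,
    sort them by increasing cost, and cut the list after the nth entry."""
    ranked = sorted(approx_results, key=lambda r: r[1])
    return ranked[:n]


def best_n_by_schema(
    transformed_queries: Iterable[Scored],
    execute: Callable[[str], Iterable[str]],
    n: int,
    k: int,
) -> List[Scored]:
    """Schema-driven strategy: keep only the k cheapest transformed queries
    (assumed to have been estimated from the schema), run them in order of
    increasing cost, and stop as soon as n distinct results are collected."""
    cheapest = sorted(transformed_queries, key=lambda q: q[1])[:k]
    results: List[Scored] = []
    seen = set()
    for query, cost in cheapest:
        for result_id in execute(query):
            if result_id in seen:
                continue  # already found via a cheaper transformed query
            seen.add(result_id)
            results.append((result_id, cost))  # assumption: result cost = query cost
            if len(results) == n:
                return results
    return results


# Toy data, invented for this example.
all_results = [("d3", 2.0), ("d1", 0.0), ("d2", 1.0), ("d4", 5.0)]
queries = [("q_exact", 0.0), ("q_rename", 1.0), ("q_delete", 3.0)]
index = {"q_exact": ["d1"], "q_rename": ["d2", "d1"], "q_delete": ["d4"]}

pruned = best_n_by_pruning(all_results, n=2)
schema_driven = best_n_by_schema(queries, lambda q: index[q], n=2, k=2)
```

In this toy setting both calls return the same two cheapest results. The difference lies in the work performed: the pruning variant touches every approximate result before truncating, while the schema-driven variant executes only a few cheap transformed queries, which is consistent with the abstract's observation that it pays off when n is small.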