This paper presents a novel engine, coined TopX, for efficient ranked retrieval of XML documents over semistructured but nonschematic data collections. The algorithm follows the paradigm of threshold algorithms for top-k query processing with a focus on inexpensive sequential accesses to index lists and only a few judiciously scheduled random accesses. The difficulties in applying the existing top-k algorithms to XML data lie in 1) the need to consider scores for XML elements while aggregating them at the document level, 2) the combination of vague content conditions with XML path conditions, 3) the need to relax query conditions if too few results satisfy all conditions, and 4) the selectivity estimation for both content and structure conditions and their impact on evaluation strategies. TopX addresses these issues by precomputing score and path information in an appropriately designed index structure, by largely avoiding or postponing the evaluation of expensive path conditions so as to preserve the sequential access pattern on index lists, and by selectively scheduling random accesses when they are cost-beneficial. In addition, TopX can compute approximate top-k results using probabilistic score estimators, thus speeding up queries with a small and controllable loss in retrieval precision.
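To make the threshold-algorithm paradigm mentioned above concrete, here is a minimal sketch of a generic top-k scan that uses only sequential accesses to score-sorted index lists. It is not TopX's implementation: it omits XML path conditions, random-access scheduling, and probabilistic pruning, and it assumes scores are aggregated by summation. The function name `nra_top_k` and the in-memory layout of `index_lists` are hypothetical choices made for illustration.

```python
def nra_top_k(index_lists, k):
    """Return the k doc ids with the highest summed score, best first.

    index_lists: dict mapping a query term to a list of (doc_id, score)
    pairs sorted by descending score (hypothetical layout, an assumption).
    """
    if k <= 0:
        return []
    terms = list(index_lists)
    pos = {t: 0 for t in terms}  # next read position in each list
    # high[t]: upper bound on any score not yet read from list t
    high = {t: (index_lists[t][0][1] if index_lists[t] else 0.0) for t in terms}
    seen = {}  # doc_id -> {term: score read so far}

    def worst(doc):
        # Lower bound: sum of the scores already read for this document.
        return sum(seen[doc].values())

    def best(doc):
        # Upper bound: assume the list bound for every term not yet seen.
        return worst(doc) + sum(high[t] for t in terms if t not in seen[doc])

    while True:
        progressed = False
        for t in terms:  # one round of sequential accesses
            entries = index_lists[t]
            if pos[t] < len(entries):
                doc, score = entries[pos[t]]
                pos[t] += 1
                seen.setdefault(doc, {})[t] = score
                high[t] = score
                progressed = True
            if pos[t] >= len(entries):
                high[t] = 0.0  # list exhausted: nothing more can be added

        # Rank candidates by lower bound; doc id breaks ties deterministically.
        ranked = sorted(seen, key=lambda d: (-worst(d), d))
        top, rest = ranked[:k], ranked[k:]
        if not progressed:
            return top  # every list has been read completely
        if len(top) == k:
            min_k = worst(top[-1])
            unseen_bound = sum(high.values())  # best case for an unseen doc
            # Stop once no other candidate and no unseen doc can beat rank k.
            if unseen_bound <= min_k and all(best(d) <= min_k for d in rest):
                return top
```

The scan stops as soon as the lower bound of the k-th candidate is at least the upper bound of every other candidate and of any document not yet encountered, which is what lets a threshold-style algorithm avoid reading the index lists to the end. Re-sorting all candidates after each round keeps the sketch short; a real engine would maintain the candidate queue incrementally.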