Eddies: continuously adaptive query processing
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Access path selection in a relational database management system
SIGMOD '79 Proceedings of the 1979 ACM SIGMOD international conference on Management of data
Declarative Data Cleaning: Language, Model, and Algorithms
Proceedings of the 27th International Conference on Very Large Data Bases
Answering queries using views: A survey
The VLDB Journal — The International Journal on Very Large Data Bases
Information extraction for enhanced access to disease outbreak reports
Journal of Biomedical Informatics - Special issue: Sublanguage
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
QXtract: a building block for efficient information extraction from text databases
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Robust query processing through progressive optimization
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Reference reconciliation in complex information spaces
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
To search or to crawl?: towards a query optimizer for text-centric tasks
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Building structured web community portals: a top-down, compositional, and incremental approach
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Contextualizing data warehouses with documents
Decision Support Systems
A modular information extraction system
Intelligent Data Analysis
Novelty and diversity in information retrieval evaluation
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
A survey of top-k query processing techniques in relational database systems
ACM Computing Surveys (CSUR)
Open information extraction from the web
Communications of the ACM - Surviving the data deluge
Search Engines: Information Retrieval in Practice
Search Engines: Information Retrieval in Practice
It takes variety to make a world: diversification in recommender systems
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
The YAGO-NAGA approach to knowledge discovery
ACM SIGMOD Record
Optimizing SQL Queries over Text Databases
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Join Optimization of Information Extraction Output: Quality Matters!
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Exploring a Few Good Tuples from Text Databases
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Quality-driven query answering for integrated information systems
Quality-driven query answering for integrated information systems
SIE-OBI: a streaming information extraction platform for operational business intelligence
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Open information extraction using Wikipedia
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Automatically generating extraction patterns from untagged text
AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 2
FactRank: random walks on a web of facts
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Classification algorithms for relation prediction
ICDEW '11 Proceedings of the 2011 IEEE 27th International Conference on Data Engineering Workshops
Business Intelligence and the Web
Information Systems Frontiers
INDREX: in-database distributional relation extraction
Proceedings of the sixteenth international workshop on Data warehousing and OLAP
Hi-index | 0.00 |
A common task of Web users is querying structured information from Web pages. For realizing this interesting scenario we propose a novel query processor for systematically discovering instances of semantic relations in Web search results and joining these relation instances into complex result tuples with conjunctive queries. Our query processor transforms a structured user query into keyword queries that are submitted to a search engine, forwards search results to a relation extractor, and then combines relations into complex result tuples. The processor automatically learns discriminative and effective keywords for different types of semantic relations. Thereby, our query processor leverages the index of a search engine to query potentially billions of pages. Unfortunately, relation extractors may fail to return a relation for a result tuple. Moreover, user defined data sources may not return at least k complete result tuples. Therefore we propose an adaptive routing model based on information theory for retrieving missing attributes of incomplete result tuples. The model determines the most promising next incomplete tuple and attribute type for returning any-k complete result tuples at any point during the query execution process. We report a thorough experimental evaluation over multiple relation extractors. Our query processor returns complete result tuples while processing only very few Web pages.