Current approaches to developing information extraction (IE) programs have largely focused on producing precise IE results. As such, they suffer from three major limitations. First, it is often difficult to execute partially specified IE programs and obtain meaningful results, which leads to a long "debug loop". Second, it often takes a long time to obtain the first meaningful result (by finishing and running a precise IE program), which renders these approaches impractical for time-sensitive IE applications. Finally, writing precise IE programs may waste a significant amount of effort, because an approximate result, one that can be produced quickly, may already be satisfactory in many IE settings. To address these limitations, we propose iFlex, an IE approach that relaxes the precise IE requirement to enable best-effort IE. In iFlex, a developer U uses a declarative language to quickly write an initial approximate IE program P with a possible-worlds semantics. iFlex then evaluates P using an approximate query processor to quickly extract an approximate result. Next, U examines the result and, if necessary, further refines P to obtain increasingly precise results. To refine P, U can enlist a next-effort assistant, which suggests refinements based on the data and the current version of P. Extensive experiments on real-world domains demonstrate the utility of the iFlex approach.
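
The write, evaluate, and refine loop described above can be illustrated with a small sketch. It is not iFlex's declarative language or its approximate query processor; it is a plain Python stand-in, and every name in it (`extract_candidates`, `preceded_by_dollar`, `above_10000`, the sample listing text) is hypothetical. The list of candidate values plays the role of the possible values of one attribute: an underspecified program yields many candidates, and each added constraint narrows them.

```python
import re
from typing import Callable, List

# A constraint inspects the document and one regex match and keeps or rejects it.
# (Hypothetical helper type, not part of iFlex.)
Constraint = Callable[[str, "re.Match[str]"], bool]


def extract_candidates(doc: str, constraints: List[Constraint]) -> List[str]:
    """Return every numeric token in doc that satisfies all constraints.

    The result stands in for the set of possible values of a single
    attribute (here, a price). Fewer constraints give a larger,
    less precise candidate set.
    """
    return [
        m.group()
        for m in re.finditer(r"\d+(?:\.\d+)?", doc)
        if all(check(doc, m) for check in constraints)
    ]


def preceded_by_dollar(doc: str, m: "re.Match[str]") -> bool:
    """Keep only numbers written directly after a dollar sign."""
    return m.start() > 0 and doc[m.start() - 1] == "$"


def above_10000(doc: str, m: "re.Match[str]") -> bool:
    """Keep only numbers large enough to be a plausible house price."""
    return float(m.group()) > 10000


doc = "Built in 1995, 3 bedrooms, listed at $249000, taxes $3100 per year."

# Iteration 1: a partially specified program, "the price is some number".
rough = extract_candidates(doc, [])

# Iteration 2: the developer inspects the result and adds one constraint.
refined = extract_candidates(doc, [preceded_by_dollar])

# Iteration 3: a second refinement leaves a single candidate.
precise = extract_candidates(doc, [preceded_by_dollar, above_10000])

print(rough)    # ['1995', '3', '249000', '3100']
print(refined)  # ['249000', '3100']
print(precise)  # ['249000']
```

Even the first, unconstrained run returns a usable superset of the answer, which is the point of best-effort IE: the developer can stop as soon as the candidate set is precise enough for the task at hand.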