Over the past few years, we have been trying to build an end-to-end system at Wisconsin to manage unstructured data, using extraction, integration, and user interaction. This paper describes the key information extraction (IE) challenges that we have run into, and sketches our solutions. We discuss in particular developing a declarative IE language, optimizing for this language, generating IE provenance, incorporating user feedback into the IE process, developing a novel wiki-based user interface for feedback, best-effort IE, pushing IE into RDBMSs, and more. Our work suggests that IE in managing unstructured data can open up many interesting research challenges, and that these challenges can greatly benefit from the wealth of work on managing structured data that has been carried out by the database community.