Building a database of facts extracted from historical documents to enable database-like query and search would reduce the tedium of gleaning facts of interest from these documents. We propose a solution in which the historical documents themselves constitute the stored database. In our solution, we use information-extraction techniques to produce a conceptualized external annotation of the facts found in each document, and we superimpose the conceptualization over the document collection. The annotation process populates the conceptualization, producing a repository of extracted facts, and a reasoner derives inferred facts from these extracted facts. Our query interface accepts free-form queries and converts them to formal queries over the extracted and inferred facts. In addition to standard query results, displayed results include images of the original documents with the results highlighted, along with reasoning chains for inferred facts grounded in these highlighted facts. Along with giving the implementation status of our proof-of-concept prototype, we present results for extraction accuracy and efficiency and point to current and future work needed to enable a practical solution for the envisioned historical-document database.
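The relationship between extracted facts, inferred facts, and reasoning chains can be illustrated with a minimal Python sketch. It is not the prototype's implementation. The `Fact` class, the `child_of` and `grandchild_of` predicates, the single inference rule, and the sample people and page references are all hypothetical, chosen only to show how an inferred fact might stay grounded in the document locations that support it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """A fact in the repository (hypothetical structure, for illustration)."""
    predicate: str
    subject: str
    obj: str
    source: tuple = ()    # document location for an extracted fact
    premises: tuple = ()  # supporting facts for an inferred fact


def infer_grandchildren(facts):
    """Assumed rule: child_of(x, y) and child_of(y, z) => grandchild_of(x, z)."""
    child_of = [f for f in facts if f.predicate == "child_of"]
    inferred = []
    for a in child_of:
        for b in child_of:
            if a.obj == b.subject:
                inferred.append(
                    Fact("grandchild_of", a.subject, b.obj, premises=(a, b))
                )
    return inferred


def reasoning_chain(fact):
    """Return the extracted facts that ground the given fact."""
    if not fact.premises:
        return [fact]
    grounded = []
    for premise in fact.premises:
        grounded.extend(reasoning_chain(premise))
    return grounded


# Hypothetical extracted facts, each tied to a location in a page image.
extracted = [
    Fact("child_of", "Person C", "Person B", source=("page-0012.png", "line 7")),
    Fact("child_of", "Person B", "Person A", source=("page-0031.png", "line 2")),
]

inferred = infer_grandchildren(extracted)
for fact in inferred:
    print(fact.predicate, fact.subject, fact.obj)
    for ground in reasoning_chain(fact):
        print("  grounded in", ground.source)
```

Each inferred fact keeps references to its premises, so a result display could walk back to the `source` locations and highlight them in the original page images.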