Vast amounts of information are encoded in tables found in documents, on the Web, and in spreadsheets or databases. Integrating or searching over this information benefits from understanding its intended meaning and making it explicit in a semantic representation language like RDF. Most current approaches to generating Semantic Web representations from tables require human input to create schemas and often result in graphs that do not follow best practices for linked data. Evidence for a table's meaning can be found in its column headers, cell values, implicit relations between columns, caption, and surrounding text, but interpreting that evidence also requires general and domain-specific background knowledge. Approaches that work well for one domain may not work well for others. We describe a domain-independent framework for interpreting the intended meaning of tables and representing it as Linked Data. At the core of the framework are techniques grounded in graphical models and probabilistic reasoning to infer the meaning associated with a table. Using background knowledge from resources in the Linked Open Data cloud, we jointly infer the semantics of column headers, table cell values (e.g., strings and numbers), and relations between columns, and represent the inferred meaning as a graph of RDF triples. A table's meaning is thus captured by mapping columns to classes in an appropriate ontology, linking cell values to literal constants, implied measurements, or entities in the linked data cloud (existing or new), and discovering or identifying relations between columns.
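To make the target representation concrete, the following is a minimal sketch of the final step only: turning already-inferred annotations into a graph of RDF triples. It does not implement the probabilistic inference described above. The two-row city/country table, the function names `table_to_triples` and `to_ntriples`, and the specific DBpedia classes, entities, and property are all illustrative assumptions, not taken from the framework itself.

```python
# Illustrative sketch: serialize assumed table annotations as RDF triples.
# The annotations below stand in for the output of an inference step
# that is NOT implemented here.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DBO = "http://dbpedia.org/ontology/"   # assumed target ontology
DBR = "http://dbpedia.org/resource/"   # assumed entity namespace

# Assumed class for each column (column 0 = cities, column 1 = countries).
column_classes = [DBO + "City", DBO + "Country"]

# Assumed entity link for each cell, row by row.
cell_entities = [
    [DBR + "Paris", DBR + "France"],
    [DBR + "Berlin", DBR + "Germany"],
]

# Assumed relation between column pairs: (subject column, object column).
column_relations = {(0, 1): DBO + "country"}


def table_to_triples(column_classes, cell_entities, column_relations):
    """Return (subject, predicate, object) URI triples for an annotated table."""
    triples = []
    for row in cell_entities:
        # Each linked cell is typed with the class inferred for its column.
        for col, entity in enumerate(row):
            triples.append((entity, RDF_TYPE, column_classes[col]))
        # Each column-pair relation yields one triple per row.
        for (subj_col, obj_col), predicate in column_relations.items():
            triples.append((row[subj_col], predicate, row[obj_col]))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples syntax, one per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = table_to_triples(column_classes, cell_entities, column_relations)
print(to_ntriples(triples))
```

Each row contributes one typing triple per cell plus one triple per column relation, so the two-row example yields six triples. Cells interpreted as literal constants or measurements would need literal objects rather than URIs, which this sketch leaves out.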