Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
Millions of web tables contain geographic data pertinent to big data integration in domain applications such as urban sustainability, transportation networks, policy studies, and public health. These tables, however, are heterogeneous in structure, concepts, and metadata. A central challenge in semantically extracting geographic data is resolving these heterogeneities so as to uncover three things: a conceptual hierarchy, corresponding to ontologies; metadata associated with instances, corresponding to elements that we call features; and geographic information, corresponding to cell values that can be used to identify geographic coordinates. In this paper, we present an architecture with methods to (1) extract feature-rich web tables, (2) identify features, (3) construct a schema and instances using RDF, and (4) perform geocoding. Preliminary experiments show high accuracy in table identification and feature naming when measured against manual evaluation.
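The four steps of the architecture can be illustrated with a minimal Python sketch. It is not the paper's implementation. Every name in it (`TableParser`, `extract_tables`, `identify_features`, `to_triples`, `geocode`, the `GAZETTEER` dictionary, the `http://example.org/` namespace) is hypothetical, the sample values are illustrative, and each step is reduced to a simple heuristic: tables are kept if they are rectangular enough, the first row is assumed to hold the feature names, RDF is emitted as N-Triples-style strings, and geocoding is a dictionary lookup standing in for a real geocoding service. Nested tables are not handled.

```python
from html.parser import HTMLParser

# Hypothetical gazetteer standing in for a real geocoding service.
GAZETTEER = {"Chicago": (41.8781, -87.6298)}


class TableParser(HTMLParser):
    """Collects the rows of every <table> as lists of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html, min_rows=2, min_cols=2):
    """Step 1: keep tables large enough to carry features (assumed heuristic)."""
    parser = TableParser()
    parser.feed(html)
    return [t for t in parser.tables
            if len(t) >= min_rows and all(len(r) >= min_cols for r in t)]


def identify_features(table):
    """Step 2: assume the first row names the features."""
    return table[0], table[1:]


def to_triples(features, rows, base="http://example.org/"):
    """Step 3: one subject per row, one predicate per feature."""
    triples = []
    for i, row in enumerate(rows):
        subject = f"<{base}row/{i}>"
        for feature, value in zip(features, row):
            name = feature.lower().replace(" ", "_")
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} <{base}feature/{name}> "{literal}" .')
    return triples


def geocode(features, rows, place_feature="City"):
    """Step 4: look up each place name; unknown names map to None."""
    idx = features.index(place_feature)
    return {row[idx]: GAZETTEER.get(row[idx]) for row in rows}


if __name__ == "__main__":
    sample = ("<table><tr><th>City</th><th>Stations</th></tr>"
              "<tr><td>Chicago</td><td>12</td></tr></table>")
    for table in extract_tables(sample):
        features, rows = identify_features(table)
        print("\n".join(to_triples(features, rows)))
        print(geocode(features, rows))
```

A real system would replace each heuristic: table identification and feature naming are the steps the abstract reports evaluating, and geocoding would have to disambiguate place names rather than look them up verbatim.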