This study concerns the construction of a domain ontology from web tables in a specific domain. An ontology defines the common terms and their meanings (concepts) within a context, so only meaningful tables are of concern. A meaningful table is composed of a head and a body, formatted in rows and columns, and the head abstracts the meaning expressed in the body. As a prerequisite for a table-information-extraction framework, this study therefore extracts from the head the structural semantics, that is, the domain ontology that frames web-table information. We propose a method for automatically extracting the domain ontology using the structural and semantic characteristics of the web-table head. Construction proceeds in two steps: (a) extracting a table schema, as a pseudo-ontology, from each table in the same domain and (b) constructing the domain ontology by combining the extracted table schemata. The schemata are combined through splitting and clustering, using (a) statistical information and (b) heuristics based on the structural and semantic characteristics of the web-table head.
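The two steps can be illustrated with a minimal Python sketch. It is not the authors' method: it assumes the head is marked up with `<th>` cells, treats the list of head labels as the table schema, and replaces the paper's splitting step, statistical information, and heuristics with a plain Jaccard-overlap threshold. The names `HeadParser`, `extract_schemas`, `jaccard`, `cluster_schemas`, and the `threshold` parameter are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser


class HeadParser(HTMLParser):
    """Collect the text of <th> cells, one list of labels per <table>."""

    def __init__(self):
        super().__init__()
        self.schemas = []   # one list of head labels per table
        self._in_th = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.schemas.append([])
        elif tag == "th":
            self._in_th = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "th" and self._in_th:
            # Normalize whitespace and case so equal labels compare equal.
            label = " ".join("".join(self._buf).split()).lower()
            if label and self.schemas:
                self.schemas[-1].append(label)
            self._in_th = False

    def handle_data(self, data):
        if self._in_th:
            self._buf.append(data)


def extract_schemas(html):
    """Step (a), simplified: one schema (list of head labels) per table."""
    parser = HeadParser()
    parser.feed(html)
    parser.close()
    return [schema for schema in parser.schemas if schema]


def jaccard(a, b):
    """Share of labels that two non-empty label collections have in common."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def cluster_schemas(schemas, threshold=0.5):
    """Step (b), simplified: greedy clustering by label overlap.

    Assumption: a schema joins the first cluster whose labels overlap it
    by at least `threshold`; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a set of merged labels
    for schema in schemas:
        for cluster in clusters:
            if jaccard(schema, cluster) >= threshold:
                cluster.update(schema)
                break
        else:
            clusters.append(set(schema))
    return clusters
```

Given three tables whose heads are "Name, Price", "Name, Price, Color", and "Date, Venue", `cluster_schemas(extract_schemas(html))` groups the first two into one merged label set and keeps the third separate. Real web tables often mark heads without `<th>` or nest them across several rows, which is where the structural and semantic heuristics described in the abstract would be needed.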