The term "Deep Web" refers to Web pages that are not accessible to search engines, e.g., because they are dynamically generated in response to queries submitted through Web forms or Web services. Existing automated Web crawlers cannot index these pages, so they remain hidden from Web search engines. Our goal is to properly annotate such Deep Web services (i.e., the content generation interfaces of hidden Web sources) with semantic indexing by constructing domain-specific ontologies that represent the contents of the Deep Web sources. Fully automatic derivation of ontologies from Web sources without human review remains a challenging research issue. We present a novel approach to automatically building a large, yet domain-specific, ontology by interweaving sub-taxonomies of WordNet with domain-specific information extracted from Deep Web service pages. Our algorithms extract domain concepts from Deep Web sources and augment them with concepts and relationships from WordNet to construct ontology fragments. Structurally, these fragments are Directed Acyclic Graphs (DAGs). An iterative process of extracting WordNet concepts and relationships and bridging concept gaps ties together disparate domain concepts and ontology fragments into one ontology. Using eight domains (airfares, jobs, etc.) from a well-known test-bed, our algorithms constructed an ontology of 1692 concepts from Deep Web sources and 4434 concepts from WordNet. This ontology is expressed in the OWL format to support Semantic Web searches.
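The general idea of tying domain concepts together through shared hypernyms can be illustrated with a minimal sketch. This is not the authors' algorithm: the hypernym table `TOY_HYPERNYMS` is a hypothetical stand-in for WordNet, and the function names `build_ontology`, `ancestors`, and `is_acyclic` are assumptions introduced only for this example.

```python
from collections import defaultdict

# Hypothetical toy hypernym table standing in for WordNet (child -> parents).
TOY_HYPERNYMS = {
    "airfare": ["fare"],
    "fare": ["charge"],
    "charge": ["cost"],
    "salary": ["payment"],
    "payment": ["cost"],
    "cost": ["entity"],
    "job": ["activity"],
    "activity": ["entity"],
}


def build_ontology(domain_concepts, hypernyms):
    """Collect (child, parent) is-a edges by walking up from each domain concept."""
    edges = set()
    seen = set()
    frontier = list(domain_concepts)
    while frontier:
        concept = frontier.pop()
        if concept in seen:
            continue
        seen.add(concept)
        for parent in hypernyms.get(concept, []):
            edges.add((concept, parent))
            frontier.append(parent)  # keep climbing until no hypernym is left
    return edges


def ancestors(concept, edges):
    """Return every concept reachable from `concept` by following is-a edges."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)
    found, stack = set(), [concept]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def is_acyclic(edges):
    """A graph is a DAG if no node is its own ancestor."""
    nodes = {node for edge in edges for node in edge}
    return all(node not in ancestors(node, edges) for node in nodes)


# Domain concepts as they might be extracted from Deep Web service pages.
edges = build_ontology(["airfare", "salary", "job"], TOY_HYPERNYMS)
shared = ancestors("airfare", edges) & ancestors("salary", edges)
```

In this toy run, "airfare" and "salary" start as separate fragments and become connected through their shared ancestors, which is the sense in which bridging concepts merge fragments into a single DAG. A real system would query WordNet itself and would have to handle word-sense ambiguity, which this sketch ignores.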