Semantic deep web: automatic attribute extraction from the deep web data sources

Authors:
Yoo Jung An (New Jersey Institute of Technology, Newark, NJ)
James Geller (New Jersey Institute of Technology, Newark, NJ)
Yi-Ta Wu (University of Michigan, Ann Arbor, MI)
Soon Ae Chun (CUNY, College of Staten Island, Staten Island, NY)
Venue:
Proceedings of the 2007 ACM Symposium on Applied Computing
Year:
2007

Abstract

The "Deep Web" refers to the rich information and data held in backend databases and similar sources that search engines and Web crawlers cannot access. It is mostly accessible through manual query interfaces. This paper introduces the Semantic Deep Web, which uses an ontology to determine the relevance of query interface attributes for accessing the Deep Web. In addition, we present a novel approach to automatically extracting attributes from query interfaces, in order to address current limitations in accessing Deep Web data sources. Our Automatic Attribute Extraction method identifies (1) attributes that are used by the designers of query Web pages, called Programmer Viewpoint Attributes, and (2) attributes that are presented as labels to users, called User Viewpoint Attributes. An ontology enriches the candidate query attributes by providing synonyms and by supporting the attributes used by designers and users. Our experimental results in several e-commerce domains show that the attributes obtained by our algorithm compare favorably with manually determined attributes for use in Deep Web queries.
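
To make the two attribute viewpoints concrete, here is a minimal sketch in Python. It is not the authors' algorithm, which the abstract does not spell out. It assumes that Programmer Viewpoint Attributes can be approximated by the `name` values of form controls and User Viewpoint Attributes by the text of `<label>` elements. The names `QueryFormParser`, `extract_attributes`, `enrich`, `TOY_SYNONYMS`, and the sample form are all hypothetical, and the small synonym table merely stands in for a real ontology.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for an ontology: attribute -> known synonyms.
TOY_SYNONYMS = {
    "price": {"cost", "fare"},
    "departure": {"from", "origin"},
}


class QueryFormParser(HTMLParser):
    """Collects form-control names and label texts from a query page."""

    def __init__(self):
        super().__init__()
        self.programmer_attrs = []  # assumed source: name="..." on controls
        self.user_attrs = []        # assumed source: visible <label> text
        self._in_label = False
        self._label_text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("input", "select", "textarea") and attrs.get("name"):
            # Skip controls that carry no query attribute.
            if attrs.get("type") not in ("hidden", "submit"):
                self.programmer_attrs.append(attrs["name"].lower())
        elif tag == "label":
            self._in_label = True
            self._label_text = []

    def handle_data(self, data):
        if self._in_label:
            self._label_text.append(data)

    def handle_endtag(self, tag):
        if tag == "label" and self._in_label:
            text = "".join(self._label_text).strip().rstrip(":").lower()
            if text:
                self.user_attrs.append(text)
            self._in_label = False


def extract_attributes(html):
    """Return (programmer viewpoint, user viewpoint) attribute lists."""
    parser = QueryFormParser()
    parser.feed(html)
    parser.close()
    return parser.programmer_attrs, parser.user_attrs


def enrich(attributes, synonyms=TOY_SYNONYMS):
    """Attach sorted synonyms from the toy table to each attribute."""
    return {attr: sorted(synonyms.get(attr, set())) for attr in attributes}


SAMPLE_FORM = """
<form action="/search">
  <label for="dep">Departure:</label> <input type="text" name="origin_city">
  <label for="p">Price</label> <select name="max_price"></select>
  <input type="hidden" name="session" value="x">
  <input type="submit" name="go" value="Search">
</form>
"""

if __name__ == "__main__":
    pva, uva = extract_attributes(SAMPLE_FORM)
    print("Programmer viewpoint:", pva)
    print("User viewpoint:", uva)
    print("Enriched:", enrich(uva))
```

On the sample form, the two lists differ (`origin_city` versus `departure`), which illustrates why a synonym-providing ontology could help reconcile what designers name a field with what users see on the page.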