Today, the deep web makes up a large part of all web content, and the sheer volume of this data has drawn growing attention to deep web technologies in recent years. The deep web consists mostly of online domain-specific databases that are accessed through web query interfaces. Because these databases are highly relevant to their domains, they are well suited to satisfying users' information needs. To make relevant information easier to extract, deep web databases need to be classified into subject-specific, self-descriptive categories. In this paper we present TODWEB, a novel training-less approach to automatic deep web source classification that is based on common-sense world knowledge (in the form of an ontology or any external lexical resource). The approach is intended to help in building highly scalable, domain-focused, and efficient semantic information retrieval systems such as metasearch engines and search engine directories. An important aspect of the approach is that the classification method requires no training at all: it uses the Wikipedia category network and domain-independent ontologies to analyze the semantics of the meta-information of deep web sources. The large number of fine-grained Wikipedia categories is used to analyze semantic relatedness among concepts, and finally the URL of each deep web search source is mapped to the category hierarchy offered by Wikipedia. Experiments conducted on a collection of search sources show that this approach yields a highly accurate and fine-grained classification compared with existing approaches, nearly identical to the results achieved by manual classification.
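The general idea of classifying a source without training, by mapping terms from its meta-information onto a category hierarchy, can be illustrated with a minimal sketch. This is not the TODWEB algorithm itself, whose details are not given here. The toy `PARENTS` graph, the `TERM_CATEGORIES` lookup, the `decay` weighting, and the function names are all hypothetical stand-ins for the Wikipedia category network and the semantic relatedness analysis.

```python
import re
from collections import Counter, deque

# Hypothetical toy data standing in for the Wikipedia category network:
# each category maps to its parent categories.
PARENTS = {
    "Airlines": ["Air travel"],
    "Hotels": ["Hospitality"],
    "Air travel": ["Travel"],
    "Hospitality": ["Travel"],
    "Novels": ["Books"],
    "Books": ["Literature"],
}

# Hypothetical lookup from a term to the categories it belongs to.
TERM_CATEGORIES = {
    "flight": ["Airlines"],
    "airline": ["Airlines"],
    "hotel": ["Hotels"],
    "booking": ["Hotels", "Airlines"],
    "novel": ["Novels"],
    "author": ["Books"],
}


def ancestors_with_distance(category, max_depth=3):
    """Return (category, distance) pairs for the category and its ancestors."""
    seen = {category}
    queue = deque([(category, 0)])
    result = []
    while queue:
        node, dist = queue.popleft()
        result.append((node, dist))
        if dist == max_depth:
            continue
        for parent in PARENTS.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, dist + 1))
    return result


def classify(meta_text, decay=0.5):
    """Score categories from the terms in meta_text; no training data is used.

    Each matched term votes for its categories with weight 1 and for their
    ancestors with a weight that shrinks by `decay` per level (an assumption).
    Returns (best_category, scores), or (None, {}) when nothing matches.
    """
    scores = Counter()
    for term in re.findall(r"[a-z]+", meta_text.lower()):
        for category in TERM_CATEGORIES.get(term, []):
            for node, dist in ancestors_with_distance(category):
                scores[node] += decay ** dist
    if not scores:
        return None, {}
    # Sort names first so ties are broken alphabetically and deterministically.
    best = max(sorted(scores), key=scores.get)
    return best, dict(scores)


if __name__ == "__main__":
    label, scores = classify("Search flight and hotel deals, airline booking")
    print(label, scores)
```

In this sketch the decay factor favors the most specific category that several terms agree on, which loosely mirrors the goal of a fine-grained classification. A real system would replace the two dictionaries with lookups against the actual Wikipedia category network.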