A model for automatically classifying uncertain Web pages using multiple features is presented. Because a traditional tree structure can barely cope with classifying the avalanche of new Web pages, the proposed approach combines a partial use of the "bag of words" idea with classification fusion to describe and categorize Web pages. It extracts features of Web pages from three perspectives: consulting a Web directory service, analyzing the text features of the pages' titles and meta-search keywords, and identifying the primary content of the pages. By fusing the results of these three dedicated classifiers, each Web page is assigned to one or more categories, together with a set of words that represents the page. Experiments were carried out to demonstrate the effectiveness of the proposed method; in them, Web pages are classified into four categories using the proposed fusion method. A comparison between the dedicated classifiers and the fusion methods is also presented.
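The fusion step described above can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so a weighted average of per-category scores is assumed here purely for illustration; the function names (`fuse_scores`, `assign_categories`), the weights, the threshold, and the category labels are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Per-category confidence scores produced by one classifier (assumed format).
Scores = Dict[str, float]


def fuse_scores(classifier_scores: List[Scores], weights: List[float]) -> Scores:
    """Fuse several classifiers' scores with a weighted average.

    Assumption: weighted averaging stands in for whatever fusion rule
    the paper actually uses. A category missing from a classifier's
    output contributes a score of 0 for that classifier.
    """
    if len(classifier_scores) != len(weights):
        raise ValueError("one weight is required per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused: Scores = {}
    for scores, weight in zip(classifier_scores, weights):
        for category, score in scores.items():
            fused[category] = fused.get(category, 0.0) + weight * score / total
    return fused


def assign_categories(fused: Scores, threshold: float = 0.5) -> List[str]:
    """Return every category whose fused score reaches the threshold.

    A page may receive one category, several, or none.
    """
    return sorted(c for c, s in fused.items() if s >= threshold)


# Made-up outputs of the three dedicated classifiers for one page.
directory_scores = {"sports": 0.9, "news": 0.4}
title_meta_scores = {"sports": 0.7, "news": 0.6}
content_scores = {"sports": 0.8, "news": 0.2, "science": 0.1}

fused = fuse_scores(
    [directory_scores, title_meta_scores, content_scores],
    weights=[1.0, 1.0, 1.0],  # equal weights, chosen arbitrarily
)
labels = assign_categories(fused, threshold=0.5)
```

With these made-up scores, only "sports" clears the 0.5 threshold. Lowering the threshold or changing the weights would let a page fall into several categories, which matches the "one or more categories" behaviour the abstract describes.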