This paper describes automatic Web page classification using machine learning methods. Portal site services on the World Wide Web, including their search engine functions, are growing in importance. Portal sites such as Yahoo!, which classify Web pages hierarchically into many categories, are becoming especially popular. However, assigning Web pages to categories relies on manual labor, which costs considerable time and care. To alleviate this problem, we propose techniques that generate attributes by co-occurrence analysis and classify Web pages automatically by machine learning. We apply these techniques to Web pages on Yahoo! JAPAN and construct decision trees that determine the appropriate category for each Web page. The performance of the proposed method is evaluated in terms of error rate, recall, and precision. The experimental evaluation demonstrates that the method provides acceptable accuracy when classifying Web pages into the top-level categories of Yahoo! JAPAN.
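As a rough illustration of the two ends of this pipeline, the following minimal sketch builds co-occurrence attributes from tokenized pages and computes the three reported measures. It is not the authors' implementation: the abstract does not specify how co-occurrence is defined, so word pairs appearing together on at least `min_support` pages are an assumption made here, and the function names `cooccurrence_attributes`, `encode`, and `evaluate`, the toy pages, and the category labels are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_attributes(pages, min_support):
    """Return word pairs that co-occur on at least `min_support` pages.

    Assumption: "co-occurrence" means two words appearing on the same page.
    """
    pair_counts = Counter()
    for words in pages:
        # Count each unordered pair at most once per page.
        for pair in combinations(sorted(set(words)), 2):
            pair_counts[pair] += 1
    return sorted(p for p, c in pair_counts.items() if c >= min_support)


def encode(words, attributes):
    """Binary attribute vector: 1 if both words of a pair appear on the page."""
    present = set(words)
    return [int(a in present and b in present) for a, b in attributes]


def evaluate(true_labels, predicted_labels, category):
    """Error rate over all pages, plus recall and precision for one category."""
    pairs = list(zip(true_labels, predicted_labels))
    errors = sum(t != p for t, p in pairs)
    tp = sum(t == category and p == category for t, p in pairs)
    fn = sum(t == category and p != category for t, p in pairs)
    fp = sum(t != category and p == category for t, p in pairs)
    return {
        "error_rate": errors / len(pairs),
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Hypothetical toy data: three tokenized pages.
pages = [
    ["stock", "market", "news"],
    ["stock", "market", "bank"],
    ["movie", "music", "news"],
]
attributes = cooccurrence_attributes(pages, min_support=2)
vectors = [encode(p, attributes) for p in pages]

# Hypothetical true and predicted categories for four pages.
scores = evaluate(
    ["Business", "Business", "Entertainment", "Business"],
    ["Business", "Entertainment", "Entertainment", "Business"],
    category="Business",
)
```

The binary vectors produced by `encode` are the kind of input a decision-tree learner could be trained on; that training step is omitted here because the abstract does not say which learner or settings were used.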