One way to retrieve information from the Internet is to classify web pages automatically. Feature selection is an important issue in almost all published classification methods. Although many feature selection methods have been proposed, most of them focus on the features within a category and ignore the fact that the hierarchy of categories also plays an important role in achieving accurate classification results. This paper proposes a new feature selection method that incorporates hierarchical information, so that the classification process does not have to visit every node in the hierarchy. Our test results show that our classification algorithm using hierarchical information reduces the search complexity from n to log(n) and increases accuracy by 6.2% compared with a related algorithm.
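To illustrate why using the hierarchy can avoid visiting every node, here is a minimal sketch of a generic top-down descent over a category tree. It is not the method proposed in the paper, whose details are not given in this abstract. The `Category` class, the keyword-overlap `score` function, and the example categories and feature words are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    name: str
    features: set  # hypothetical feature words selected for this node
    children: list = field(default_factory=list)


def score(node, words):
    # Toy score (an assumption): count of the node's feature words in the document.
    return len(node.features & words)


def classify_top_down(root, words):
    """Follow the best-scoring child at each level, starting from the root.

    Returns the path of category names and the number of nodes scored.
    """
    path, scored, node = [root.name], 0, root
    while node.children:
        scored += len(node.children)
        node = max(node.children, key=lambda c: score(c, words))
        path.append(node.name)
    return path, scored


def count_nodes(node):
    # Total number of nodes in the subtree rooted at `node`.
    return 1 + sum(count_nodes(c) for c in node.children)


def build_example():
    # Hypothetical two-level hierarchy; names and features are invented.
    sports = Category("Sports", {"game", "team", "score"}, [
        Category("Soccer", {"goal", "league"}),
        Category("Tennis", {"racket", "serve"}),
    ])
    science = Category("Science", {"research", "theory", "experiment"}, [
        Category("Physics", {"quantum", "particle"}),
        Category("Biology", {"cell", "gene"}),
    ])
    return Category("Root", set(), [sports, science])


root = build_example()
doc = {"research", "experiment", "quantum", "particle", "energy"}
path, scored = classify_top_down(root, doc)
print(path, scored, count_nodes(root))
```

In this toy tree the descent scores 4 nodes, whereas a flat classifier would score all 6 non-root categories. On a balanced tree with branching factor b and n nodes, a descent of this kind scores roughly b * log_b(n) nodes, which is the sort of saving the abstract describes. The trade-off is that a wrong choice near the root cannot be corrected further down.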