Profile-based methods have been used successfully to classify unstructured texts. This paper presents a profile-based method for classifying Wikipedia XML documents. The profiles are built using negative category information, and the approach exploits the structure of Wikipedia documents to build them. Two class profiles are built: one based on the whole content and the other based on the initial description of each Wikipedia document. We also explore using the terms in section and subsection titles. The effectiveness of cosine and fractional similarity measures in classifying XML documents is compared. Experiments show that combining the two profile-based classifiers works better than either classifier alone.
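
As a rough illustration of the general idea, the sketch below builds one term-frequency profile per category from the whole content and another from the initial description, scores a document against both with cosine similarity, and combines the two scores. It is a minimal sketch under stated assumptions, not the method of the paper: the whitespace tokenizer, the raw term-frequency weighting, the weighted-sum combination, the `weight` parameter, the toy data, and all function names are hypothetical. The use of negative category information and the fractional similarity measure are not specified in the abstract and are not modelled here.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector (assumption: whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_profiles(labeled_texts):
    """Sum the term vectors of each category's texts: {category: Counter}."""
    profiles = {}
    for category, text in labeled_texts:
        profiles.setdefault(category, Counter()).update(term_vector(text))
    return profiles


def classify(doc_full, doc_intro, full_profiles, intro_profiles, weight=0.5):
    """Score each category with both profiles and combine the two scores.

    Assumption: the classifiers are combined by a weighted sum; the
    abstract does not say how the combination is actually done.
    """
    full_vec = term_vector(doc_full)
    intro_vec = term_vector(doc_intro)
    scores = {}
    for category in full_profiles:
        s_full = cosine(full_vec, full_profiles[category])
        s_intro = cosine(intro_vec, intro_profiles.get(category, Counter()))
        scores[category] = weight * s_full + (1 - weight) * s_intro
    return max(scores, key=scores.get), scores


# Toy training data (hypothetical): (category, text) pairs.
full_train = [
    ("sports", "football match goal team league"),
    ("science", "physics energy particle quantum theory"),
]
intro_train = [
    ("sports", "football team"),
    ("science", "physics theory"),
]

full_profiles = build_profiles(full_train)
intro_profiles = build_profiles(intro_train)

label, scores = classify(
    "the team scored a goal in the match",  # whole content
    "a football match",                     # initial description
    full_profiles,
    intro_profiles,
)
print(label, scores)
```

Setting `weight` to 1.0 or 0.0 reduces the sketch to a single-profile classifier, which gives a simple way to compare the combined classifier against each individual one on the same data.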