Using WordNet to disambiguate word senses for text retrieval
SIGIR '93 Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval
Generalized vector spaces model in information retrieval
SIGIR '85 Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval
Machine learning in automated text categorization
ACM Computing Surveys (CSUR)
IEEE Intelligent Systems
Text Representation: From Vector to Tensor
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Categorization of wikipedia articles with spectral clustering
IDEAL'11 Proceedings of the 12th international conference on Intelligent data engineering and automated learning
Hi-index | 0.00 |
Wikipedia - the Free Encyclopedia encounters the problem of proper classification of new articles everyday. The process of assignment of articles to categories is performed manually and it is a time consuming task. It requires knowledge about Wikipedia structure, which is beyond typical editor competence, which leads to human-caused mistakes - omitting or wrong assignments of articles to categories. The article presents application of SVM classifier for automatic classification of documents from The Free Encyclopedia. The classifier application has been tested while using two text representations: inter-documents connections (hyperlinks) and word content. The results of the performed experiments evaluated on hand crafted data show that the Wikipedia classification process can be partially automated. The proposed approach can be used for building a decision support system which suggests editors the best categories that fit new content entered to Wikipedia.