A re-examination of text categorization methods
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Similarity Learning for Nearest Neighbor Classification
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
Overview of CLEF 2008 INFILE pilot track
CLEF'08 Proceedings of the 9th Cross-language evaluation forum conference on Evaluating systems for multilingual and multimodal information access
Information filtering evaluation: overview of CLEF 2009 INFILE track
CLEF'09 Proceedings of the 10th cross-language evaluation forum conference on Multilingual information access evaluation: text retrieval experiments
We propose a batch algorithm for learning category-specific thresholds in a multiclass setting where a document can belong to more than one class. The algorithm uses the k-nearest neighbor algorithm to filter the 100,000 documents into 50 profiles. The experiments were run on the English corpus and gave a macro precision of 0.256 and a macro recall of 0.295. We participated in the online task of INFILE 2008, where we used an online algorithm driven by feedback from the server. Compared with INFILE 2008, the macro recall is significantly better in 2009 (0.295 vs. 0.260), whereas the macro precision is lower (0.256 vs. 0.306 in 2008). Furthermore, the anticipation was 0.43 in 2009, compared with 0.307 in 2008. We also provide a detailed comparison between the batch and online algorithms.
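The abstract does not spell out how the thresholds are learned, so the following is only a minimal sketch of the general idea of category-specific thresholds on top of k-nearest-neighbor scores, not the authors' algorithm. It rests on several assumptions that are not in the source: documents are sparse term-weight dictionaries, similarity is cosine, a category's score is the sum of similarities of the k nearest neighbors carrying that label, and each threshold is chosen to maximize F1 on leave-one-out scores over the training set. All function names (`knn_scores`, `learn_thresholds`, `assign`) are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors (dict: term -> weight)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_scores(doc, train, k):
    """Score each category by summing the similarities of the k nearest
    training documents that carry that category (assumed scoring rule)."""
    ranked = sorted(
        ((cosine(doc, vec), labels) for vec, labels in train),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    scores = {}
    for sim, labels in ranked:
        for c in labels:
            scores[c] = scores.get(c, 0.0) + sim
    return scores


def learn_thresholds(train, categories, k):
    """Batch step: for each category, pick the score threshold that
    maximizes F1 on leave-one-out k-NN scores (assumed criterion)."""
    scored = []
    for i, (vec, labels) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        scored.append((knn_scores(vec, rest, k), labels))

    thresholds = {}
    for c in categories:
        candidates = sorted({s.get(c, 0.0) for s, _ in scored if s.get(c, 0.0) > 0})
        best_t, best_f1 = float("inf"), 0.0  # inf means "never assign"
        for t in candidates:
            tp = sum(1 for s, l in scored if s.get(c, 0.0) >= t and c in l)
            fp = sum(1 for s, l in scored if s.get(c, 0.0) >= t and c not in l)
            fn = sum(1 for s, l in scored if s.get(c, 0.0) < t and c in l)
            f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
            if f1 > best_f1:
                best_t, best_f1 = t, f1
        thresholds[c] = best_t
    return thresholds


def assign(doc, train, thresholds, k):
    """Return every category whose k-NN score reaches its own threshold,
    so a document may receive zero, one, or several categories."""
    scores = knn_scores(doc, train, k)
    return {c for c, s in scores.items() if s >= thresholds.get(c, float("inf"))}


# Toy usage with made-up data (two categories instead of 50 profiles).
train = [
    ({"goal": 1, "match": 1}, {"sport"}),
    ({"goal": 1, "team": 1}, {"sport"}),
    ({"vote": 1, "election": 1}, {"politics"}),
    ({"vote": 1, "party": 1}, {"politics"}),
    ({"match": 1, "vote": 1}, {"sport", "politics"}),
]
thresholds = learn_thresholds(train, ["sport", "politics"], k=2)
print(assign({"goal": 1, "match": 1, "team": 1}, train, thresholds, k=2))
```

Keeping one threshold per category, rather than a single global cut-off, lets rare and frequent categories trade precision against recall independently, which is the motivation the abstract gives for learning them per category.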