Modeling topic dependencies in hierarchical text categorization
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
The top-down method is efficient and commonly used in hierarchical text classification. Its main drawback is the propagation of errors from higher to lower nodes. To address this issue, we propose an efficient incremental reranking model of the top-down classifier decisions. We build a multiclassifier for each hierarchy node, consisting of the node itself and its children. We then generate several classification hypotheses with these classifiers and rerank them to select the best one. Our rerankers exploit category dependencies, which allow them to recover from multiclassifier errors, while their top-down application keeps the method efficient. Experiments on Reuters Corpus Volume 1 (RCV1) show that our incremental reranking is as accurate as global rerankers but at least one order of magnitude faster.
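To make the procedure concrete, the Python below is a minimal, hypothetical sketch of incremental top-down reranking, not the authors' implementation. The names multiclassifiers, rerank, and beam are assumptions: each node's multiclassifier is taken to return scored child categories for a document, and rerank is taken to rescore whole hypotheses with category-dependency features and return them best-first.

from typing import Callable, Dict, List, Sequence, Tuple

# A hypothesis is a path of category labels plus its cumulative score.
Hypothesis = Tuple[List[str], float]

def classify_top_down(
    doc: str,
    root: str,
    children: Dict[str, List[str]],                                   # hierarchy: node -> child categories
    multiclassifiers: Dict[str, Callable[[str], List[Tuple[str, float]]]],  # assumed per-node scorers
    rerank: Callable[[str, Sequence[Hypothesis]], List[Hypothesis]],   # assumed dependency-based reranker
    beam: int = 4,
) -> List[str]:
    # Start with a single hypothesis at the root of the hierarchy.
    hypotheses: List[Hypothesis] = [([root], 0.0)]
    expanding = True
    while expanding:
        expanding = False
        candidates: List[Hypothesis] = []
        for path, score in hypotheses:
            node = path[-1]
            kids = children.get(node, [])
            if not kids:
                candidates.append((path, score))  # leaf: nothing left to expand
                continue
            expanding = True
            # The node's multiclassifier proposes scored child categories.
            for child, s in multiclassifiers[node](doc):
                candidates.append((path + [child], score + s))
        # Incremental reranking: rescore the current hypotheses using
        # category dependencies, then prune to the beam before descending,
        # so a locally wrong top-down decision can still be corrected.
        hypotheses = rerank(doc, candidates)[:beam]
    return hypotheses[0][0]  # best path after the final reranking

Because the reranker is applied node by node rather than once over all global hypotheses, each call sees only a small candidate set, which is where the efficiency advantage over a global reranker would come from under these assumptions.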