Hierarchical Text Categorization (HTC) is the task of generating text classifiers, usually by means of supervised learning algorithms, that operate on hierarchically structured classification schemes. Although most large classification schemes for text have a hierarchical structure, text classification research has so far focused mostly on algorithms for "flat" classification, i.e. algorithms that operate on non-hierarchical classification schemes. When applied to a hierarchical classification problem, these algorithms cannot take advantage of the information inherent in the class hierarchy, and may thus be suboptimal in terms of efficiency and/or effectiveness. In this paper we propose TreeBoost.MH, a multi-label HTC algorithm that is a hierarchical variant of AdaBoost.MH, a very well-known member of the family of "boosting" learning algorithms. TreeBoost.MH embodies several intuitions that had already arisen within HTC, e.g. that both feature selection and the selection of negative training examples should be performed "locally", i.e. by paying attention to the topology of the classification scheme. It also embodies the novel intuition that the weight distribution that boosting algorithms update at every boosting round should likewise be updated "locally". TreeBoost.MH realizes all these intuitions in a simple and elegant way: it is defined as a recursive algorithm that uses AdaBoost.MH as its base step and recurs over the tree structure. We present the results of experiments with TreeBoost.MH on three HTC benchmarks, and analytically discuss its computational cost.
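The recursive structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Node`, `tree_train`, and `counting_learner` are hypothetical, and the base step is a placeholder that only counts examples, standing in for the point where AdaBoost.MH would be invoked. The sketch also assumes one plausible reading of "local" selection of negative examples: at each internal node, the training set contains only documents that belong to at least one child subtree, so the negatives for a child are the positives of its siblings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

# Hypothetical document type: (text, set of category names it belongs to).
Doc = Tuple[str, FrozenSet[str]]


@dataclass
class Node:
    """A category in the classification tree (hypothetical helper type)."""
    name: str
    children: List["Node"] = field(default_factory=list)


def subtree_labels(node: Node) -> Set[str]:
    """Return the names of all categories in the subtree rooted at `node`."""
    labels = {node.name}
    for child in node.children:
        labels |= subtree_labels(child)
    return labels


def tree_train(node: Node, docs: List[Doc], base_learner: Callable,
               models: Dict[str, object] = None) -> Dict[str, object]:
    """Recur over the tree, calling `base_learner` once per internal node."""
    if models is None:
        models = {}
    if not node.children:  # leaves have no children to discriminate among
        return models
    scope = {c.name: subtree_labels(c) for c in node.children}
    # Local training set (assumption): documents under at least one child.
    local = [d for d in docs if any(d[1] & s for s in scope.values())]
    # Multi-label targets: one boolean per child category and document.
    targets = [{name: bool(d[1] & s) for name, s in scope.items()}
               for d in local]
    models[node.name] = base_learner([d[0] for d in local], targets)
    for child in node.children:
        # Only documents under this child are passed further down the tree.
        child_docs = [d for d in local if d[1] & scope[child.name]]
        tree_train(child, child_docs, base_learner, models)
    return models


def counting_learner(texts: List[str], targets: List[Dict[str, bool]]):
    """Placeholder base step; AdaBoost.MH would be trained here instead."""
    names = targets[0].keys() if targets else []
    return {"n_docs": len(texts),
            "positives": {n: sum(t[n] for t in targets) for n in names}}


# Toy hierarchy and corpus, invented purely for illustration.
tree = Node("root", [Node("sports", [Node("soccer"), Node("tennis")]),
                     Node("politics")])
corpus: List[Doc] = [
    ("goal", frozenset({"soccer"})),
    ("serve", frozenset({"tennis"})),
    ("vote", frozenset({"politics"})),
    ("match", frozenset({"soccer", "tennis"})),
]
models = tree_train(tree, corpus, counting_learner)
```

In this toy run the classifier at the `sports` node is trained on three documents only: the `politics` document never reaches it, which is the effect that local selection of negative examples is meant to have. Because each node gets its own call to the base step, a real base learner would also perform its feature selection and maintain its weight distribution on that local training set.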