A maximal figure-of-merit learning approach to text categorization

Authors:
Sheng Gao;Wen Wu;Chin-Hui Lee;Tat-Seng Chua
Affiliations:
Institute for Infocomm Research, Singapore;National Univ. of Singapore, Singapore;National Univ. of Singapore, Singapore and Georgia Institute of Technology, Atlanta, GA;National Univ. of Singapore, Singapore
Venue:
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2003

Abstract

A novel maximal figure-of-merit (MFoM) learning approach to text categorization is proposed. Unlike conventional techniques, the MFoM method attempts to integrate any performance metric of interest (e.g., accuracy, recall, precision, or the F1 measure) into the design of any classifier. The classifier parameters are learned by optimizing an overall objective function of interest. To solve this highly nonlinear optimization problem, we use a generalized probabilistic descent algorithm. The MFoM learning framework is evaluated on the Reuters-21578 task with LSI-based feature extraction and a binary tree classifier. Experimental results indicate that the MFoM classifier gives improved F1 and enhanced robustness over the conventional classifier. It also outperforms the popular SVM method in micro-averaged F1. The framework could also be extended to design discriminative multiple-category MFoM classifiers for application scenarios with new performance metrics.
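The idea of folding a metric such as F1 into the training objective can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a linear scorer instead of the binary tree classifier, synthetic 2-D data instead of LSI features, a sigmoid with slope `alpha` as the smoothing function, and plain batch gradient ascent standing in for generalized probabilistic descent. All function and variable names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow in exp for extreme scores.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def smooth_f1(w, b, X, y, alpha=1.0):
    """Differentiable F1: hard decisions are replaced by sigmoid scores."""
    s = sigmoid(alpha * (X @ w + b))    # soft membership in the positive class
    tp = np.sum(s * y)                  # soft true positives
    denom = np.sum(s) + np.sum(y)       # equals 2*TP + FP + FN in soft counts
    return 2.0 * tp / denom

def smooth_f1_grad(w, b, X, y, alpha=1.0):
    """Analytic gradient of smooth_f1 with respect to w and b."""
    s = sigmoid(alpha * (X @ w + b))
    tp = np.sum(s * y)
    denom = np.sum(s) + np.sum(y)
    dF_ds = (2.0 * y * denom - 2.0 * tp) / denom**2   # quotient rule per sample
    dz = dF_ds * alpha * s * (1.0 - s)                # chain rule through sigmoid
    return X.T @ dz, np.sum(dz)

def train(X, y, steps=200, lr=0.5, alpha=1.0):
    """Batch gradient ascent on the smoothed F1 (assumed stand-in for GPD)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        gw, gb = smooth_f1_grad(w, b, X, y, alpha)
        w, b = w + lr * gw, b + lr * gb
    return w, b

# Synthetic, imbalanced two-class data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2.0, 1.0, size=(20, 2)),    # 20 positives
               rng.normal(-2.0, 1.0, size=(80, 2))])  # 80 negatives
y = np.concatenate([np.ones(20), np.zeros(80)])

f1_before = smooth_f1(np.zeros(2), 0.0, X, y)
w, b = train(X, y)
f1_after = smooth_f1(w, b, X, y)
print(f"smoothed F1 before: {f1_before:.3f}, after: {f1_after:.3f}")
```

The same pattern would apply to precision, recall, or accuracy: express the metric through soft counts so that it becomes differentiable in the classifier parameters, then follow its gradient.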