A number of linear classification methods, such as the linear least squares fit (LLSF), logistic regression, and support vector machines (SVMs), have been applied to text categorization problems. These methods are similar in that each finds a hyperplane that approximately separates a class of document vectors from its complement. However, support vector machines have so far been considered special because they have been shown to achieve state-of-the-art performance. It is therefore worthwhile to understand whether such good performance is unique to the SVM design or whether it can also be achieved by other linear classification methods. In this paper, we compare a number of known linear classification methods, as well as some variants, in the framework of regularized linear systems. We discuss the statistical and numerical properties of these algorithms, with a focus on text categorization. We also provide numerical experiments that illustrate these algorithms on a number of datasets.
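To make the shared structure concrete, the following is a minimal sketch, assuming one common way of writing these methods: each minimizes an average loss over the margin y(w·x) plus an L2 penalty, and they differ only in the loss. The abstract does not give the paper's exact formulations, so the loss definitions, the `train` and `predict` helpers, the parameter values, and the toy data are all illustrative assumptions rather than the authors' method.

```python
import math

# Derivative of each loss with respect to the margin m = y * (w . x).
# With labels in {-1, +1}, the squared error (w.x - y)^2 equals (1 - m)^2.
LOSS_GRADIENTS = {
    "least_squares": lambda m: -2.0 * (1.0 - m),            # loss (1 - m)^2
    "logistic":      lambda m: -1.0 / (1.0 + math.exp(m)),  # loss log(1 + e^-m)
    "hinge":         lambda m: -1.0 if m < 1.0 else 0.0,    # loss max(0, 1 - m)
}

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train(X, y, loss, lam=0.01, lr=0.1, epochs=500):
    """Minimize mean loss(y_i * w.x_i) + lam * ||w||^2 by (sub)gradient descent."""
    dloss = LOSS_GRADIENTS[loss]
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [2.0 * lam * wj for wj in w]      # gradient of the L2 penalty
        for xi, yi in zip(X, y):
            g = dloss(yi * dot(w, xi))           # loss slope at this margin
            for j in range(d):
                grad[j] += g * yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1

# Hypothetical toy "documents": [bias, count of term A, count of term B].
X = [[1, 2, 0], [1, 3, 1], [1, 0, 2], [1, 1, 3]]
y = [1, 1, -1, -1]

weights = {name: train(X, y, name) for name in LOSS_GRADIENTS}
for name, w in weights.items():
    print(name, [round(wj, 3) for wj in w], [predict(w, x) for x in X])
```

On this toy data all three losses yield a separating hyperplane. The sketch says nothing about their relative accuracy on real text collections, which is the question the paper addresses.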