Text Classification Using a Graph of Terms
CISIS '12 Proceedings of the 2012 Sixth International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS)
Supervised text classifiers need many labeled examples to achieve high accuracy. In practice, however, sufficient labeled examples are not always available, because human labeling is enormously time-consuming. For this reason, there has been recent interest in methods that can reach high accuracy when the training set is small. In this paper we introduce a new single-label text classification method that performs better than baseline methods when the number of labeled examples is small. Unlike most existing methods, which use a feature vector composed of weighted words, the proposed approach uses a structured feature vector composed of weighted pairs of words. This feature vector is learned automatically from a set of documents, using a global term extraction method based on Latent Dirichlet Allocation, implemented as a probabilistic topic model. Experiments performed using a small percentage of the original training set (about 1%) confirmed this expectation.
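To make the idea of a feature vector built from weighted word pairs concrete, the following is a minimal sketch, not the method of the paper. It swaps in plain document-level co-occurrence frequency for the Latent Dirichlet Allocation based term extraction, and the names `pair_features` and `score`, the `top_k` parameter, and the whitespace tokenization are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_features(docs, top_k=5):
    """Return the top_k word pairs, weighted by document co-occurrence frequency.

    Assumption: co-occurrence counts stand in for the LDA-derived weights
    described in the abstract; this is a simplification, not the paper's method.
    """
    counts = Counter()
    for doc in docs:
        # Unique lowercase tokens per document, sorted so that each
        # unordered pair has a single canonical (a, b) representation.
        tokens = sorted(set(doc.lower().split()))
        counts.update(combinations(tokens, 2))
    n_docs = len(docs)
    # Weight = fraction of documents in which both words appear.
    return {pair: c / n_docs for pair, c in counts.most_common(top_k)}


def score(doc, features):
    """Sum the weights of the feature pairs whose two words both occur in doc."""
    tokens = set(doc.lower().split())
    return sum(w for (a, b), w in features.items() if a in tokens and b in tokens)
```

A classifier in this spirit might learn one such set of pairs per category from the few labeled documents available and assign a new document to the category whose pairs give the highest `score`; that decision rule is likewise an assumption here.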