Most current methods for automatic text categorization rely on supervised learning and therefore require a large number of labeled training instances to build an accurate classifier. To address this problem, this paper proposes a new semi-supervised method for text categorization that automatically extracts unlabeled examples from the Web and applies an enriched self-training approach to construct the classifier. Although the method is language independent, it is most relevant in scenarios where large sets of labeled resources do not exist, as is the case for many application domains in non-English languages such as Spanish. The method was evaluated experimentally on three different tasks and in two different languages. The results demonstrate the applicability and usefulness of the proposed method.
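
As an illustration of the general idea, the following is a minimal sketch of plain self-training with a small multinomial Naive Bayes classifier. It is not the enriched variant proposed in the paper, whose details the abstract does not give. The Web extraction step is not modeled; the unlabeled snippets are simply assumed to be available as a list. All names (`train_nb`, `predict_nb`, `self_train`), the confidence threshold of 0.8, the iteration limit, and the toy documents are hypothetical choices made for this example.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model; returns raw counts and the vocabulary."""
    word_counts = defaultdict(Counter)   # class -> word frequencies
    class_counts = Counter(labels)       # class -> number of documents
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict_nb(model, text):
    """Return (best_class, posterior probability of best_class)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for token in text.lower().split():
            if token in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing over the vocabulary.
                score += math.log(
                    (word_counts[label][token] + 1) / (total_words + len(vocab))
                )
        log_scores[label] = score
    # Turn log scores into normalized posterior probabilities.
    top = max(log_scores.values())
    exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
    norm = sum(exp_scores.values())
    best = max(exp_scores, key=exp_scores.get)
    return best, exp_scores[best] / norm


def self_train(labeled_docs, labels, unlabeled_docs, threshold=0.8, max_iter=5):
    """Generic self-training: move confidently classified documents into the training set."""
    docs, labs = list(labeled_docs), list(labels)
    pool = list(unlabeled_docs)
    for _ in range(max_iter):
        model = train_nb(docs, labs)
        remaining, added = [], 0
        for text in pool:
            label, confidence = predict_nb(model, text)
            if confidence >= threshold:
                docs.append(text)      # accept the predicted label as if it were true
                labs.append(label)
                added += 1
            else:
                remaining.append(text)
        pool = remaining
        if added == 0:                 # nothing confident enough: stop early
            break
    return train_nb(docs, labs), pool


# Toy data (hypothetical); real input would be snippets retrieved from the Web.
labeled = ["goal match team score", "election vote senate law"]
labels = ["sports", "politics"]
unlabeled = ["team score goal league", "senate vote law budget", "weather"]

model, leftover = self_train(labeled, labels, unlabeled)
print(predict_nb(model, "league goal")[0])
print(leftover)
```

In this sketch, a document stays in the unlabeled pool whenever the classifier's posterior for its best class is below the threshold, so uninformative snippets are never added to the training set. A stricter threshold adds fewer but cleaner examples, while a looser one grows the training set faster at the risk of reinforcing early mistakes.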