Representation and learning in information retrieval
Representation and learning in information retrieval
A re-examination of text categorization methods
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Text Classification from Labeled and Unlabeled Documents using EM
Machine Learning - Special issue on information retrieval
A statistical learning model of text classification for support vector machines
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
A study of thresholding strategies for text categorization
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Machine learning in automated text categorization
ACM Computing Surveys (CSUR)
Bayesian online classifiers for text classification and filtering
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Text Categorization with Support Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
Partially Supervised Classification of Text Documents
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Transductive Inference for Text Classification using Support Vector Machines
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
A Machine Learning Approach to Building Domain-Specific Search Engines
IJCAI '99 Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence
PEBL: positive example based learning for Web page classification using SVM
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
One-class svms for document classification
The Journal of Machine Learning Research
Uniform object generation for optimizing one-class classifiers
The Journal of Machine Learning Research
Training ν-Support Vector Classifiers: Theory and Algorithms
Neural Computation
Neural Computation
SVMC: single-class classification with support vector machines
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Automatic new topic identification using multiple linear regression
Information Processing and Management: an International Journal
Computer Methods and Programs in Biomedicine
The link-prediction problem for social networks
Journal of the American Society for Information Science and Technology
Learning Bayesian classifiers from positive and unlabeled examples
Pattern Recognition Letters
Mutually beneficial learning with application to on-line news classification
Proceedings of the ACM first Ph.D. workshop in CIKM
Using the shape recovery method to evaluate indexing techniques
Journal of the American Society for Information Science and Technology
Automatic record linkage using seeded nearest neighbour and support vector machine classification
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Imbalanced text classification: A term weighting approach
Expert Systems with Applications: An International Journal
PORE: positive-only relation extraction from wikipedia text
ISWC'07/ASWC'07 Proceedings of the 6th international The semantic web and 2nd Asian conference on Asian semantic web conference
Measuring the interestingness of articles in a limited user environment
Information Processing and Management: an International Journal
ACS'06 Proceedings of the 6th WSEAS international conference on Applied computer science
A pairwise ranking based approach to learning with positive and unlabeled examples
Proceedings of the 20th ACM international conference on Information and knowledge management
Leveraging one-class SVM and semantic analysis to detect anomalous content
ISI'05 Proceedings of the 2005 IEEE international conference on Intelligence and Security Informatics
Sampling the Web as Training Data for Text Classification
International Journal of Digital Library Systems
Learning from data streams with only positive and unlabeled data
Journal of Intelligent Information Systems
Most existing studies of text classification assume that the training data are completely labeled. In reality, however, many information retrieval problems can be more accurately described as learning a binary classifier from a set of incompletely labeled examples, where we typically have a small number of labeled positive examples and a very large number of unlabeled examples. In this paper, we study this problem of Text Classification WithOut labeled Negative data (TC-WON). We explore an efficient extension of the standard Support Vector Machine (SVM) approach, called SVMC (Support Vector Mapping Convergence) [17], for TC-WON tasks. Our analyses show that when the positive training data are not too under-sampled, SVMC significantly outperforms other methods, because SVMC exploits the natural "gap" between positive and negative documents in the feature space, which in turn improves generalization performance. In the text domain, many such gaps are likely to exist because a document is usually mapped to a sparse, high-dimensional feature space. However, as the number of positive training examples decreases, the SVMC boundary starts overfitting at some point and ends up producing very poor results. This is because, when the positive training examples are too few, the boundary over-iterates and trespasses the natural gaps between the positive and negative classes in the feature space, and thus ends up fitting tightly around the few positive training examples.
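The iterative behaviour described in the abstract can be illustrated with a small toy sketch. It is not the SVMC algorithm of [17]: a nearest-centroid rule stands in for the SVM, the seed negatives are simply the unlabeled points farthest from the positives (an assumed heuristic), and the function name `mapping_convergence`, the parameters `n_seed_negatives` and `max_iters`, and the 2-D data are all hypothetical. It only shows the general shape of a loop that repeatedly harvests negatives from the unlabeled set until the boundary stops moving.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def mapping_convergence(positives, unlabeled, n_seed_negatives=2, max_iters=20):
    """Toy positive/unlabeled loop; nearest-centroid stands in for the SVM."""
    pos_center = centroid(positives)

    # Mapping stage (assumed heuristic): treat the unlabeled points farthest
    # from the positive centroid as initial "strong" negatives.
    ranked = sorted(unlabeled, key=lambda p: dist(p, pos_center), reverse=True)
    negatives = ranked[:n_seed_negatives]
    remaining = ranked[n_seed_negatives:]

    # Convergence stage: refit the boundary, move every remaining point that
    # falls on the negative side into the negative set, and repeat until no
    # new negatives are found (or the hypothetical iteration cap is reached).
    for _ in range(max_iters):
        neg_center = centroid(negatives)
        newly_negative = [
            p for p in remaining if dist(p, neg_center) < dist(p, pos_center)
        ]
        if not newly_negative:
            break
        negatives = negatives + newly_negative
        remaining = [p for p in remaining if p not in newly_negative]

    # Whatever is left on the positive side is predicted positive.
    return remaining, negatives


# Labeled positives cluster near the origin; the unlabeled set mixes hidden
# positives with negatives that sit beyond a gap in the feature space.
positives = [(0.0, 0.0), (0.5, 0.2), (-0.3, 0.4), (0.2, -0.5)]
unlabeled = [
    (0.4, -0.1), (-0.2, -0.3), (0.1, 0.6),           # hidden positives
    (5.0, 5.0), (6.0, 5.0), (5.0, 6.0),              # far negatives
    (4.0, 4.5), (3.5, 3.8), (3.0, 3.0),              # negatives nearer the gap
]

predicted_pos, predicted_neg = mapping_convergence(positives, unlabeled)
print(sorted(predicted_pos))
print(len(predicted_neg))
```

In this toy setting the loop stops once the negative side has absorbed every point beyond the gap. With very few labeled positives, a loop of this kind has less to hold the boundary in place, which is the over-iteration failure the abstract describes.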