Personal and Ubiquitous Computing
Automatic patient thought record (TR) categorization is important in Cognitive Behavior Therapy (CBT), a useful augmentation of standard clinical treatment for Major Depressive Disorder (MDD). Because both collecting and labeling TR data are expensive, it is cost-prohibitive to gather the large amount of TR data, together with the corresponding category labels, needed to train a classification model with high accuracy. Because in practice only a very limited amount of labeled and unlabeled TR training data is available, traditional semi-supervised learning and transfer learning methods, the strategies most commonly used to cope with scarce training data in statistical learning, do not work well for automatic TR categorization. Recognizing these challenges, in this paper we propose to approach the TR categorization problem from a new perspective via self-taught learning, an emerging topic in machine learning. Self-taught learning is a special type of transfer learning: instead of requiring labeled data from an auxiliary domain relevant to the classification task of interest, as traditional transfer learning methods do, it learns the inherent structure of the auxiliary data and does not require their labels. Consequently, we can learn a classifier from the limited amount of labeled TR texts that achieves decent classification accuracy, with assistance from the large amount of text data obtained from cheap, or even no-cost, resources. That is, a cost-effective TR categorization system can be built, which is of particular use in practical diagnosis and in training new therapists. We demonstrate the proposed method on the task of classifying real depression homework texts, where promising experimental results validate its effectiveness.
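The general self-taught learning recipe described above (learn structure from unlabeled auxiliary text, then train a classifier on the few labeled texts in that learned representation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy sentences, the placeholder labels `category_a` and `category_b`, the helper names `fit_self_taught` and `predict`, and the use of scikit-learn's TF-IDF features with non-negative matrix factorization as a stand-in for the structure-learning step. It is not the model proposed in the paper.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical unlabeled auxiliary texts (cheap to collect, no labels needed).
UNLABELED = [
    "I always fail at everything I try",
    "nothing I do ever works out",
    "everyone thinks I am boring",
    "people must think I am stupid",
    "I never do anything right at work",
    "they think I am not good enough",
]

# Hypothetical small labeled set with placeholder category names.
LABELED = [
    "I always fail and never do anything right",
    "everyone must think I am stupid",
    "nothing ever works out for me",
    "people think I am boring",
]
LABELS = ["category_a", "category_b", "category_a", "category_b"]


def fit_self_taught(unlabeled, labeled, labels, n_components=3):
    """Learn a basis from unlabeled text, then fit a classifier on labeled text."""
    # Step 1: build the vocabulary and learn latent structure from unlabeled data only.
    vectorizer = TfidfVectorizer()
    x_unlabeled = vectorizer.fit_transform(unlabeled)
    basis = NMF(n_components=n_components, init="nndsvda",
                random_state=0, max_iter=500)
    basis.fit(x_unlabeled)

    # Step 2: re-express the labeled texts in terms of the learned basis.
    z_labeled = basis.transform(vectorizer.transform(labeled))

    # Step 3: train an ordinary supervised classifier on the new features.
    classifier = LogisticRegression().fit(z_labeled, labels)
    return vectorizer, basis, classifier


def predict(model, texts):
    """Map new texts through the learned basis and classify them."""
    vectorizer, basis, classifier = model
    return classifier.predict(basis.transform(vectorizer.transform(texts)))


model = fit_self_taught(UNLABELED, LABELED, LABELS)
predictions = predict(model, ["I always fail at everything"])
```

The point of the sketch is the data flow: the labels are used only in the last step, so the amount of labeled text can stay small while the representation is learned from whatever unlabeled text is available.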