Cost effective depression patient thought record categorization via self-taught learning

Authors:
Hua Wang;Heng Huang;Monica Basco;Molly Lopez;Fillia Makedon
Affiliations:
University of Texas at Arlington, TX;University of Texas at Arlington, TX;University of Texas at Arlington, TX;The University of Texas at Austin, TX;University of Texas at Arlington, TX
Venue:
Proceedings of the 4th International Conference on PErvasive Technologies Related to Assistive Environments
Year:
2011

Abstract

Automatic patient thought record (TR) categorization is important in Cognitive Behavior Therapy (CBT), a useful augmentation of standard clinical treatment for Major Depressive Disorder (MDD). Because both collecting and labeling TR data are expensive, it is cost prohibitive to require a large amount of TR data, together with the corresponding category labels, to train a classification model with high accuracy. In practice only a very limited amount of labeled and unlabeled TR training data is available, so traditional semi-supervised learning and transfer learning methods, the most commonly used strategies for dealing with a lack of training data in statistical learning, cannot work well for automatic TR categorization. Recognizing these challenges, in this paper we propose to approach the TR categorization problem from a new perspective via self-taught learning, an emerging topic in machine learning. Self-taught learning is a special type of transfer learning: instead of requiring labeled data from an auxiliary domain that are relevant to the classification task of interest, as traditional transfer learning methods do, it learns the inherent structures of the auxiliary data and does not require their labels. Consequently, we may learn a classifier from the limited amount of labeled TR texts and achieve decent classification accuracy with the assistance of a large amount of text data obtained from cheap, or even no-cost, resources. That is, a cost-effective TR categorization system can be built, which is of particular use in practical diagnosis and in the training of new therapists. We demonstrate the proposed method on the task of classifying real depression homework texts, where promising experimental results validate the effectiveness of our new method.
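The two-stage idea described in the abstract can be illustrated with a toy sketch: first learn structure from unlabeled auxiliary text, then train a classifier on a handful of labeled examples represented with that structure. This is not the method of the paper. As a deliberately simple stand-in for the learned representation, the sketch uses smoothed IDF word weights estimated from the unlabeled text, followed by a nearest-centroid classifier. All function names, example sentences, and the category names "negative" and "balanced" are hypothetical and invented purely for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Whitespace tokenizer; a real system would need proper text preprocessing.
    return text.lower().split()


def learn_idf(unlabeled_texts):
    """Stage 1: estimate word weights from unlabeled auxiliary text (no labels used)."""
    n_docs = len(unlabeled_texts)
    doc_freq = Counter()
    for text in unlabeled_texts:
        doc_freq.update(set(tokenize(text)))
    # Smoothed IDF: words that are common in the auxiliary text get low weight.
    idf = {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}
    # Words never seen in the auxiliary text get the maximum weight.
    default = math.log(1 + n_docs) + 1.0
    return idf, default


def featurize(text, idf, default):
    """Map a text to a unit-length sparse vector using the learned weights."""
    counts = Counter(tokenize(text))
    vec = {w: c * idf.get(w, default) for w, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {w: v / norm for w, v in vec.items()}


def train_centroids(labeled, idf, default):
    """Stage 2: average the feature vectors of the few labeled examples per class."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter()
    for text, label in labeled:
        counts[label] += 1
        for w, v in featurize(text, idf, default).items():
            sums[label][w] += v
    return {lab: {w: v / counts[lab] for w, v in vec.items()} for lab, vec in sums.items()}


def predict(text, centroids, idf, default):
    """Return the label whose centroid has the largest dot product with the text."""
    x = featurize(text, idf, default)

    def score(centroid):
        return sum(v * centroid.get(w, 0.0) for w, v in x.items())

    # Sorting the labels makes tie-breaking deterministic.
    return max(sorted(centroids), key=lambda lab: score(centroids[lab]))


# Hypothetical unlabeled auxiliary text (cheap to obtain, no labels).
AUX_TEXTS = [
    "i went to the store and i bought the bread",
    "the weather is nice and i feel fine",
    "i am at the office and the work is busy",
    "the bus is late and i am waiting",
]

# Hypothetical labeled examples (expensive to obtain, so only a few).
LABELED = [
    ("i am a failure and nothing will change", "negative"),
    ("i always ruin the day", "negative"),
    ("i made a mistake and i can learn", "balanced"),
    ("the day was hard and i can cope", "balanced"),
]

idf, default = learn_idf(AUX_TEXTS)
centroids = train_centroids(LABELED, idf, default)
print(predict("i am a failure", centroids, idf, default))
```

The point of the design is that the labels are used only in the second stage; everything learned in the first stage comes from text that costs little or nothing to collect. A representation learned by sparse coding or a similar unsupervised method would take the place of `learn_idf` and `featurize` in a more faithful version.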