Survey coding is the task of assigning a symbolic code from a predefined set of such codes to the answer given in response to an open-ended question in a questionnaire (aka survey). We formulate the problem of automated survey coding as a text categorization problem, i.e., as the problem of learning, by means of supervised machine learning techniques, a model of the association between answers and codes from a training set of pre-coded answers, and applying the resulting model to the classification of new answers. In this paper we experiment with two different learning techniques, one based on naïve Bayesian classification and the other one based on multiclass support vector machines, and test the resulting framework on a corpus of social surveys. The results we have obtained significantly outperform the results achieved by previous automated survey coding approaches.
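The formulation above (learn a model from pre-coded answers, then apply it to new answers) can be illustrated with a minimal sketch. This is not the implementation used in the paper: it assumes a multinomial naive Bayes model with add-one smoothing and a whitespace tokenizer. The codes `COST` and `SERVICE`, the toy answers, and the function names `tokenize`, `train_naive_bayes`, and `classify` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_naive_bayes(coded_answers):
    """Collect the counts a multinomial naive Bayes model needs
    from an iterable of (answer, code) pairs."""
    doc_counts = Counter()              # number of training answers per code
    term_counts = defaultdict(Counter)  # term frequencies per code
    vocabulary = set()
    for answer, code in coded_answers:
        tokens = tokenize(answer)
        doc_counts[code] += 1
        term_counts[code].update(tokens)
        vocabulary.update(tokens)
    return doc_counts, term_counts, vocabulary


def classify(model, answer):
    """Return the code with the highest log-posterior for a new answer."""
    doc_counts, term_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_code, best_score = None, float("-inf")
    for code in sorted(doc_counts):
        # Log prior: share of training answers carrying this code.
        score = math.log(doc_counts[code] / total_docs)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denominator = sum(term_counts[code].values()) + len(vocabulary)
        for token in tokenize(answer):
            if token in vocabulary:  # terms unseen in training are ignored
                score += math.log((term_counts[code][token] + 1) / denominator)
        if score > best_score:
            best_code, best_score = code, score
    return best_code


# Hypothetical pre-coded answers to an open-ended survey question.
TRAINING = [
    ("the price is too high", "COST"),
    ("too expensive for me", "COST"),
    ("staff were rude and slow", "SERVICE"),
    ("slow service at the counter", "SERVICE"),
]

model = train_naive_bayes(TRAINING)
print(classify(model, "very expensive price"))
print(classify(model, "rude staff"))
```

The multiclass support vector machine variant mentioned in the abstract would replace the counting model with a learned margin-based classifier over the same answer-to-code training pairs; it is omitted here to keep the sketch dependency-free.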