Optimizing convenient online access to bibliographic databases
Information Services and Use
Term-weighting approaches in automatic text retrieval
Information Processing and Management: an International Journal
Document retrieval systems
Readings in information retrieval
Readings in information retrieval
Inductive learning algorithms and representations for text categorization
Proceedings of the seventh international conference on Information and knowledge management
Automatic essay grading using text categorization techniques
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
A re-examination of text categorization methods
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Feature selection in SVM text categorization
AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference innovative applications of artificial intelligence
Machine learning in automated text categorization
ACM Computing Surveys (CSUR)
A Simple Decomposition Method for Support Vector Machines
Machine Learning
Text classification using ESC-based stochastic decision lists
Information Processing and Management: an International Journal
Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval
ECML '98 Proceedings of the 10th European Conference on Machine Learning
Text Categorization with Suport Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
A Performance Evaluation of Automatic Survey Classifiers
ICGI '98 Proceedings of the 4th International Colloquium on Grammatical Inference
On the algorithmic implementation of multiclass kernel-based vector machines
The Journal of Machine Learning Research
Automated scoring using a hybrid feature identification technique
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Hi-index | 0.00 |
Survey coding is the task of assigning a symbolic code from a predefined set of such codes to the answer given in response to an open-ended question in a questionnaire (aka survey). This task is usually carried out to group respondents according to a predefined scheme based on their answers. Survey coding has several applications, especially in the social sciences, ranging from the simple classification of respondents to the extraction of statistics on political opinions, health and lifestyle habits, customer satisfaction, brand fidelity, and patient satisfaction. Survey coding is a difficult task, because the code that should be attributed to a respondent based on the answer she has given is a matter of subjective judgment, and thus requires expertise. It is thus unsurprising that this task has traditionally been performed manually, by trained coders. Some attempts have been made at automating this task, most of them based on detecting the similarity between the answer and textual descriptions of the meanings of the candidate codes. We take a radically new stand, and formulate the problem of automated survey coding as a text categorization problem, that is, as the problem of learning, by means of supervised machine learning techniques, a model of the association between answers and codes from a training set of precoded answers, and applying the resulting model to the classification of new answers. In this article we experiment with two different learning techniques: one based on naive Bayesian classification, and the other one based on multiclass support vector machines, and test the resulting framework on a corpus of social surveys. The results we have obtained significantly outperform the results achieved by previous automated survey coding approaches.