This paper describes an application of information retrieval techniques to the automated classification of industry and occupation codes for Korean Census records. The purpose of the proposed system is to convert natural language responses on survey questionnaires into the corresponding numeric codes defined in the standard code book from the Census Bureau. The system was evaluated on 46,762 industry records and 36,286 occupation records using 10-fold cross-validation. In the experiments, the system achieved production rates of 87.08% and 66.08% when classifying industry records into level 2 and level 5 codes, respectively. In semi-automated mode, it achieved production rates of 99.10% and 92.88% for level 2 and level 5 codes, respectively.
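As a rough illustration of retrieval-based code assignment, the sketch below ranks candidate codes for a free-text response by cosine similarity to previously coded responses. It is not the system described in the abstract: the term weighting (raw term counts), the whitespace tokenizer, the function names, and the toy training records with their codes are all assumptions made for this example. Taking the top-ranked code corresponds loosely to a fully automated mode, while showing several ranked candidates to a human coder corresponds loosely to a semi-automated mode.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_codes(query, training, top_k=3):
    """Rank codes by the best similarity of any training record with that code."""
    query_vec = Counter(tokenize(query))
    best = defaultdict(float)
    for text, code in training:
        score = cosine(query_vec, Counter(tokenize(text)))
        best[code] = max(best[code], score)
    # Sort by descending score, breaking ties by code for determinism.
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Hypothetical coded responses; the codes are invented for illustration only.
TRAINING = [
    ("rice farming", "A01"),
    ("vegetable farming", "A01"),
    ("car parts manufacturing", "C30"),
    ("elementary school teaching", "P85"),
]

candidates = rank_codes("organic rice farming", TRAINING)
top_code = candidates[0][0]  # automated mode: accept the best-ranked code
print(top_code, candidates)  # semi-automated mode: show candidates to a coder
```

In a 10-fold cross-validation setup such as the one mentioned above, the coded records would be split into ten parts, with each part in turn used as queries against the remaining nine.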