This paper describes our newly developed Automated Industry and Occupation Coding System (AIOCS). The system classifies natural language responses to survey questionnaires into the equivalent numeric codes defined in the standard code book of the Korean National Statistics Office (KNSO). To improve performance on the automated industry/occupation coding task, we implemented the system with a range of automated classification techniques, including hand-crafted rules, a maximum entropy model, and information retrieval techniques. The result is a Web-based AIOCS that is publicly available through the KNSO Web site. Compared with the previous system developed in 2005, the new Web-based system reduces coding cost, runs faster, and shows significant improvements in production rate and accuracy. Its simple Web user interface also makes it easier to use in practice.
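As a rough illustration of how hand-crafted rules and information retrieval might be combined for this kind of coding task, the following minimal sketch tries rules first and falls back to nearest-neighbour retrieval over previously coded responses. It is not the AIOCS implementation: the rule table, the example responses, the codes, the threshold, and all function names are hypothetical, and the maximum entropy stage is omitted for brevity.

```python
import math
from collections import Counter

# Hypothetical hand-crafted rules: phrase -> code (made-up codes, not KNSO codes).
RULES = {"taxi driver": "D01", "nurse": "H02"}

# Hypothetical previously coded responses, used as a tiny retrieval index.
CODED_EXAMPLES = [
    ("teaches mathematics at a high school", "E10"),
    ("teaches english at a middle school", "E10"),
    ("repairs cars at a garage", "M20"),
]


def tokens(text):
    """Bag-of-words term counts for a whitespace-tokenised response."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def code_response(text, threshold=0.5):
    """Return (code, stage); code is None when the response needs manual coding."""
    lowered = text.lower()
    # Stage 1: hand-crafted rules take priority when a phrase matches.
    for phrase, code in RULES.items():
        if phrase in lowered:
            return code, "rule"
    # Stage 2: retrieve the most similar coded example and reuse its code.
    query = tokens(text)
    best_score, best_code = max(
        (cosine(query, tokens(example)), code) for example, code in CODED_EXAMPLES
    )
    if best_score >= threshold:
        return best_code, "retrieval"
    # Stage 3: similarity too low, so leave the response to a human coder.
    return None, "manual"


if __name__ == "__main__":
    for response in [
        "I work as a nurse in a clinic",
        "teaches mathematics at a school",
        "plays professional football",
    ]:
        print(response, "->", code_response(response))
```

In a cascade like this, raising the assumed `threshold` would send more responses to manual coding in exchange for fewer wrong automatic codes, which is one plausible way such a system trades the share of responses it codes automatically against accuracy.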