Efficient appointment information extraction from short messages in mobile devices with limited hardware resources

Authors:
Choong-Nyoung Seon;Harksoo Kim;Jungyun Seo
Affiliations:
Department of Computer Science, Sogang University, Republic of Korea;Department of Computer and Communications Engineering, College of IT, Kangwon National University, Chuncheon-si, Kangwon-do 200-701, Republic of Korea;Department of Computer Science & Interdisciplinary Program of Integrated Biotechnology, Sogang University, Republic of Korea
Venue:
Pattern Recognition Letters
Year:
2011



Abstract

With the rapid evolution of the mobile environment, demand for information extraction on mobile devices is increasing. This paper proposes an information extraction system designed for mobile devices with limited hardware resources. The proposed system extracts temporal instances (dates and times) and named instances (locations and titles) from Korean short messages in an appointment management domain. To efficiently extract temporal instances, which have a limited number of surface forms, the system uses well-refined finite state automata. To effectively extract named instances, which have varied surface forms, under limited hardware resources, the system uses a modified hidden Markov model (HMM) based on character n-grams. In an experiment on instance boundary labeling, the proposed system showed performance comparable to that of representative conventional classifiers. The proposed system was implemented on a commercial mobile phone to test its ability to automatically extract appointment information from a short message and store it in a schedule database. The system performed well with a reasonable response time.
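
To make the finite-state idea concrete, the following is a minimal sketch of a table-driven automaton that picks clock times such as "19:30" out of a message. It is not the paper's automata, which are not described in this record. The "H:MM" pattern, the state names, and the helper functions `char_class`, `match_at`, and `extract_times` are all assumptions chosen for illustration.

```python
# A small deterministic finite-state automaton (DFA) that accepts clock
# times such as "7:05" or "19:30". The pattern and the state names are
# illustrative assumptions, not the automata used in the paper.
DIGIT, COLON, OTHER = "digit", "colon", "other"

TRANSITIONS = {
    ("start", DIGIT): "hour1",
    ("hour1", DIGIT): "hour2",
    ("hour1", COLON): "colon",
    ("hour2", COLON): "colon",
    ("colon", DIGIT): "min1",
    ("min1", DIGIT): "min2",
}
ACCEPTING = {"min2"}


def char_class(ch):
    """Map a character to the input symbol the DFA reads."""
    if ch in "0123456789":
        return DIGIT
    if ch == ":":
        return COLON
    return OTHER


def match_at(text, start):
    """Run the DFA from `start`; return the end index of a match, or None."""
    state = "start"
    pos = start
    while pos < len(text):
        nxt = TRANSITIONS.get((state, char_class(text[pos])))
        if nxt is None:
            break  # no transition: stop consuming input
        state = nxt
        pos += 1
    return pos if state in ACCEPTING else None


def extract_times(text):
    """Scan left to right and collect non-overlapping (start, end, text) matches."""
    results = []
    i = 0
    while i < len(text):
        end = match_at(text, i)
        if end is not None:
            hour, minute = text[i:end].split(":")
            # Range check applied after the DFA accepts the surface form.
            if int(hour) < 24 and int(minute) < 60:
                results.append((i, end, text[i:end]))
                i = end
                continue
        i += 1
    return results
```

For example, `extract_times("Meet at 19:30 near the station")` yields one match covering the substring "19:30". A table-driven design like this keeps memory use small and predictable, which fits the resource constraint the abstract emphasizes, though the real system would need many more patterns, including Korean date and time expressions.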