With the rapid evolution of the mobile environment, demand for information extraction from mobile devices is increasing. This paper proposes an information extraction system designed for mobile devices with limited hardware resources. The system extracts temporal instances (dates and times) and named instances (locations and titles) from Korean short messages in an appointment management domain. Because temporal instances take only a limited number of surface forms, the system extracts them efficiently with well-refined finite state automata. Named instances take many surface forms, so the system extracts them with a modified hidden Markov model (HMM) based on character n-grams, which remains effective within the limited hardware resources. In an experiment on instance boundary labeling, the system performed comparably to representative conventional classifiers. The system was implemented on a commercial mobile phone to test its ability to automatically extract appointment information from a short message and store it in a schedule database. It performed well, with a reasonable response time.
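As a rough illustration of why a small set of finite-state patterns can cover temporal instances, the sketch below matches a few Korean date and time surface forms. It is not the system's automata, which the abstract does not specify: the two patterns, the function name `extract_temporal`, and the sample message are all assumptions made for illustration, and Python regular expressions stand in for hand-built automata.

```python
import re

# Hypothetical patterns, assumed for illustration only.
# 월 = month, 일 = day, 오전/오후 = AM/PM, 시 = hour, 분 = minute.
DATE = re.compile(r"(?:(\d{1,2})월\s*)?(\d{1,2})일")
TIME = re.compile(r"(?:(오전|오후)\s*)?(\d{1,2})시(?:\s*(\d{1,2})분)?")


def extract_temporal(message):
    """Return (kind, start, end, text) spans for date and time expressions."""
    spans = []
    for kind, pattern in (("date", DATE), ("time", TIME)):
        for match in pattern.finditer(message):
            spans.append((kind, match.start(), match.end(), match.group(0)))
    # Order spans by where they begin in the message.
    return sorted(spans, key=lambda span: span[1])


if __name__ == "__main__":
    # Sample message (assumed): "March 5, 3:30 PM, meeting at Gangnam Station"
    for span in extract_temporal("3월 5일 오후 3시 30분 강남역에서 회의"):
        print(span)
```

Patterns like these only handle the closed set of temporal forms. The location and title in the sample message vary too freely to enumerate, which is the case the abstract assigns to the character n-gram HMM.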