Extracting Semantic Frames from Thai Medical-Symptom Phrases with Unknown Boundaries

Authors:
Peerasak Intarapaiboon;Ekawit Nantajeewarawat;Thanaruk Theeramunkong
Affiliations:
School of Information and Computer Technology, Sirindhorn International Institute of Technology, Thammasat University, Pathumthani, Thailand;School of Information and Computer Technology, Sirindhorn International Institute of Technology, Thammasat University, Pathumthani, Thailand;School of Information and Computer Technology, Sirindhorn International Institute of Technology, Thammasat University, Pathumthani, Thailand
Venue:
ASWC '08 Proceedings of the 3rd Asian Semantic Web Conference on The Semantic Web
Year:
2008

Abstract

Due to the limitations of language-processing tools for the Thai language, pattern-based information extraction from Thai documents requires supplementary techniques. Based on sliding-window rule application and extraction filtering, we present a framework for extracting semantic information from medical-symptom phrases with unknown boundaries in Thai free-text information entries. A supervised rule learning algorithm is employed to construct information extraction rules automatically from hand-tagged training symptom phrases. Two filtering components are introduced: one uses a classification model to predict whether a rule has been applied across a symptom-phrase boundary; the other uses extraction distances observed during rule learning to resolve conflicts arising from overlapping-frame extractions. In our experimental study, we focus on two basic types of symptom-phrase descriptions: one concerns abnormal characteristics of observable entities, and the other concerns human-body locations at which symptoms appear. The experimental results show that the filtering components improve precision while keeping recall at a satisfactory level.
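As a rough illustration of the two ideas named in the abstract, sliding-window rule application and filtering of overlapping extractions, here is a minimal Python sketch. It is not the authors' implementation. The `Rule` and `Extraction` types, the token-set patterns, the English placeholder tokens, and the use of a single per-rule `distance` score (lower is preferred) are all assumptions made for illustration; the abstract does not specify how rules are represented or how extraction distances are computed. The boundary-classification filter is not sketched.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    name: str
    # One (slot name or None, allowed tokens) pair per window position.
    pattern: list[tuple[Optional[str], set[str]]]
    # Hypothetical stand-in for an "extraction distance"; lower is preferred.
    distance: float


class Extraction(NamedTuple):
    rule: str
    start: int   # index of the first token covered
    end: int     # index one past the last token covered
    frame: dict
    distance: float


def apply_rule(rule: Rule, tokens: list[str]) -> list[Extraction]:
    """Slide a window of the rule's width over the tokens and collect matches."""
    width = len(rule.pattern)
    found = []
    for start in range(len(tokens) - width + 1):
        frame = {}
        for token, (slot, allowed) in zip(tokens[start:start + width], rule.pattern):
            if token not in allowed:
                break  # this window does not match; move to the next position
            if slot is not None:
                frame[slot] = token
        else:
            found.append(Extraction(rule.name, start, start + width, frame, rule.distance))
    return found


def resolve_overlaps(extractions: list[Extraction]) -> list[Extraction]:
    """Greedily keep the lowest-distance extractions whose spans do not overlap."""
    kept: list[Extraction] = []
    for cand in sorted(extractions, key=lambda e: (e.distance, e.start)):
        if all(cand.end <= k.start or cand.start >= k.end for k in kept):
            kept.append(cand)
    return sorted(kept, key=lambda e: e.start)


# Placeholder rules and tokens (English stand-ins, not taken from the paper).
RULES = [
    Rule("abnormal", [("attribute", {"red", "swollen"}), ("entity", {"rash", "skin"})], 0.2),
    Rule("location", [(None, {"on", "at"}), ("side", {"left", "right"}),
                      ("body_part", {"arm", "leg"})], 0.3),
    Rule("loose", [("entity", {"rash"}), (None, {"on"})], 0.9),
]
TOKENS = ["has", "red", "rash", "on", "left", "arm", "and", "fever"]

candidates = [e for rule in RULES for e in apply_rule(rule, TOKENS)]
for extraction in resolve_overlaps(candidates):
    print(extraction.rule, extraction.frame)
```

In this toy input the `loose` rule matches a span that overlaps both of the other extractions, so the greedy filter discards it in favor of the two lower-distance frames. A real system would replace the greedy heuristic with whatever distance-based criterion the rule learner provides.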