Extracting Semantic Frames from Thai Medical-Symptom Phrases with Unknown Boundaries

  • Authors:
  • Peerasak Intarapaiboon;Ekawit Nantajeewarawat;Thanaruk Theeramunkong

  • Affiliations:
  • School of Information and Computer Technology, Sirindhorn International Institute of Technology, Thammasat University, Pathumthani, Thailand;School of Information and Computer Technology, Sirindhorn International Institute of Technology, Thammasat University, Pathumthani, Thailand;School of Information and Computer Technology, Sirindhorn International Institute of Technology, Thammasat University, Pathumthani, Thailand

  • Venue:
  • ASWC '08 Proceedings of the 3rd Asian Semantic Web Conference on The Semantic Web
  • Year:
  • 2008

Abstract

Because language-processing tools for the Thai language are limited, pattern-based information extraction from Thai documents requires supplementary techniques. We present a framework, based on sliding-window rule application and extraction filtering, for extracting semantic information from medical-symptom phrases with unknown boundaries in Thai free-text information entries. A supervised rule learning algorithm automatically constructs information extraction rules from hand-tagged training symptom phrases. Two filtering components are introduced: one uses a classification model to predict rule application across a symptom-phrase boundary; the other uses extraction distances observed during rule learning to resolve conflicts arising from overlapping-frame extractions. Our experimental study focuses on two basic types of symptom phrasal descriptions: one concerns abnormal characteristics of some observable entities, and the other concerns human-body locations at which symptoms appear. The experimental results show that the filtering components improve precision while preserving recall satisfactorily.
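
As a rough illustration of the two ideas named in the abstract (sliding-window rule application, and resolving overlapping extractions with a distance score), here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the rule format, the window size, or how extraction distances are computed, so the names `Extraction`, `sliding_window_extract`, `resolve_overlaps`, the two toy rules, and the convention "lower distance wins" are all assumptions made for this example. The classification-based boundary filter is left out. English tokens stand in for segmented Thai text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Extraction:
    start: int               # index of the first token covered by the window
    end: int                 # index one past the last token covered
    frame: Dict[str, str]    # slot name -> extracted value
    distance: float          # hypothetical score; lower is preferred (assumption)


# A rule inspects one window and returns (frame, distance), or None if it does not match.
Rule = Callable[[List[str]], Optional[Tuple[Dict[str, str], float]]]


def sliding_window_extract(tokens: List[str], rules: List[Rule],
                           window_size: int) -> List[Extraction]:
    """Apply every rule to every fixed-size window over the token sequence."""
    candidates = []
    last_start = max(0, len(tokens) - window_size)
    for start in range(last_start + 1):
        window = tokens[start:start + window_size]
        for rule in rules:
            result = rule(window)
            if result is not None:
                frame, distance = result
                candidates.append(
                    Extraction(start, start + len(window), frame, distance))
    return candidates


def resolve_overlaps(candidates: List[Extraction]) -> List[Extraction]:
    """Greedy conflict resolution: keep the lowest-distance candidate first,
    then drop any later candidate whose span overlaps one already kept."""
    kept: List[Extraction] = []
    for cand in sorted(candidates, key=lambda c: (c.distance, c.start)):
        if all(cand.end <= k.start or cand.start >= k.end for k in kept):
            kept.append(cand)
    return sorted(kept, key=lambda c: c.start)


# Two toy rules (hypothetical; real rules would be learned from tagged phrases).
def symptom_at_location(window: List[str]):
    if len(window) == 3 and window[0] == "pain" and window[1] == "in":
        return {"symptom": window[0], "location": window[2]}, 0.0
    return None


def location_only(window: List[str]):
    if len(window) == 3 and window[0] == "in":
        return {"location": window[1]}, 1.0
    return None


tokens = "patient has pain in chest since monday".split()
candidates = sliding_window_extract(tokens, [symptom_at_location, location_only], 3)
resolved = resolve_overlaps(candidates)
```

In this toy run the two rules fire on overlapping windows, and the greedy step keeps only the candidate with the smaller distance. A real system would replace the hand-written rules with learned ones and would need a Thai word segmenter upstream, since Thai text does not mark word boundaries with spaces.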