As in-vehicle speech interfaces emerge, handling a large point-of-interest (POI) vocabulary with isolated-word recognition has become one of the crucial problems for speech recognition systems on small, low-cost devices. Because the interface relies on an isolated-word approach, it needs every speech variation of each entry in the POI vocabulary. In Korean, a POI word is a compound noun written without spaces, so building these speech variations requires a method for compound noun segmentation and tagging. This paper proposes a technique for segmenting and tagging Korean POI compound nouns for that purpose. We use dynamic programming to segment each POI word and a maximum entropy model to assign a semantic tag to each sub-word of the POI compound noun. We show that these approaches can be applied to the generation of POI speech variations.
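To make the segmentation step concrete, the following is a minimal sketch of dynamic-programming segmentation over an unspaced compound noun. It is not the paper's implementation: the function name `segment`, the toy lexicon, and its per-entry costs are assumptions made purely for illustration (a real system would presumably derive costs from corpus statistics). The sketch picks the split whose sub-words all appear in the lexicon and whose total cost is lowest.

```python
import math

def segment(word, lexicon):
    """Return the lowest-cost split of `word` into lexicon entries, or None.

    `lexicon` maps a sub-word to a non-negative cost (hypothetical values;
    e.g. a negative log-probability estimated from a corpus).
    """
    n = len(word)
    best = [math.inf] * (n + 1)  # best[i]: minimal cost of segmenting word[:i]
    back = [0] * (n + 1)         # back[i]: start index of the last sub-word in word[:i]
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(end):
            piece = word[start:end]
            if piece not in lexicon or math.isinf(best[start]):
                continue
            cost = best[start] + lexicon[piece]
            if cost < best[end]:
                best[end] = cost
                back[end] = start

    if math.isinf(best[n]):
        return None  # no full segmentation exists with this lexicon

    # Follow the back-pointers from the end of the word to recover the split.
    pieces = []
    i = n
    while i > 0:
        pieces.append(word[back[i]:i])
        i = back[i]
    return pieces[::-1]

# Toy lexicon with made-up costs, for illustration only.
TOY_LEXICON = {
    "서울": 1.0,    # Seoul
    "시청": 1.0,    # city hall
    "역": 1.0,      # station
    "서울시": 1.5,  # Seoul city
    "시": 2.5,
    "청": 3.0,
}

result = segment("서울시청역", TOY_LEXICON)
```

With this toy lexicon, `segment("서울시청역", TOY_LEXICON)` selects 서울 / 시청 / 역 (total cost 3.0) over 서울시 / 청 / 역 (total cost 5.5). Each resulting sub-word would then be passed to a semantic tagger, which the abstract describes as a maximum entropy model; that step is not sketched here because it depends on trained feature weights.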