Research on extracting subject from Chinese text (poster session)

  • Authors:
  • Han Kesong; Wang Yongcheng; Wu Fangfang

  • Affiliations:
  • Shanghai Jiaotong University, Department of Computer Science, Shanghai, P.R. China (all three authors)

  • Venue:
  • IRAL '00 Proceedings of the fifth international workshop on Information retrieval with Asian languages
  • Year:
  • 2000

Abstract

Because natural languages are flexible and diverse, extracting the subject of a text is one of the most difficult but important tasks in natural language processing (NLP). Owing to the unique linguistic and grammatical structures of Chinese, we can currently adopt only non-semantic approaches to extract the subject from Chinese text. This paper presents three such approaches: the first is based on a component-word dictionary, the second on a subject-word dictionary, and the third on a statistical method. We describe the process of each approach. To test them, we developed three independent systems and designed a comparison experiment. The experimental results are illuminating: every system can extract the text's subject to some extent, but the approaches may need to be combined to obtain better results.