An automated oil palm literature text summarizer framework

  • Authors:
  • Hamimah Ujir; Nurfauza Jali; Daniel Tan Yong Wen; Sharin Hazlin Huspi; Syarifah Fazlin Seyed Fadzir; Stephanie Chua Hui Li

  • Affiliations:
  • Universiti Malaysia Sarawak, Malaysia; Universiti Malaysia Sarawak, Malaysia; Universiti Malaysia Sarawak, Malaysia; Universiti Malaysia Sarawak, Malaysia; Universiti Malaysia Sarawak, Malaysia; Universiti Malaysia Sarawak, Malaysia

  • Venue:
  • ACST '08 Proceedings of the Fourth IASTED International Conference on Advances in Computer Science and Technology
  • Year:
  • 2008

Abstract

Most existing summarization tools are general-purpose summarizers; domain-specific summarizers, such as those for medical [14] and legal [15] documents, are rare. This paper describes a framework for automatic summary generation in one specific domain: oil palm literature. An oil palm corpus is developed to support the whole framework. The work is based on two different paradigms, extraction and abstraction. By incorporating both methods in one summarization framework, the quality of the produced summary is expected to improve greatly. A Nearly-New Information Extraction system (ANNIE) is used as the backbone of the extraction process. The sentences are then ranked for potential inclusion in the summary using a weighted word frequency known as Term Frequency-Inverse Document Frequency (TF-IDF). In the abstraction process, the oil palm corpus is used to support the summarization procedure. With the training corpus, the output is expected to be more precise and to gather the important facts from the pre-determined information retrieval process.
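
The TF-IDF sentence ranking step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it does not use ANNIE, and the function names (`tokenize`, `rank_sentences`), the smoothed IDF formula, and the choice to score a sentence by the mean TF-IDF weight of its words are all assumptions made for illustration. The `corpus` argument stands in for a background collection such as the oil palm corpus.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rank_sentences(sentences, corpus):
    """Rank sentences by the mean TF-IDF weight of their words.

    sentences: sentences of the document to summarize.
    corpus:    background documents (strings) used for document frequency.
    Returns a list of (score, sentence) pairs, highest score first.
    """
    sent_tokens = [tokenize(s) for s in sentences]
    # Term frequency is counted over the whole document being summarized.
    tf = Counter(tok for toks in sent_tokens for tok in toks)
    corpus_sets = [set(tokenize(d)) for d in corpus]
    n_docs = len(corpus_sets)

    def idf(term):
        # Smoothed IDF (an assumption) so unseen terms do not divide by zero.
        df = sum(1 for doc in corpus_sets if term in doc)
        return math.log((1 + n_docs) / (1 + df)) + 1.0

    scored = []
    for sent, toks in zip(sentences, sent_tokens):
        if toks:
            score = sum(tf[t] * idf(t) for t in toks) / len(toks)
        else:
            score = 0.0
        scored.append((score, sent))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    background = [
        "the cat sat on the mat",
        "the dog ate the bone",
        "the study of the soil",
    ]
    document = ["Ganoderma infects oil palm.", "The dog sat on the mat."]
    for score, sentence in rank_sentences(document, background):
        print(f"{score:.3f}  {sentence}")
```

In this sketch, words that are rare in the background corpus receive a higher weight, so sentences carrying domain-specific terms tend to rank above sentences made of common words. A summarizer would then keep the top-ranked sentences up to some length limit.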