Fuzzy evolutionary optimization modeling and its applications to unsupervised categorization and extractive summarization

  • Authors:
  • Wei Song; Lim Cheon Choi; Soon Cheol Park; Xiao Feng Ding

  • Affiliations:
  • School of IOT Engineering, Jiangnan University, Wuxi 214122, China and Department of Electronics and Information Engineering, Chonbuk National University, Jeonju 561756, South Korea; Department of Electronics and Information Engineering, Chonbuk National University, Jeonju 561756, South Korea; Department of Electronics and Information Engineering, Chonbuk National University, Jeonju 561756, South Korea; School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2011

Quantified Score

Hi-index 12.06

Abstract

Modern information retrieval (IR) systems comprise many challenging components, such as clustering and summarization. Rather than requiring users to browse an entire dataset, IR systems now present clusters of documents that match the users' interests and briefly summarize each document, which makes it easier to find the desired documents. This paper proposes fuzzy evolutionary optimization modeling (FEOM) and its applications to unsupervised categorization and extractive summarization. Drawing on the nature of biological evolution, we use several fuzzy control parameters to adaptively regulate the behavior of the evolutionary optimization, which can effectively prevent premature convergence to a locally optimal solution. As a portable, modular and extensively executable model, FEOM is first implemented for clustering text documents. The search capability of FEOM is exploited to explore appropriate partitions of documents such that the similarity metric of the resulting clusters is optimized. To further investigate its effectiveness as a generic data clustering model, FEOM is then applied to sentence-clustering-based extractive document summarization. It selects the most important sentence from each cluster to represent the overall meaning of the document. We demonstrate the improved performance through a series of experiments on standard test sets (the Reuters document collection, the 20-newsgroup corpus, DUC01 and DUC02), evaluated with commonly used metrics, namely F-measure and ROUGE. The experimental results show that FEOM performs as well as or better than state-of-the-art clustering and summarization systems.
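
The sentence-selection step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how "the most important sentence" is scored, so the code below assumes, purely for illustration, that importance means cosine similarity to the cluster centroid. The function names `cosine` and `pick_representatives`, the toy vectors, and the precomputed cluster labels (which would come from a clustering step such as FEOM) are all hypothetical.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pick_representatives(vectors, labels):
    """Return one sentence index per cluster, in original document order.

    Assumption (not from the paper): the "most important" sentence of a
    cluster is the one whose vector is most similar to the cluster centroid.
    """
    # Group sentence indices by their cluster label.
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)

    chosen = []
    for members in clusters.values():
        dim = len(vectors[members[0]])
        # Centroid = component-wise mean of the member vectors.
        centroid = [
            sum(vectors[i][d] for i in members) / len(members)
            for d in range(dim)
        ]
        best = max(members, key=lambda i: cosine(vectors[i], centroid))
        chosen.append(best)
    return sorted(chosen)


# Toy example: six sentences as 2-D term-weight vectors, two clusters.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.6, 0.4],
           [0.0, 1.0], [0.2, 0.8], [0.4, 0.6]]
labels = [0, 0, 0, 1, 1, 1]
summary_indices = pick_representatives(vectors, labels)
```

Returning the indices in document order keeps the extracted sentences in their original sequence, which is a common choice for extractive summaries; other importance scores could be swapped in by changing the `key` function.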