Mining term networks from text collections for crime investigation

  • Authors:
  • Yuen-Hsien Tseng;Zih-Ping Ho;Kai-Sheng Yang;Chun-Cheng Chen

  • Affiliations:
  • Information Technology Center, National Taiwan Normal University, No. 162, Sec. 1, Heping East Road, Taipei City, Taiwan, ROC;Information Technology Center, National Taiwan Normal University, No. 162, Sec. 1, Heping East Road, Taipei City, Taiwan, ROC;Department of Forensic Science, Central Police University, Taiwan, ROC;Criminal Intelligence Center, Criminal Investigation Bureau, National Police Agency, Minister of the Interior, Taipei City, Taiwan, ROC

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2012

Quantified Score

Hi-index 12.05

Abstract

An efficient term mining method for building a general term network is presented. The resulting term network can be used for entity relation visualization and exploration, which is useful in many text-mining applications, such as crime exploration and investigation based on large volumes of crime news or official criminal records. In the proposed method, terms are first identified in each document of a text collection and then analyzed for pairwise association weights. The weights are accumulated over all documents to obtain the final similarity for each term pair. Based on the resulting term similarities, a general term network for the collection is built with terms as nodes and non-zero similarities as links. In application, a list of predefined terms with similar attributes is selected to extract the desired sub-network from the general term network for entity relation visualization. This text analysis scenario, based on collective terms of a similar type or from the same topic, enables evidence-based relation exploration. Some practical instances of crime exploration and investigation are demonstrated. Our application examples show that term relations, be they causality, subordination, coupling, or others, can be effectively revealed by our method and easily verified against the underlying text collection. This work contributes an integrated term-relationship mining and exploration approach and demonstrates the feasibility of applying the term network to the increasingly important task of crime exploration and investigation.