Natural language processing and e-Government: crime information extraction from heterogeneous data sources

  • Authors: Chih Hao Ku; Alicia Iriberri; Gondy Leroy

  • Affiliations: Claremont Graduate University, Claremont, CA (all authors)

  • Venue: dg.o '08: Proceedings of the 2008 International Conference on Digital Government Research
  • Year: 2008

Abstract

Much information that could help solve and prevent crimes is never gathered because the reporting methods available to citizens and law enforcement personnel are not optimal. Detectives do not have sufficient time to interview crime victims and witnesses. Moreover, many victims and witnesses are too scared or embarrassed to report incidents. We are developing an interviewing system that will help collect such information. We report here on one component, the crime information extraction module, which uses natural language processing to extract crime information from police reports, newspaper articles, and victims' and witnesses' crime narratives. We tested our approach with two types of document: police and witness narrative reports. Our algorithms extract crime-related information, namely weapons, vehicles, time, people, clothes, and locations. We achieved high precision (96%) and recall (83%) for police narrative reports and comparable precision (93%) but somewhat lower recall (77%) for witness narrative reports. The difference in recall was significant at p
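
For readers unfamiliar with the two measures quoted in the abstract, the following is a minimal sketch of how precision and recall can be computed for an entity extractor. It is not the authors' evaluation code: the function name `precision_recall`, the exact-match criterion, and the sample entities are all assumptions made for illustration, and the paper may count matches differently.

```python
def precision_recall(extracted, gold):
    """Compute precision and recall over (category, text) pairs.

    Assumption: an extracted item counts as correct only if it matches
    a gold-standard item exactly in both category and text.
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Precision: share of extracted items that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of gold-standard items that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Invented example data, not taken from the paper.
gold = {
    ("weapon", "knife"),
    ("vehicle", "red truck"),
    ("location", "Main Street"),
    ("time", "9 pm"),
    ("people", "two men"),
}
extracted = {
    ("weapon", "knife"),
    ("vehicle", "red truck"),
    ("time", "9 pm"),
    ("clothes", "truck"),  # a spurious extraction
}

p, r = precision_recall(extracted, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case three of the four extracted items are correct and three of the five gold items are found, so precision exceeds recall, the same direction of gap the abstract reports for both document types.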