Forensic investigation of OOXML format documents

  • Authors:
  • Zhangjie Fu;Xingming Sun;Yuling Liu;Bo Li

  • Affiliations:
  • College of Information Science and Engineering, Hunan University, 410082 Changsha, China;College of Information Science and Engineering, Hunan University, 410082 Changsha, China and Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technol ...;College of Information Science and Engineering, Hunan University, 410082 Changsha, China;College of Information Science and Engineering, Hunan University, 410082 Changsha, China

  • Venue:
  • Digital Investigation: The International Journal of Digital Forensics & Incident Response
  • Year:
  • 2011

Abstract

MS Office documents can be illegally copied by offenders, and forensic investigators still face great difficulty in investigating and tracking the source of these illegal copies. This paper proposes a forensic method that uses the unique value of the revision identifier (RI) to determine the source of suspicious electronic documents. The method applies to electronic documents in the Office Open XML (OOXML) format, such as those produced by MS Office 2007, Mac Office 2008 and MS Office 2010. Because the RI values extracted from a document are unique, forensic investigators can determine whether a suspicious document and another document come from the same source. Experiments demonstrate that, for a copy of an electronic document, even if attackers delete or reformat all the original characters, forensic examiners can still determine that the copy and the original document come from the same source by examining the RI values. The same holds if attackers merely copy some characters from the original document into a newly created document: as long as one character remains whose original formatting has not been cleared, forensic examiners can determine that the two documents come from the same source using the same method. This paper also presents methods for extracting time information and creator information from OOXML files, which can be used to determine who the real copyright holder is when a copyright dispute occurs.
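The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the RI corresponds to the `rsid*` attributes and `w:rsid` entries that WordprocessingML stores in `word/document.xml` and `word/settings.xml`, and that time and creator information is read from `docProps/core.xml`. The function names `extract_rsids`, `shared_rsids` and `read_core_properties` are hypothetical, and treating any shared value as evidence of a common source is a simplification of the idea summarized above.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs used by OOXML packages.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC_NS = "http://purl.org/dc/elements/1.1/"
DCTERMS_NS = "http://purl.org/dc/terms/"


def _local(name):
    """Strip the '{namespace}' prefix from an ElementTree tag or attribute name."""
    return name.rsplit("}", 1)[-1]


def extract_rsids(docx):
    """Collect revision identifier values from a .docx path or file-like object.

    Assumption: RI values are the rsid* attributes (rsidR, rsidRPr, ...) plus
    the <w:rsid>/<w:rsidRoot> entries listed in word/settings.xml.
    """
    rsids = set()
    with zipfile.ZipFile(docx) as zf:
        names = set(zf.namelist())
        for part in ("word/document.xml", "word/settings.xml"):
            if part not in names:
                continue
            root = ET.fromstring(zf.read(part))
            for elem in root.iter():
                for attr, value in elem.attrib.items():
                    if _local(attr).startswith("rsid"):
                        rsids.add(value.upper())
                if _local(elem.tag) in ("rsid", "rsidRoot"):
                    value = elem.get(f"{{{W_NS}}}val")
                    if value:
                        rsids.add(value.upper())
    return rsids


def shared_rsids(docx_a, docx_b):
    """Return the RI values that appear in both documents."""
    return extract_rsids(docx_a) & extract_rsids(docx_b)


def read_core_properties(docx):
    """Read creator and time fields from docProps/core.xml, if present."""
    with zipfile.ZipFile(docx) as zf:
        if "docProps/core.xml" not in zf.namelist():
            return {}
        root = ET.fromstring(zf.read("docProps/core.xml"))
    fields = {
        "creator": f"{{{DC_NS}}}creator",
        "lastModifiedBy": f"{{{CP_NS}}}lastModifiedBy",
        "created": f"{{{DCTERMS_NS}}}created",
        "modified": f"{{{DCTERMS_NS}}}modified",
    }
    # Missing fields are reported as None rather than guessed.
    return {name: root.findtext(tag) for name, tag in fields.items()}
```

For example, calling `shared_rsids("original.docx", "suspect.docx")` (hypothetical file names) returns the set of identifiers common to both files. Under the assumptions above, a non-empty result suggests a common editing history, while `read_core_properties` reports the creator and timestamps recorded in each package. These metadata fields can be edited, so they are best read alongside the RI comparison rather than on their own.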