An intelligent technique to detect file formats and e-mail spam

  • Authors:
  • Ranaganayakulu Dhanalakshmi;Chenniappan Chellappan

  • Affiliations:
  • Anna University, Chennai, Tamil Nadu, India;Anna University, Chennai, Tamil Nadu, India

  • Venue:
  • Proceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India
  • Year:
  • 2010

Abstract

With the growing importance of privacy, security, and the prudent use of computational resources, the technologies that serve these goals increasingly face the problem of file type detection. In digital forensics, numerous file formats are in use. Criminals have started either using non-standard file formats or changing the extensions of files when storing them or transmitting them over a network, which makes recovering data from these files difficult. The file name extension that indicates the file type is stored in the disk directory, but when a file is deleted, its directory entry may be overwritten, making its type difficult to identify; this is a serious issue in computer forensics. If a file fragment contains header information that identifies its type, the problem may be solved. However, it is difficult to identify the type of a fragment taken from the middle of a file, and identification becomes more complex still when the header information has been deleted or is unavailable. This paper focuses on identifying file types in the various scenarios in which a malicious user changes the file type in order to send confidential or sensitive information (for example, an .exe file, which Gmail blocks, can be converted to an acceptable format and sent). E-mail spam has become an epidemic problem that can negatively affect the usability of electronic mail as a means of communication. Besides wasting users' time and effort in scanning and deleting the massive amount of junk e-mail received, it consumes network bandwidth and storage space, slows down e-mail servers, and provides a medium for distributing harmful or offensive content. Inspired by the success of fuzzy similarity in text classification and document retrieval, the proposed approach investigates its effectiveness in filtering spam based on the textual content of e-mail messages.
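
The header-based case mentioned in the abstract can be illustrated with a small sketch. This is not the authors' method, which the abstract does not detail; it only shows the common idea of comparing a file's leading bytes against well-known signatures ("magic numbers") and flagging a mismatch with the claimed extension. The signature table and the function names `detect_type` and `extension_matches` are assumptions made for this example, and the approach does nothing for headerless fragments.

```python
from pathlib import Path

# Illustrative signature table (an assumption for this sketch, not from the paper).
# Each entry maps a file type to the byte sequences its header may start with.
SIGNATURES = {
    "exe": [b"MZ"],
    "pdf": [b"%PDF"],
    "png": [b"\x89PNG\r\n\x1a\n"],
    "gif": [b"GIF87a", b"GIF89a"],
    "jpg": [b"\xff\xd8\xff"],
    "zip": [b"PK\x03\x04"],
}


def detect_type(header: bytes):
    """Return the type whose signature prefixes `header`, or None if unknown."""
    for file_type, magics in SIGNATURES.items():
        if any(header.startswith(magic) for magic in magics):
            return file_type
    return None


def extension_matches(filename: str, header: bytes) -> bool:
    """True only if the header signature is known and agrees with the extension."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext == "jpeg":  # treat .jpeg and .jpg as the same type
        ext = "jpg"
    detected = detect_type(header)
    return detected is not None and detected == ext
```

For instance, an executable renamed to `holiday.jpg` still begins with the bytes `MZ`, so `extension_matches("holiday.jpg", b"MZ\x90\x00")` returns `False`. A fragment taken from the middle of a file has no such header, which is why the abstract treats that case as harder.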