An N-Gram and STF-IDF model for masquerade detection in a UNIX environment

  • Authors:
  • Dai Geng;Tomohiro Odaka;Jousuke Kuroiwa;Hisakazu Ogura

  • Affiliations:
  • Graduate School of Engineering, University of Fukui, Fukui, Japan;Graduate School of Engineering, University of Fukui, Fukui, Japan;Graduate School of Engineering, University of Fukui, Fukui, Japan;Graduate School of Engineering, University of Fukui, Fukui, Japan

  • Venue:
  • Journal in Computer Virology
  • Year:
  • 2011

Abstract

A masquerader is someone who impersonates another user and operates a computer system with privileged access. The computer security problems caused by masqueraders are serious. Although anomaly detection is considered the best way to detect masqueraders, it is still in the research phase because of its low detection probability and high error rate. Thus far, a number of methods, such as the Support Vector Machine (SVM), the Hidden Markov Model (HMM), and the Naïve Bayes (N. Bayes) classifier, have been investigated in order to improve detection accuracy. In the present paper, a method that integrates data mining and natural language processing, namely N-Gram_Square root Term Frequency-Inverse Document Frequency (N-Gram_STF-IDF), is proposed. In the proposed method, the sequences to be examined are segmented into N-Grams, and anomalous users are then detected using an STF-IDF classifier. We perform experiments on the Schonlau and Greenberg data sets and compare the results of the proposed method with those obtained using various other methods.
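The two-stage idea in the abstract (N-Gram segmentation followed by STF-IDF scoring) can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the term weight is taken to be sqrt(count) times a smoothed IDF, a user profile is the sum of the weighted training vectors, and a test block is scored by cosine similarity against that profile. All function names, the toy command sequences, and the threshold value are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def ngrams(commands, n=2):
    """Return the contiguous n-grams (as tuples) of a command sequence."""
    return [tuple(commands[i:i + n]) for i in range(len(commands) - n + 1)]


def idf_table(blocks, n=2):
    """Smoothed inverse document frequency of each n-gram over training blocks.

    Assumption: each block of commands plays the role of one 'document'.
    """
    df = Counter()
    for block in blocks:
        df.update(set(ngrams(block, n)))
    total = len(blocks)
    return {g: math.log((1 + total) / (1 + d)) + 1.0 for g, d in df.items()}


def stf_idf(block, idf, n=2):
    """Weight each known n-gram by sqrt(count) * idf (assumed reading of STF-IDF)."""
    counts = Counter(ngrams(block, n))
    return {g: math.sqrt(c) * idf[g] for g, c in counts.items() if g in idf}


def profile(blocks, idf, n=2):
    """Sum the STF-IDF vectors of the training blocks into one user profile."""
    total = Counter()
    for block in blocks:
        total.update(stf_idf(block, idf, n))
    return dict(total)


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy training data for one user (hypothetical, not from either data set).
train = [
    ["ls", "cd", "ls", "vi", "make"],
    ["cd", "ls", "vi", "make", "ls"],
]
idf = idf_table(train)
user = profile(train, idf)

normal_score = cosine(stf_idf(["ls", "cd", "ls", "vi"], idf), user)
odd_score = cosine(stf_idf(["nmap", "ssh", "scp", "rm"], idf), user)

THRESHOLD = 0.3  # hypothetical cut-off; it would need tuning on real data
flagged = odd_score < THRESHOLD
```

In this sketch a block whose n-grams never occur in the user's training data receives a score of zero and falls below the threshold, while a block that reuses the user's habitual command pairs scores higher. The square root dampens the effect of commands that are repeated many times within a block, which is one plausible motivation for the "square root" variant; the paper itself should be consulted for the actual weighting and decision rule.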