Principal Components of Port-Address Matrices in Port-Scan Analysis

  • Authors:
  • Hiroaki Kikuchi;Naoya Fukuno;Masato Terada;Norihisa Doi

  • Affiliations:
  • School of Information Technology, Tokai University, Hiratsuka, Japan 259-1292;School of Information Technology, Tokai University, Hiratsuka, Japan 259-1292;Hitachi, Ltd. Hitachi Incident Response Team (HIRT), Kawasaki, Japan 212-8567;Dept. of Info. and System Engineering, Faculty of Science and Engineering, Chuo University, Tokyo, Japan 112-8551

  • Venue:
  • OTM '08 Proceedings of the OTM 2008 Confederated International Conferences, CoopIS, DOA, GADA, IS, and ODBASE 2008. Part II on On the Move to Meaningful Internet Systems
  • Year:
  • 2008

Abstract

Many studies aim to use port-scan traffic data for the fast and accurate detection of rapidly spreading worms. This paper proposes two new methods for reducing the traffic data to a simplified form comprising significant components of smaller dimensionality. (1) Dimension reduction via Term Frequency-Inverse Document Frequency (TF-IDF) values, a technique from information retrieval, selects significant ports and addresses according to their "importance" for classification. (2) Dimension reduction via Principal Component Analysis (PCA), a widely used tool in exploratory data analysis, enables estimation of how uniformly the sensors are distributed over the reduced coordinate system. PCA gives a scatter plot of the sensors, which helps to detect abnormal behavior in both the source address space and the destination port space. In addition to these proposals, we report on experiments that use the Internet Scan Data Acquisition System (ISDAS) distributed observation data from the Japan Computer Emergency Response Team (JPCERT).
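
The two reductions described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so everything here is an assumption: each sensor is treated as a "document" and each destination port as a "term", TF-IDF uses the common definition tf × log(N/df), and PCA is computed by SVD of the column-centered matrix. The function names (tfidf, top_ports, pca_scores) and the toy counts are hypothetical and are not ISDAS data.

```python
import numpy as np


def tfidf(counts):
    """Weight a sensors-by-ports count matrix by TF-IDF.

    Assumption: rows (sensors) play the role of documents and
    columns (ports) the role of terms.
    """
    counts = np.asarray(counts, dtype=float)
    n_sensors = counts.shape[0]
    row_totals = counts.sum(axis=1, keepdims=True)
    # Term frequency: share of each sensor's packets that hit each port.
    tf = np.divide(counts, row_totals,
                   out=np.zeros_like(counts), where=row_totals > 0)
    # Document frequency: number of sensors that saw each port at least once.
    df = (counts > 0).sum(axis=0)
    # Ports seen by every sensor get idf = 0; unseen ports keep tf = 0.
    idf = np.log(n_sensors / np.maximum(df, 1))
    return tf * idf


def top_ports(weights, k):
    """Return indices of the k columns with the largest total weight."""
    scores = weights.sum(axis=0)
    return np.argsort(scores)[::-1][:k]


def pca_scores(matrix, n_components=2):
    """Project rows onto the leading principal components via SVD."""
    x = np.asarray(matrix, dtype=float)
    centered = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    total = np.sum(s ** 2)
    explained = s ** 2 / total if total > 0 else np.zeros_like(s)
    return scores, explained[:n_components]


# Hypothetical toy data: 4 sensors (rows) x 4 destination ports (columns).
# The last sensor sees a burst on the third port that the others do not.
counts = np.array([
    [10, 5,  0, 0],
    [12, 4,  0, 0],
    [11, 6,  0, 0],
    [ 9, 5, 40, 0],
])

weights = tfidf(counts)
significant = top_ports(weights, k=1)
scores, explained = pca_scores(counts, n_components=2)

print("most significant port column:", significant)
print("sensor coordinates on PC1/PC2:\n", scores)
print("explained variance ratio:", explained)
```

In this toy matrix the ports observed by every sensor receive zero weight, so the port seen by only one sensor is selected, and that sensor sits far from the others along the first principal component. Plotting the two columns of scores would give a scatter plot of sensors of the kind the abstract describes, although the paper's actual preprocessing may differ.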