Outlier detection and visualization of large datasets

  • Authors:
  • L. Gunisetti

  • Affiliations:
  • Sri Vasavi Engineering College, Pedatadepalli, Tadepalligudem, A. P., India

  • Venue:
  • Proceedings of the International Conference & Workshop on Emerging Trends in Technology
  • Year:
  • 2011

Abstract

Outliers are observations that deviate so much from the other observations in a dataset as to arouse suspicion that they were generated by a different mechanism. Detected outliers can reveal special, extraordinary, or fraudulent cases in day-to-day transactions. Outlier detection can also identify noise in the data, which can then be removed to improve data quality. Applications include traffic analysis and credit card fraud detection. We applied outlier detection to a traffic dataset to identify outlier stations on the highway. The detected outlier stations represent abnormalities in the traffic sensor data, and we use this information to identify faulty traffic sensors located at the highway stations. We also provide a two-dimensional visualization of the outliers, which supports efficient analysis of the data. Traffic management becomes easier once the abnormal traffic sensors at the outlier stations have been identified. The method used here is a statistical approach: it compares every location to its neighbors using a statistic, which is calculated to decide whether the data generated by a sensor at a highway traffic station is abnormal. This technique identifies outliers efficiently and, compared with existing conventional approaches, can be applied easily to very large datasets.
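
The neighbor comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not name the statistic, so the sketch assumes one common choice: the difference between a station's reading and the mean reading of its neighbors, standardized into a z-score across all stations. The function name `neighbor_difference_outliers`, the threshold of 2.0, the adjacency rule, and the sample readings are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def neighbor_difference_outliers(values, neighbors, threshold=2.0):
    """Flag stations whose reading differs unusually from their neighbors.

    values:    dict mapping station id -> sensor reading
    neighbors: dict mapping station id -> list of neighboring station ids
    threshold: assumed cutoff on the absolute z-score (hypothetical default)
    """
    # Difference between each station's reading and its neighbors' mean.
    diffs = {
        s: values[s] - mean([values[n] for n in neighbors[s]])
        for s in values
    }
    all_diffs = list(diffs.values())
    mu = mean(all_diffs)
    sigma = pstdev(all_diffs)
    if sigma == 0:
        return []  # every station matches its neighbors equally well
    # A station is an outlier when its standardized difference is large.
    return sorted(s for s, d in diffs.items() if abs((d - mu) / sigma) > threshold)


# Illustrative readings for ten stations along one highway (made-up data).
readings = [100, 102, 98, 101, 300, 99, 103, 97, 100, 101]
values = dict(enumerate(readings))
# Assumption: each station's neighbors are the adjacent stations on the road.
neighbors = {
    i: [j for j in (i - 1, i + 1) if 0 <= j < len(readings)]
    for i in range(len(readings))
}

flagged = neighbor_difference_outliers(values, neighbors)
print(flagged)  # → [4]
```

Each station needs only its own neighbors' readings plus two global summary values (the mean and standard deviation of the differences), so the work grows roughly linearly with the number of stations. That is consistent with the abstract's claim about large datasets, although the paper's actual statistic and threshold may differ.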