Standardization of interval symbolic data based on the empirical descriptive statistics

  • Authors:
  • Junpeng Guo;Wenhua Li;Chenhua Li;Sa Gao

  • Affiliations:
  • School of Management, Tianjin University, Tianjin 300072, China;School of Management, Tianjin University, Tianjin 300072, China;Department of Industrial & Systems Engineering, Texas A&M University, TX 77843, USA;School of Management, Tianjin University, Tianjin 300072, China

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2012

Abstract

In many statistical analysis methods, standardizing the sample data is recommended so that the results are not strongly affected by the measurement scale of the variables. This paper focuses on the standardization of interval data obtained by symbolic data analysis (SDA), a relatively new data analysis technique that captures the value of a variable with a symbolic representation. We first study the empirical descriptive statistics of an interval symbolic variable. We then propose a standardization method for interval symbolic data and conduct a simulation study that evaluates the method through clustering analysis. An application example on e-shops in several major cities in China is given at the end of the paper. Unlike previous research, our approach does not require the assumption that the data are uniformly distributed within each interval, and it makes the best use of the original sample information.
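
To make the contrast in the abstract concrete, the following is a minimal Python sketch and not the paper's actual formulas, which the abstract does not give. It assumes that each object's interval is the [min, max] of a small set of underlying observations and that every object carries equal weight. The function names `empirical_moments`, `uniform_moments`, and `standardize`, and the toy data, are hypothetical and chosen only for illustration. One routine computes the mean and variance directly from the observations, the other uses only the interval endpoints under the uniform-within-interval assumption, and either pair of moments can then be used to rescale the interval bounds.

```python
import numpy as np

def empirical_moments(samples):
    """Mean and variance from the raw observations behind each interval.

    Assumption: `samples` is a list of 1-D arrays, one per object,
    and each object is weighted equally.
    """
    mean = np.mean([np.mean(s) for s in samples])
    var = np.mean([np.mean((np.asarray(s) - mean) ** 2) for s in samples])
    return mean, var

def uniform_moments(intervals):
    """Mean and variance assuming values are uniform within each interval.

    Only the endpoints [a, b] of each interval are used.
    """
    a, b = intervals[:, 0], intervals[:, 1]
    mean = np.mean((a + b) / 2)
    var = np.mean((a ** 2 + a * b + b ** 2) / 3) - mean ** 2
    return mean, var

def standardize(intervals, mean, var):
    """Shift and scale both bounds of every interval."""
    return (intervals - mean) / np.sqrt(var)

# Toy data (illustrative only): two objects with skewed observations.
samples = [np.array([1.0, 2.0, 3.0]), np.array([4.0, 4.5, 8.0])]
intervals = np.array([[s.min(), s.max()] for s in samples])

emp_mean, emp_var = empirical_moments(samples)
uni_mean, uni_var = uniform_moments(intervals)

z_empirical = standardize(intervals, emp_mean, emp_var)
z_uniform = standardize(intervals, uni_mean, uni_var)
```

In this toy case the second object's observations are concentrated near its lower bound, so the two sets of moments differ, and the standardized intervals differ with them. That is the kind of situation in which dropping the uniformity assumption can matter.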