Distance-based feature selection from probabilistic data

  • Authors:
  • Tingting Zhao; Bin Pei; Suyun Zhao; Hong Chen; Cuiping Li

  • Affiliations:
  • Key Lab of Data Engineering and Knowledge Engineering, Ministry of Education, China, Department of Computer Science, Renmin University of China, China; Key Lab of Data Engineering and Knowledge Engineering, Ministry of Education, China, Department of Computer Science, Renmin University of China, China; Key Lab of Data Engineering and Knowledge Engineering, Ministry of Education, China; Key Lab of Data Engineering and Knowledge Engineering, Ministry of Education, China, Department of Computer Science, Renmin University of China, China; Key Lab of Data Engineering and Knowledge Engineering, Ministry of Education, China, Department of Computer Science, Renmin University of China, China

  • Venue:
  • WAIM'13 Proceedings of the 14th international conference on Web-Age Information Management
  • Year:
  • 2013

Abstract

Feature selection is a powerful tool for reducing the dimensionality of datasets. Over the last decade it has attracted growing attention, and some researchers have begun to study feature selection from probabilistic datasets. However, existing approaches to feature selection from probabilistic data neglect the distance information hidden in the data. In this paper, we design a new distance measure for selecting informative features from probabilistic databases, one that considers both the distance and the randomness in the data. We then propose a feature selection algorithm based on the new distance and develop two accelerated algorithms to speed up the computation. Furthermore, we introduce a parameter into the distance to reduce its sensitivity to noise. Finally, experimental results verify the effectiveness of our algorithms.