epsilon-Support Vector and Large-Scale Data Mining Problems

  • Authors:
  • Gang Kou;Yi Peng;Yong Shi;Zhengxin Chen

  • Affiliations:
  • Thomson Co., R&D, 610 Opperman Drive, Eagan, MN 55123, USA and College of Information Science & Technology, University of Nebraska at Omaha, Omaha, NE 68182, USA;College of Information Science & Technology, University of Nebraska at Omaha, Omaha, NE 68182, USA;College of Information Science & Technology, University of Nebraska at Omaha, Omaha, NE 68182, USA and Chinese Academy of Sciences Research Center on Data Technology & Knowledge Economy, Graduate ...;College of Information Science & Technology, University of Nebraska at Omaha, Omaha, NE 68182, USA

  • Venue:
  • ICCS '07 Proceedings of the 7th international conference on Computational Science, Part III: ICCS 2007
  • Year:
  • 2007

Abstract

Data mining and knowledge discovery have made great progress during the last fifteen years. As one of the major tasks of data mining, classification has wide business and scientific applications. Among the variety of proposed methods, mathematical programming based approaches have proven excellent in terms of classification accuracy, robustness, and efficiency. However, several difficult issues remain, two of which are of particular interest to this research. The first is that finding an optimal solution to a mathematical programming problem on a large-scale dataset is challenging because of the computational complexity. The second is that many mathematical programming problems require specialized codes or programs such as CPLEX or LINGO. The objective of this study is to propose solutions to these two problems. This paper proposes a mathematical programming model and applies it to classification problems in order to address two aspects of data mining algorithms: speed and scalability.