epsilon-Support Vector and Large-Scale Data Mining Problems

Authors:
Gang Kou; Yi Peng; Yong Shi; Zhengxin Chen
Affiliations:
Thomson Co., R&D, 610 Opperman Drive, Eagan, MN 55123, USA and College of Information Science & Technology, University of Nebraska at Omaha, Omaha, NE 68182, USA; College of Information Science & Technology, University of Nebraska at Omaha, Omaha, NE 68182, USA; College of Information Science & Technology, University of Nebraska at Omaha, Omaha, NE 68182, USA and Chinese Academy of Sciences Research Center on Data Technology & Knowledge Economy, Graduate ...; College of Information Science & Technology, University of Nebraska at Omaha, Omaha, NE 68182, USA
Venue:
ICCS '07 Proceedings of the 7th international conference on Computational Science, Part III: ICCS 2007
Year:
2007

Abstract

Data mining and knowledge discovery have made great progress during the last fifteen years. As one of the major tasks of data mining, classification has wide business and scientific applications. Among the variety of proposed methods, mathematical programming based approaches have proven to be excellent in terms of classification accuracy, robustness, and efficiency. However, several difficult issues remain, two of which are of particular interest to this research. The first is that finding an optimal solution to a mathematical programming problem on a large-scale dataset is challenging because of the computational complexity. The second is that many mathematical programming problems require specialized codes or programs such as CPLEX or LINGO. The objective of this study is to propose solutions to these two problems. This paper proposes a mathematical programming model and applies it to classification problems to address two aspects of data mining algorithms: speed and scalability.