Generalization Bounds for the Area Under the ROC Curve
The Journal of Machine Learning Research
A support vector method for multivariate performance measures
ICML '05 Proceedings of the 22nd international conference on Machine learning
Training linear SVMs in linear time
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Efficient projections onto the l1-ball for learning in high dimensions
Proceedings of the 25th international conference on Machine learning
Asymmetric support vector machines: low false-positive learning under the user tolerance
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
KDD cup 2008 and the workshop on mining medical data
ACM SIGKDD Explorations Newsletter
Primal sparse Max-margin Markov networks
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Cutting-plane training of structural SVMs
Machine Learning
The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List
The Journal of Machine Learning Research
Partial AUC maximization in a linear combination of dichotomizers
Pattern Recognition
Sequential Alternating Proximal Method for Scalable Sparse Structural SVMs
ICDM '12 Proceedings of the 2012 IEEE 12th International Conference on Data Mining
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
The area under the ROC curve (AUC) is a well-known performance measure in machine learning and data mining. In a growing number of applications, however, from ranking to a variety of important bioinformatics problems, performance is measured by the partial area under the ROC curve (partial AUC) between two specified false positive rates. In recent work, we proposed a structural SVM-based approach for optimizing this performance measure (Narasimhan and Agarwal, 2013). In this paper, we develop a new support vector method, SVMpAUCtight, that optimizes a tighter convex upper bound on the partial AUC loss, which leads to both improved accuracy and reduced computational complexity. In particular, by rewriting the empirical partial AUC risk as a maximum over subsets of negative instances, we derive a new formulation in which a modified form of the earlier optimization objective is evaluated on each of these subsets, leading to a tighter hinge relaxation of the partial AUC loss. As with our previous method, the resulting optimization problem can be solved using a cutting-plane algorithm, but the new method has better run-time guarantees. We also discuss a projected subgradient method for solving this problem, which offers additional computational savings in certain settings. We demonstrate on a wide variety of bioinformatics tasks, ranging from protein-protein interaction prediction to drug discovery, that the proposed method in many cases performs significantly better on the partial AUC measure than the previous structural SVM approach. We also develop extensions of our method to learn sparse and group-sparse models, which are often of interest in biological applications.
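To make the "maximum over subsets of negative instances" idea concrete, the following is a minimal sketch, not the paper's formulation. It assumes the special case where the false positive range starts at 0 and ends at k/n (n negatives, k an integer), and it uses the plain 0-1 pairwise loss rather than the hinge relaxation. The function names (`pairwise_loss`, `pauc_loss_top`, `pauc_loss_max_subsets`) and the toy scores are hypothetical, chosen only for illustration. Under these assumptions, the partial AUC loss computed on the k top-scoring negatives should coincide with the largest pairwise loss over all size-k subsets of negatives.

```python
from itertools import combinations


def pairwise_loss(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs where the positive scores lower."""
    errors = sum(1 for p in pos_scores for n in neg_scores if p < n)
    return errors / (len(pos_scores) * len(neg_scores))


def pauc_loss_top(pos_scores, neg_scores, k):
    """Partial AUC loss on the FPR range [0, k/n] (assumed special case):
    pairwise loss restricted to the k highest-scoring negatives."""
    top_negs = sorted(neg_scores, reverse=True)[:k]
    return pairwise_loss(pos_scores, top_negs)


def pauc_loss_max_subsets(pos_scores, neg_scores, k):
    """Same quantity written as a maximum over all size-k subsets of negatives.
    Brute force, exponential in general; for illustration only."""
    return max(
        pairwise_loss(pos_scores, list(subset))
        for subset in combinations(neg_scores, k)
    )


# Toy scores (hypothetical data): 3 positives, 4 negatives, FPR range [0, 2/4].
pos = [0.9, 0.6, 0.4]
neg = [0.8, 0.5, 0.3, 0.1]
k = 2

loss_sorted = pauc_loss_top(pos, neg, k)
loss_subsets = pauc_loss_max_subsets(pos, neg, k)
print(loss_sorted, loss_subsets)
```

The two values agree because a negative's error count can only grow with its score, so the top-scoring negatives form a worst-case subset. The brute-force maximum is only there to check the identity on small inputs; a practical solver would never enumerate subsets.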