This paper summarizes and analyzes the results of the 2004 KDD-Cup. The competition consisted of two tasks from the areas of particle physics and protein homology detection. It focused on the problem of optimizing supervised learning for different performance measures (accuracy, cross-entropy, ROC area, SLAC-Q, squared error, average precision, top 1, and rank of last). A total of 102 groups participated in the competition, six of which received awards or honorable mentions. Their approaches are described in other papers in this issue of SIGKDD Explorations. In this paper we do not analyze any particular approach, but give insight into the performance of the field of competitors as a whole. We study what fraction of the participants found good solutions, how well participants were able to optimize for different performance measures, how homogeneous their submitted predictions were, and whether the best submissions represent the maximal performance that could reasonably be achieved. We are keeping the KDD-Cup 2004 WWW site open and have added an automatic scoring system for new submissions in order to encourage further research in this area.
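To make the listed performance measures concrete, the following minimal sketch computes most of them from binary labels and predicted scores using their common textbook definitions. It is an illustration only, not the competition's scoring code: the function names, the 0.5 threshold, and the toy data are assumptions, squared error is taken here as the mean squared error, the ranking measures are applied to a single ranked list with no per-query averaging, and SLAC-Q is omitted because its definition is not given in this text.

```python
import math

def accuracy(labels, scores, threshold=0.5):
    # Fraction of cases whose thresholded score matches the label.
    # The 0.5 threshold is an assumption for this sketch.
    hits = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return hits / len(labels)

def cross_entropy(labels, probs, eps=1e-12):
    # Mean negative log-likelihood; probabilities are clipped to avoid log(0).
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

def squared_error(labels, probs):
    # Mean squared difference between label and predicted probability.
    return sum((y - p) ** 2 for y, p in zip(labels, probs)) / len(labels)

def roc_area(labels, scores):
    # Probability that a random positive outscores a random negative
    # (ties count as one half). Assumes both classes are present.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ranked_labels(labels, scores):
    # Labels reordered from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [labels[i] for i in order]

def top1(labels, scores):
    # 1 if the highest-scored case is a positive, else 0.
    return ranked_labels(labels, scores)[0]

def rank_of_last(labels, scores):
    # 1-based rank of the lowest-ranked positive (smaller is better).
    ranked = ranked_labels(labels, scores)
    return max(rank for rank, y in enumerate(ranked, start=1) if y == 1)

def average_precision(labels, scores):
    # Mean of the precision values measured at each positive's rank.
    hits, total = 0, 0.0
    for rank, y in enumerate(ranked_labels(labels, scores), start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Toy data (hypothetical): two positives among five cases.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.1]

for name, fn in [("accuracy", accuracy), ("cross-entropy", cross_entropy),
                 ("squared error", squared_error), ("ROC area", roc_area),
                 ("top 1", top1), ("rank of last", rank_of_last),
                 ("average precision", average_precision)]:
    print(f"{name}: {fn(labels, scores):.4f}")
```

On this toy data the same predictions look quite different depending on the measure: the top-ranked case is correct, yet a positive sitting at rank 3 hurts both accuracy and rank of last. Differences of this kind are why optimizing for one measure does not guarantee good performance on another.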