Two major data mining competitions in 2008 presented challenges in medical domains: KDD Cup 2008, which concerned cancer detection from mammography data, and the INFORMS Data Mining Challenge 2008, which dealt with diagnosing pneumonia from patient information in hospital files. Our team won both competitions, and in this paper we share the lessons we learned and our insights. We emphasize aspects that pertain to the general practice and methodology of medical data mining rather than the specifics of each modeling competition. We concentrate on three topics: information leakage and its effect on competitions and proof-of-concept projects; consideration of real-life model performance measures in model construction and evaluation; and relational learning approaches to medical data mining tasks.