Relational databases, with vast amounts of data (from financial transactions and marketing surveys to medical records and health informatics observations) and complex schemas, are ubiquitous in our society. Multirelational classification algorithms have been proposed to learn from such relational repositories, where multiple interconnected tables (relations) are involved. These methods search for relevant features both in a target relation (in which each tuple is associated with a class label) and in relations related to the target, in order to better classify target relation tuples. However, in many practical database applications, such as credit card fraud detection and disease diagnosis, the target tuples are highly imbalanced. That is, the number of examples of one class (the majority class) in the target relation is much higher than that of the others (the minority classes). Many existing methods thus tend to produce poor predictive performance on the underrepresented class in the data. This paper presents a strategy to deal with such imbalanced multirelational data. The method learns from multiple views (feature sets) of relational data in order to construct view learners with different awareness of the imbalance problem. The different observations held by these view learners are then combined to yield a model with better knowledge of both the majority and minority classes in a relational database. Experiments performed on six benchmark data sets show that the proposed method achieves promising results when compared with other popular relational data mining algorithms, in terms of the ROC curve and AUC value obtained. In particular, the results indicate that the method is superior when the class imbalance is very high.
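
To make the evaluation idea concrete, the following is a minimal sketch of how scores from several view learners might be combined and then assessed with AUC on imbalanced labels. The abstract does not say how the views are combined, so the simple averaging in `combine_views` is an assumption made only for illustration, and the toy labels and scores are invented. The `auc` function uses the standard pairwise (Mann-Whitney) formulation of the area under the ROC curve.

```python
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Returns the fraction of (minority, majority) pairs in which the minority
    example receives the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def combine_views(view_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average the minority-class scores of all view learners.

    Averaging is an illustrative assumption, not the paper's combination rule.
    """
    n_views = len(view_scores)
    return [sum(column) / n_views for column in zip(*view_scores)]


# Hypothetical toy data: 2 minority tuples (label 1), 6 majority tuples (label 0).
labels = [1, 1, 0, 0, 0, 0, 0, 0]
# Hypothetical minority-class scores from two view learners; each view
# recognises a different minority tuple well.
view_a = [0.9, 0.2, 0.3, 0.1, 0.1, 0.2, 0.1, 0.3]
view_b = [0.3, 0.8, 0.4, 0.2, 0.1, 0.1, 0.3, 0.2]

combined = combine_views([view_a, view_b])
for name, scores in [("view A", view_a), ("view B", view_b), ("combined", combined)]:
    print(f"{name}: AUC = {auc(labels, scores):.3f}")
```

In this toy setup each view alone misranks one minority tuple, while the averaged scores rank both minority tuples above every majority tuple. AUC is used here rather than accuracy because, under heavy imbalance, a model that always predicts the majority class can still score high accuracy.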