Credible classification for environmental problems

Authors:
Marco Zaffalon
Affiliations:
IDSIA, Istituto Dalle Molle di Studi sull'Intelligenza Artificiale, Galleria 2, CH-6928 Manno (Lugano), Switzerland
Venue:
Environmental Modelling & Software
Year:
2005

Abstract

Classifiers that aim to make credible predictions should rely on carefully elicited prior knowledge. Often this knowledge is not available, so they should start learning from data in conditions of near-ignorance. This paper shows empirically, on an agricultural data set, that established classification methods do not always adhere to this principle. Traditional ways of representing prior ignorance are shown to carry overwhelming weight compared to the information in the data, producing overconfident predictions. This point is crucial for problems, such as environmental ones, where prior knowledge is often scarce and even the data may not be known precisely. Credal classification, and in particular the naive credal classifier, is proposed as a more faithful way to cope with the ignorance problem. With credal classification, conditions of ignorance may limit the power of the inferences, but not the credibility of the predictions.
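
The contrast the abstract draws can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the imprecise Dirichlet model with hyperparameter s as the near-ignorance prior, uses class counts only (no features), and applies simple interval dominance to decide which classes to return. The function names and the toy counts are hypothetical. A uniform prior (Laplace smoothing) stands in for a traditional representation of ignorance.

```python
from fractions import Fraction


def laplace_estimate(count, total, num_classes):
    """Precise class probability under a uniform Dirichlet prior (assumed baseline)."""
    return Fraction(count + 1, total + num_classes)


def idm_interval(count, total, s=1):
    """Lower and upper class probability under the imprecise Dirichlet model."""
    return Fraction(count, total + s), Fraction(count + s, total + s)


def credal_prediction(counts, s=1):
    """Return every class that no other class dominates.

    Simplifying assumption: class `other` dominates class `c` only if the
    lower probability of `other` strictly exceeds the upper probability of `c`.
    """
    total = sum(counts.values())
    intervals = {c: idm_interval(n, total, s) for c, n in counts.items()}
    return sorted(
        c
        for c in counts
        if not any(
            intervals[other][0] > intervals[c][1]
            for other in counts
            if other != c
        )
    )


# Toy counts, invented for illustration only.
small_sample = {"A": 2, "B": 1}
large_sample = {"A": 20, "B": 10}

# The uniform prior always yields a single, precise answer, even from 3 records.
print(laplace_estimate(small_sample["A"], 3, 2))
# The credal sketch returns both classes until the data can separate them.
print(credal_prediction(small_sample))
print(credal_prediction(large_sample))
```

With three records the sketch returns both classes rather than committing to one, and with thirty records it returns a single class. This mirrors the closing claim of the abstract: ignorance reduces how much the classifier can say, not how far its output can be trusted.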