Environmental Modelling & Software
This paper reports a successful application of statistical and inductive learning methods to identify optimal discriminating parameters and to develop predictive models for determining the source of faecal pollution in waters recently and heavily polluted with wastewater (microbial source tracking). The data come from an international study in which various microbial and chemical parameters were measured in heavily polluted waters from diverse geographical areas. A total of 38 variables derived from these parameters were defined to characterise the 103 available observations. Four methods were evaluated: a Euclidean k-nearest-neighbour classifier, a linear Bayesian classifier, a quadratic Bayesian classifier and a support vector machine. The main aim was to obtain highly accurate predictive models using as few variables as possible. After a stringent feature selection process, the results show that predictive models using only two variables achieve 100% correct classification. These solutions use a linear combination of a discriminating tracer (the enumeration of phages infecting Bacteroides thetaiotaomicron) and a universal non-discriminating faecal indicator. Other models that do not use the discriminating tracer were also developed, although they required more variables to achieve a high rate of correct classification.
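
As an illustration of the kind of search described above, the following minimal sketch scores every two-variable subset with a Euclidean nearest-neighbour classifier under leave-one-out validation. It is not the study's procedure: the data are synthetic, the column roles (a "tracer" in column 0, an "indicator" in column 3, noise elsewhere), the class labels, the choice of k = 1 and the function names are all assumptions made for this example.

```python
import itertools
import math
import random


def knn_predict(train_X, train_y, x, k=1):
    """Classify x by majority vote among its k Euclidean nearest neighbours."""
    nearest = sorted((math.dist(row, x), label)
                     for row, label in zip(train_X, train_y))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    # Ties in the vote are broken alphabetically so the result is deterministic.
    return max(sorted(votes), key=votes.get)


def loo_accuracy(X, y, features, k=1):
    """Leave-one-out accuracy of k-NN restricted to the given column indices."""
    proj = [[row[j] for j in features] for row in X]
    hits = 0
    for i in range(len(proj)):
        train_X = proj[:i] + proj[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, proj[i], k) == y[i]
    return hits / len(proj)


def best_subset(X, y, size=2, k=1):
    """Exhaustively score every subset of the given size; return (accuracy, subset)."""
    candidates = itertools.combinations(range(len(X[0])), size)
    return max(((loo_accuracy(X, y, s, k), s) for s in candidates),
               key=lambda pair: pair[0])


# Synthetic data (assumption): the classes differ only in how the hypothetical
# tracer (column 0) relates to the hypothetical indicator (column 3).
rng = random.Random(0)
X, y = [], []
for i in range(20):
    indicator = 0.5 * i
    for label, offset in (("human", 3.0), ("animal", -3.0)):
        X.append([indicator + offset, rng.random(), rng.random(), indicator])
        y.append(label)

acc, subset = best_subset(X, y, size=2)
print("best pair:", subset, "leave-one-out accuracy:", acc)
print("tracer alone:", loo_accuracy(X, y, (0,)))
```

In this toy setting the tracer alone cannot separate the classes, because its range overlaps between them, whereas the tracer combined with the indicator can. Exhaustive search is feasible here because the number of pairs is small; for 38 variables there are 703 pairs, which is still tractable, but larger subset sizes would call for a heuristic search.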