Exploratory data mining, machine learning, and statistical modeling all have a role in discovery science. We describe a paleoecological reconstruction problem in which Bayesian methods are useful and allow plausible inferences from the small and vague data sets available. Paleoecological reconstruction aims at estimating past temperatures. Knowledge about the present-day abundances of certain species is combined with data about the same species in fossil assemblages (e.g., lake sediments). Stated formally, the reconstruction task has the form of a typical machine learning problem. However, obtaining useful predictions requires substantial background knowledge about ecological variation. In the paleoecological literature, the statistical methods used are involved variations of regression. We compare these methods with regression trees, nearest neighbor methods, and Bayesian hierarchical models. All the methods achieve about the same prediction accuracy on modern specimens, but the Bayesian methods and the involved regression methods seem to yield the best reconstructions. The advantage of the Bayesian methods is that they also give good estimates of the variability of the reconstructions.
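To make the machine learning framing concrete, the following is a minimal sketch of a nearest neighbor reconstruction, one of the method families compared above. It is an illustration under stated assumptions, not the procedure used in the paper: the function name `knn_reconstruct`, the Euclidean distance on abundance vectors, the choice of `k`, and all numbers in the toy data are hypothetical.

```python
import math


def knn_reconstruct(modern_abundances, modern_temps, fossil_abundances, k=3):
    """Estimate the temperature for one fossil sample.

    Assumption (not from the paper): similarity is plain Euclidean
    distance between species-abundance vectors, and the estimate is the
    unweighted mean temperature of the k closest modern samples.
    Returns (estimate, spread), where spread is the temperature range
    among those k neighbors, a crude stand-in for variability.
    """
    if len(modern_abundances) != len(modern_temps):
        raise ValueError("need one temperature per modern sample")
    if not 1 <= k <= len(modern_abundances):
        raise ValueError("k must be between 1 and the number of modern samples")

    # Pair each modern sample's distance to the fossil sample with its temperature.
    distances = [
        (math.dist(sample, fossil_abundances), temp)
        for sample, temp in zip(modern_abundances, modern_temps)
    ]
    distances.sort(key=lambda pair: pair[0])

    nearest_temps = [temp for _, temp in distances[:k]]
    estimate = sum(nearest_temps) / k
    spread = max(nearest_temps) - min(nearest_temps)
    return estimate, spread


# Hypothetical toy data: relative abundances of three species at five
# modern sites, with a made-up temperature (degrees C) for each site.
modern_abundances = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
    [0.1, 0.2, 0.7],
]
modern_temps = [16.0, 15.0, 12.0, 9.0, 8.0]

# A made-up fossil assemblage from one sediment layer.
fossil_sample = [0.15, 0.25, 0.6]

estimate, spread = knn_reconstruct(modern_abundances, modern_temps, fossil_sample, k=2)
print(estimate, spread)
```

The neighbor range returned here is only a rough indicator of uncertainty. The abstract's point is that Bayesian hierarchical models give better-founded estimates of the variability of a reconstruction than a heuristic of this kind.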