One of the tasks of data mining is classification, which provides a mapping from attributes (observations) to pre-specified classes. Classification models are built from underlying data, and in principle, models built with more data yield better results. However, the relationship between the amount of available data and model performance is not well understood, beyond the observation that the accuracy of a classification model shows diminishing improvements as a function of data size. In this paper, we present an approach for early assessment of the extracted knowledge (classification models) in terms of performance (accuracy), based on the amount of data used. The assessment rests on observing performance at smaller sample sizes. The solution is formally defined and evaluated experimentally; the experiments demonstrate the correctness and utility of the approach.
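The idea of assessing final performance from smaller samples can be illustrated with a learning-curve fit. The sketch below is not the paper's method; it is a minimal, self-contained illustration assuming accuracy follows a saturating inverse power law, acc(n) ≈ a − b·n^(−c), a common model of diminishing returns with data size. The sample sizes, accuracies, and full dataset size are hypothetical; the fit uses a grid search over the exponent c with closed-form least squares for a and b.

```python
def fit_learning_curve(sizes, accs, c_grid=None):
    """Fit acc(n) ~ a - b * n**(-c) by grid search over c,
    with closed-form linear least squares for a and b at each c."""
    if c_grid is None:
        c_grid = [i / 100 for i in range(5, 150)]  # c in [0.05, 1.49]
    best = None
    for c in c_grid:
        xs = [n ** (-c) for n in sizes]
        m = len(xs)
        sx, sy = sum(xs), sum(accs)
        sxx = sum(x * x for x in xs)
        sxy = sum(x * y for x, y in zip(xs, accs))
        denom = m * sxx - sx * sx
        if abs(denom) < 1e-12:
            continue
        # Linear model acc = a - b*x, where x = n**(-c):
        # slope of acc vs. x is -b, intercept is a.
        slope = (m * sxy - sx * sy) / denom
        intercept = (sy - slope * sx) / m
        a, b = intercept, -slope
        sse = sum((a - b * x - y) ** 2 for x, y in zip(xs, accs))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    _, a, b, c = best
    return a, b, c

# Hypothetical accuracies measured on progressively larger samples.
sizes = [100, 200, 400, 800, 1600]
accs = [0.710, 0.760, 0.800, 0.830, 0.852]

a, b, c = fit_learning_curve(sizes, accs)

# Extrapolate to a (hypothetical) full dataset of 50,000 instances;
# a is the fitted asymptotic accuracy.
predicted_full = a - b * 50000 ** (-c)
```

Under this model, the fitted asymptote `a` gives an early estimate of the best accuracy obtainable, and `predicted_full` estimates the accuracy at the full data size, so one can judge whether collecting or processing more data is worthwhile before building the final model.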