A dynamic adaptive sampling algorithm (DASA) for real world applications: finger print recognition and face recognition

  • Authors:
  • Ashwin Satyanarayana;Ian Davidson

  • Affiliations:
  • Department of Computer Science, State University of New York, Albany, Albany, NY;Department of Computer Science, State University of New York, Albany, Albany, NY

  • Venue:
  • ISMIS'05 Proceedings of the 15th international conference on Foundations of Intelligent Systems
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

In many real world problems, data mining algorithms have access to massive amounts of data (defense and security). Mining all the available data is prohibitive due to computational (time and memory) constraints. Thus, the smallest sufficient training set size that obtains the same accuracy as the entire available dataset remains an important research question. Progressive sampling randomly selects an initial small sample and increases the sample size using either geometric or arithmetic series until the error converges, with the sampling schedule determined apriori. In this paper, we explore sampling schedules that are adaptive to the dataset under consideration. We develop a general approach to determine how many instances are required at each iteration for convergence using Chernoff Inequality. We try our approach on two real world problems where data is abundant: face recognition and finger print recognition using neural networks. Our empirical results show that our dynamic approach is faster and uses much fewer examples than other existing methods. However, the use of Chernoff bound requires the samples at each iteration to be independent of each other. Future work will look at removing this limitation which should further improve performance.