Using classifier ensembles to label spatially disjoint data

  • Authors:
  • Larry Shoemaker;Robert E. Banfield;Lawrence O. Hall;Kevin W. Bowyer;W. Philip Kegelmeyer

  • Affiliations:
  • Department of Computer Science and Engineering, ENB118, University of South Florida, 4202 E. Fowler Avenue Tampa, FL 33620-9951, USA;Department of Computer Science and Engineering, ENB118, University of South Florida, 4202 E. Fowler Avenue Tampa, FL 33620-9951, USA;Department of Computer Science and Engineering, ENB118, University of South Florida, 4202 E. Fowler Avenue Tampa, FL 33620-9951, USA;Department of Computer Science and Engineering, University of Notre Dame, South Bend, IN 46556, USA;Sandia National Laboratories, Computational Science and Math Research Department, P.O. Box 969, MS 9159 Livermore, CA 94551-0969, USA

  • Venue:
  • Information Fusion
  • Year:
  • 2008

Abstract

We describe an ensemble approach to learning from arbitrarily partitioned data. The partitioning comes from the distributed processing requirements of a large-scale simulation. The volume of the data is such that classifiers can train only on data local to a given partition. Because the partition reflects the needs of the simulation, the class statistics can vary from partition to partition, and some classes will likely be missing from some partitions. We combine a fast ensemble learning algorithm with probabilistic majority voting in order to learn an accurate classifier from such data. Results from simulations of an impactor bar crushing a storage canister and from facial feature recognition show that regions of interest are successfully identified in spite of the class imbalance in the individual training sets.
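
The abstract does not spell out the voting rule, so the following is only a minimal sketch of one plausible reading of "probabilistic majority voting": each partition-local classifier reports class-probability estimates for the classes it saw during training, the estimates are summed across classifiers, and the class with the largest total wins. The function name `probabilistic_vote`, the dictionary representation, and the class labels are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict


def probabilistic_vote(predictions):
    """Combine per-partition class-probability estimates by summing them.

    predictions: list of dicts mapping class label -> estimated probability,
    one dict per partition-local classifier (hypothetical representation).
    A class that a classifier never saw in its partition is absent from its
    dict and therefore contributes nothing to that class's total.
    """
    totals = defaultdict(float)
    for probs in predictions:
        for label, p in probs.items():
            totals[label] += p
    if not totals:
        raise ValueError("no predictions to combine")
    # Iterate labels in sorted order so ties resolve deterministically.
    return max(sorted(totals), key=lambda label: totals[label])


# Three classifiers vote on one test point; the first was trained on a
# partition that contained no "salient" examples at all.
votes = [
    {"unknown": 1.0},
    {"salient": 0.9, "unknown": 0.1},
    {"salient": 0.8, "unknown": 0.2},
]
winner = probabilistic_vote(votes)
```

Summing probabilities rather than counting hard votes lets a confident classifier that has seen a rare class outweigh classifiers whose partitions lacked that class, which is the situation the abstract describes.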