Efficient Learning from Massive Spatial-Temporal Data through Selective Support Vector Propagation

  • Authors:
  • Yilian Qin;Zoran Obradovic

  • Affiliations:
  • Temple University, USA, email: zoran@ist.temple.edu

  • Venue:
  • Proceedings of the 2006 conference on ECAI 2006: 17th European Conference on Artificial Intelligence, August 29 to September 1, 2006, Riva del Garda, Italy
  • Year:
  • 2006

Abstract

In the proposed approach, learning from large spatial-temporal data streams is addressed by sequentially training support vector machines (SVMs) on a series of smaller spatial data subsets collected over shorter periods. A set of representatives is selected from the support vectors of an SVM trained on data of limited spatial-temporal coverage. These representatives are merged with newly arrived data, which also correspond to a limited space-time segment, and a new SVM is learned from both sources. Relying on selected representatives instead of propagating all support vectors to the next iteration allows efficient learning of semi-global SVMs in a non-stationary series of correlated spatial datasets. The proposed method is evaluated on a challenging geoinformatics problem: aerosol retrieval from the Multi-angle Imaging SpectroRadiometer (MISR) instrument aboard the Terra satellite. Regional features were discovered that allowed spatial partitioning of the continental US into several semi-global regions. The resulting semi-global SVM models were reused for efficient estimation of aerosol optical depth from radiances, with a high level of accuracy on data cycles spanning several months. The results provide evidence that SVMs trained as proposed have an extended spatial and temporal range of applicability compared to SVM models trained on samples collected over shorter periods. In addition, the computational cost of training a semi-global SVM with selective support vector propagation (SSVP) was much lower than that of training a global model using spatial observations from the entire period.
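
The propagation loop described in the abstract can be sketched roughly as follows. This is an illustrative sketch under several assumptions, not the authors' implementation. The abstract does not state the kernel, the solver, or how representatives are chosen, so the sketch substitutes a small linear SVM trained by batch subgradient descent, treats points on or inside the margin as support vectors, and picks representatives by uniform random sampling. All function names (`train_linear_svm`, `support_indices`, `ssvp`, `make_chunk`), the parameter `k`, and the synthetic drifting data are hypothetical.

```python
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=300):
    """Stand-in trainer (assumption): batch subgradient descent on the
    L2-regularised hinge loss of a linear SVM with labels in {-1, +1}."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wi for wi in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:  # point violates the margin
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wi - eta * g for wi, g in zip(w, gw)]
        b -= eta * gb
    return w, b

def support_indices(w, b, X, y, tol=0.1):
    """Approximate support vectors: points on or inside the margin.
    The tolerance is an assumption that absorbs optimisation error."""
    return [i for i, (xi, yi) in enumerate(zip(X, y))
            if yi * (dot(w, xi) + b) <= 1.0 + tol]

def ssvp(chunks, k=20, seed=0):
    """Train one SVM per space-time chunk, carrying at most k
    representatives of the previous model's support vectors forward."""
    rng = random.Random(seed)
    rep_X, rep_y = [], []
    model, sizes = None, []
    for X_new, y_new in chunks:
        X, y = rep_X + X_new, rep_y + y_new      # merge representatives with new data
        model = train_linear_svm(X, y)
        sv = support_indices(model[0], model[1], X, y)
        keep = rng.sample(sv, min(k, len(sv)))   # selection rule is an assumption
        rep_X = [X[i] for i in keep]
        rep_y = [y[i] for i in keep]
        sizes.append(len(X))
    return model, (rep_X, rep_y), sizes

def make_chunk(rng, n, shift):
    """Synthetic two-class chunk whose class centres drift with `shift`."""
    X, y = [], []
    for _ in range(n):
        label = rng.choice((-1, 1))
        X.append([label * 2.0 + shift + rng.gauss(0, 1), rng.gauss(0, 1)])
        y.append(label)
    return X, y

data_rng = random.Random(42)
chunks = [make_chunk(data_rng, 200, 0.1 * t) for t in range(5)]
model, reps, sizes = ssvp(chunks, k=20)
print("training-set size per step:", sizes)
```

Because at most `k` representatives are carried forward, each training set in this sketch holds no more than the new chunk plus `k` points, which is the source of the cost saving the abstract reports relative to training on all data from the entire period.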