The development of techniques for forecasting disastrous floods that can provide warnings at a long lead time (5-15 days) is of great importance to society. Extreme floods are usually the consequence of a sequence of precipitation events occurring over several days to several weeks. Although precise short-term forecasting of the magnitude and extent of individual precipitation events is still beyond our reach, long-term forecasting of precipitation clusters can be attempted by identifying persistent atmospheric regimes that are conducive to such clusters. However, such forecasting suffers from an overwhelming number of relevant features and highly imbalanced sample sets. In this paper, we propose an integrated data mining framework that identifies the precursors to precipitation event clusters and uses this information to predict extended periods of extreme precipitation and subsequent floods. We synthesize a representative feature set that describes atmospheric motion and apply a streaming feature selection algorithm to identify the precipitation precursors online from the enormous feature space. A hierarchical re-sampling approach is embedded in the framework to deal with the imbalance problem. An extensive empirical study is conducted on historical precipitation and associated flood data collected in the State of Iowa. Using our framework, a few physically meaningful sets of precipitation cluster precursors are identified from millions of features. The proposed prediction model, using these precursors, captures more than 90% of extreme precipitation events with a lead time of more than 5 days.
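The abstract does not specify the streaming feature selection algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes features arrive one at a time, and it keeps a feature only if it is relevant to the label and not redundant with features already kept. The use of Pearson correlation as the relevance and redundancy measure, the function names `pearson` and `select_streaming`, the thresholds, and the toy data are all illustrative assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_streaming(feature_stream, labels, relevance_min=0.5, redundancy_max=0.9):
    """Consume (name, values) pairs one at a time and return the names kept.

    Hypothetical criteria (assumptions, not taken from the paper):
    - relevance: |corr(feature, labels)| must reach relevance_min
    - redundancy: |corr(feature, kept feature)| must not exceed redundancy_max
    """
    selected = {}
    for name, values in feature_stream:
        # Discard features that carry little information about the label.
        if abs(pearson(values, labels)) < relevance_min:
            continue
        # Discard features that nearly duplicate one already selected.
        if any(abs(pearson(values, kept)) > redundancy_max
               for kept in selected.values()):
            continue
        selected[name] = values
    return list(selected)


# Toy data (made up): 1 marks an extreme-precipitation sample, 0 a normal one.
labels = [0, 0, 0, 1, 1, 0, 1, 0]
stream = iter([
    ("f1", [0.1, 0.2, 0.1, 0.9, 0.8, 0.2, 0.7, 0.1]),  # tracks the label
    ("f2", [0.2, 0.4, 0.2, 1.8, 1.6, 0.4, 1.4, 0.2]),  # scaled copy of f1
    ("f3", [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]),  # constant, uninformative
])
result = select_streaming(stream, labels)
print(result)
```

Because each feature is examined once as it arrives and only the selected features are retained, the full feature space never has to be held in memory, which is the property that makes a streaming approach attractive when candidates number in the millions.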