The widespread use of GPS devices, the ubiquity of remotely sensed geospatial images, and cheap storage have produced vast amounts of digital data. More recently, with the advent of wireless technology, many sensor networks have been deployed to monitor human, biological, and natural processes. This poses a challenge in many data-rich application domains: how best to choose the datasets needed to solve a specific problem. Some datasets may be redundant, and including them in an analysis may not only be time-consuming but also lead to erroneous conclusions. On the other hand, hastily excluding datasets may skew the observations drawn. We propose the concept of data support as the basis for efficient, cost-effective, and intelligent use of geospatial data, with the aim of reducing uncertainty in the analysis and consequently in the results. Data support is defined as the process of determining the information utility of a data source to help decide which sources to include or exclude, so as to improve the cost-effectiveness of existing data analysis. In this paper we use mutual information, a concept from information theory that measures the information gained or lost between two datasets, as the basis for computing data support. The flexibility and effectiveness of the approach are demonstrated using an application in hydrological analysis, specifically watersheds in the state of Nebraska.
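To make the role of mutual information concrete, the following is a minimal sketch of how it could be estimated between two paired data sources. It is not the paper's implementation: the equal-width binning step, the function names `discretize` and `mutual_information`, and the choice of a plug-in (frequency-count) estimator in bits are all assumptions made for illustration.

```python
import math
from collections import Counter


def discretize(values, bins=4):
    """Map real values to equal-width bin indices.

    Assumed preprocessing step: the abstract does not say how
    continuous measurements are discretized.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    # Clamp the maximum value into the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# A source that duplicates another shares all of its information,
# while a source that varies independently shares none.
redundant = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])    # 1.0 bit
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])  # 0.0 bits
```

Under these assumptions, a candidate source whose mutual information with an already included source is high adds little new information and could be considered for exclusion, while a low value suggests the source carries information the existing data does not. How such scores are turned into an include or exclude decision is left to the method described in the paper.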