Quality of information-based source assessment and selection

Authors:
Yaojin Lin;Xuegang Hu;Xindong Wu
Venue:
Neurocomputing
Year:
2014

Abstract

Multiple information sources describing the same set of objects can provide different representations, and combining their strengths may improve predictive power for a given task. However, some sources may be irrelevant or redundant. It is therefore worthwhile to select a subset of good information sources that improves learning performance, yet very little work has been reported on this topic. In this paper, we first identify two aspects of quality of information: source significance and source redundancy. Significance is the degree to which an information source contributes to classification, and redundancy is the information overlap among different information sources. We then propose a metric that combines neighborhood mutual information with a Max-Significance-Min-Redundancy algorithm, allowing us to select a compact set of superior information sources for classification learning. Extensive experiments show that the metric helps identify good information sources and that the proposed method outperforms many other methods.
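
The abstract gives no formulas, so the following is only a minimal sketch of how a Max-Significance-Min-Redundancy selection over information sources might look, not the authors' implementation. It assumes a commonly used definition of neighborhood mutual information (neighborhoods under the Chebyshev distance, with the joint neighborhood taken as the intersection), scores each candidate as its significance minus its mean redundancy with the sources already chosen, and uses an arbitrary radius `delta`. The function names `neighborhood`, `nmi`, and `select_sources`, and the toy data, are hypothetical.

```python
import numpy as np


def neighborhood(X, delta):
    """Boolean n x n matrix; entry (i, j) is True when sample j lies within
    Chebyshev distance delta of sample i (assumed neighborhood definition)."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    dist = np.abs(X[:, None, :] - X[None, :, :]).max(axis=2)
    return dist <= delta


def nmi(N_a, N_b):
    """Neighborhood mutual information between two neighborhood relations.
    Assumption: the joint neighborhood is the intersection of the two."""
    n = N_a.shape[0]
    size_a = N_a.sum(axis=1)
    size_b = N_b.sum(axis=1)
    size_ab = (N_a & N_b).sum(axis=1)  # always >= 1: a sample neighbors itself
    return float(-np.mean(np.log2(size_a * size_b / (n * size_ab))))


def select_sources(sources, y, k, delta=0.15):
    """Greedily pick up to k sources. Significance is NMI(source; labels);
    redundancy is the mean NMI with already selected sources. The difference
    score is an assumption borrowed from mRMR-style criteria."""
    N_y = neighborhood(y, 0.0)  # labels: neighbors are samples of the same class
    N = [neighborhood(S, delta) for S in sources]
    sig = [nmi(N_s, N_y) for N_s in N]

    selected = [int(np.argmax(sig))]  # start with the most significant source
    while len(selected) < min(k, len(sources)):
        best, best_score = None, -np.inf
        for i in range(len(sources)):
            if i in selected:
                continue
            red = np.mean([nmi(N[i], N[j]) for j in selected])
            score = sig[i] - red
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return selected


# Toy data (hypothetical): four classes, four single-feature sources.
y = np.array([0, 0, 1, 1, 2, 2, 3, 3])
src_a = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)  # splits {0,1} from {2,3}
src_b = src_a + 0.01                                      # near copy of src_a
src_c = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)  # splits {0,2} from {1,3}
src_d = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)  # unrelated to the labels
sources = [src_a, src_b, src_c, src_d]

picked = select_sources(sources, y, k=2)
print(picked)
```

On this toy data, the near copy `src_b` is as significant as `src_a` but fully redundant with it, so under the assumed score the second pick goes to the complementary `src_c` rather than to the duplicate. Other ways of trading significance against redundancy (a ratio, or a weighted difference) would fit the same greedy loop.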