Association analysis-based transformations for protein interaction networks: a function prediction case study

Authors:
Gaurav Pandey;Michael Steinbach;Rohit Gupta;Tushar Garg;Vipin Kumar
Affiliations:
University of Minnesota;University of Minnesota;University of Minnesota;University of Minnesota;University of Minnesota
Venue:
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2007

Abstract

Protein interaction networks are one of the most promising types of biological data for the discovery of functional modules and the prediction of individual protein functions. However, it is known that these networks are both incomplete and inaccurate, i.e., they have spurious edges and lack biologically valid edges. One way to handle this problem is by transforming the original interaction graph into new graphs that remove spurious edges, add biologically valid ones, and assign reliability scores to the edges constituting the final network. We investigate existing methods and propose a robust association analysis-based method for this task. This method is based on the concept of h-confidence, a measure that can be used to extract groups of objects having high similarity with each other. Experimental evaluation on several protein interaction data sets shows that hyperclique-based transformations significantly enhance the performance of standard function prediction algorithms, and thus have merit.
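
As a rough illustration of the transformation idea, the sketch below scores every protein pair by h-confidence and keeps the pairs above a threshold. It is not the authors' implementation. It assumes an unweighted, undirected interaction graph in which each protein is treated as an item whose "transactions" are its neighbors, so that the h-confidence of a pair reduces to the number of shared neighbors divided by the larger of the two degrees. The function names (`build_neighbors`, `h_confidence`, `transform`), the `min_hconf` parameter, and the toy edge list are hypothetical.

```python
from itertools import combinations


def build_neighbors(edges):
    """Map each protein to the set of its interaction partners."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors


def h_confidence(neighbors, p, q):
    """Pairwise h-confidence under the assumption stated above:
    |N(p) & N(q)| / max(|N(p)|, |N(q)|)."""
    n_p, n_q = neighbors[p], neighbors[q]
    denom = max(len(n_p), len(n_q))
    if denom == 0:
        return 0.0
    return len(n_p & n_q) / denom


def transform(edges, min_hconf):
    """Return a new weighted edge set: every protein pair whose
    h-confidence reaches min_hconf, with the score as its weight.
    Original edges below the threshold are dropped; pairs that were
    not connected originally can be added."""
    neighbors = build_neighbors(edges)
    scored = {}
    for p, q in combinations(sorted(neighbors), 2):
        score = h_confidence(neighbors, p, q)
        if score >= min_hconf:
            scored[(p, q)] = score
    return scored


# Toy network (hypothetical protein names).
edges = [("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("A", "B"), ("E", "A")]
for pair, score in sorted(transform(edges, min_hconf=0.5).items()):
    print(pair, round(score, 2))
```

In this toy network, C and D do not interact directly but share all of their neighbors, so the transformed graph gains a C-D edge with weight 1.0. The original A-E edge has no shared-neighbor support and is dropped. The retained scores can serve as edge reliability weights for a downstream function prediction method.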