Motivation: Analogous to biological sequence comparison, comparing cellular networks is an important problem that could provide insight into biological understanding and therapeutics. For technical reasons, comparing large networks is computationally infeasible, and thus heuristics, such as the degree distribution, clustering coefficient, diameter, and relative graphlet frequency distribution, have been sought. It is easy to demonstrate that two networks are different by simply showing a short list of properties in which they differ. It is much harder to show that two networks are similar, as it requires demonstrating their similarity in all of their exponentially many properties. Clearly, it is computationally prohibitive to analyze all network properties, but the larger the number of constraints we impose in determining network similarity, the more likely it is that the networks will truly be similar.
Results: We introduce a new systematic measure of a network's local structure that imposes a large number of similarity constraints on networks being compared. In particular, we generalize the degree distribution, which measures the number of nodes 'touching' k edges, into distributions measuring the number of nodes 'touching' k graphlets, where graphlets are small connected non-isomorphic subgraphs of a large network. Our new measure of network local structure consists of 73 graphlet degree distributions of graphlets with 2-5 nodes, but it is easily extendible to a greater number of constraints (i.e. graphlets), if necessary, and the extensions are limited only by the available CPU. Furthermore, we show a way to combine the 73 graphlet degree distributions into a network 'agreement' measure, which is a number between 0 and 1, where 1 means that the networks have identical distributions and 0 means that they are far apart.
Based on this new network agreement measure, we show that almost all of the 14 eukaryotic PPI networks, including human, resulting from various high-throughput experimental techniques, as well as from curated databases, are better modeled by geometric random graphs than by Erdős-Rényi, random scale-free, or Barabási-Albert scale-free networks.
Availability: Software executables are available upon request.
Contact: natasha@ics.uci.edu
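To make the idea of nodes 'touching' k graphlets concrete, the following is a minimal Python sketch restricted to the graphlets on 2 and 3 nodes (an edge, an induced 3-node path, and a triangle), which give four node positions rather than the full 73. The abstract does not state how the distributions are combined, so the scoring here is an assumption of this sketch: each distribution is scaled by 1/k, normalized to sum to 1, compared by a Euclidean distance divided by sqrt(2), and the per-position scores are averaged. The function names (`build_adj`, `orbit_degrees`, `distribution`, `agreement`) are hypothetical and are not taken from the authors' software.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def build_adj(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def orbit_degrees(adj):
    """For each node, count how many times it touches each small graphlet position:
    (edge end, end of an induced 3-node path, middle of such a path, triangle)."""
    result = {}
    for v, nbrs in adj.items():
        deg = len(nbrs)
        # Triangles through v: pairs of neighbours that are themselves adjacent.
        tri = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        # v is the middle of an induced path for every non-adjacent neighbour pair.
        path_mid = deg * (deg - 1) // 2 - tri
        # v is a path end for each 2-step walk v-u-w that does not close a triangle;
        # every triangle through v accounts for two such walks.
        path_end = sum(len(adj[u]) - 1 for u in nbrs) - 2 * tri
        result[v] = (deg, path_end, path_mid, tri)
    return result


def distribution(values):
    """Assumed scaling: weight the count of nodes with value k by 1/k (k >= 1),
    then normalize so the weights sum to 1."""
    counts = Counter(k for k in values if k > 0)
    scaled = {k: c / k for k, c in counts.items()}
    total = sum(scaled.values())
    return {k: s / total for k, s in scaled.items()} if total else {}


def agreement(adj_g, adj_h, n_positions=4):
    """Average, over the four positions, of 1 minus a scaled Euclidean distance."""
    og, oh = orbit_degrees(adj_g), orbit_degrees(adj_h)
    scores = []
    for j in range(n_positions):
        ng = distribution(t[j] for t in og.values())
        nh = distribution(t[j] for t in oh.values())
        keys = set(ng) | set(nh)
        sq = sum((ng.get(k, 0.0) - nh.get(k, 0.0)) ** 2 for k in keys)
        scores.append(1.0 - sqrt(sq) / sqrt(2))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy graphs, invented for illustration only.
    g = build_adj([(0, 1), (1, 2), (0, 2), (2, 3)])  # triangle with a pendant node
    h = build_adj([(0, 1), (1, 2)])                  # 3-node path
    print(orbit_degrees(g))
    print(agreement(g, g), agreement(g, h))
```

Dividing by sqrt(2) keeps each per-position distance in [0, 1], since two distributions that each sum to 1 can be at most sqrt(2) apart in Euclidean distance. A graph compared with itself therefore scores 1, in line with the abstract's description of the measure.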