In a time of information glut, observations about complex systems and phenomena of interest are available in several application areas, such as biology and text. As a consequence, scientists have started searching for patterns that involve interactions among the objects of analysis, so that research on models and algorithms for network analysis has become a central theme for knowledge discovery and data mining (KDD). The intuitions behind the plethora of approaches rely upon a few basic types of networks, identified by specific local and global topological properties, which we term "pure" topology types.

In this paper, (1) we survey pure topology types along with existing sampling algorithms that generate them; (2) we introduce novel algorithms that enhance the diversity of samples and address the case of cellular topologies; (3) we perform statistical studies of the stability of the properties of pure types under alternative generative algorithms, and a joint study of the separability of pure types in terms of their embedding in a space of metrics for network analysis that are widely adopted in the social and physical sciences.

We conclude with a word of caution to practitioners who sample pure topology types to assess the "statistical significance" of their findings: for example, the p-value of the clustering coefficient is sensitive to the sampling algorithm used. We find that different pure types share similar topological properties. Further, real-world networks rarely exhibit the variability profile of a single pure type. We suggest the assumption of "mixtures of types" as an alternative starting point for developing models and algorithms for network analysis.
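The caution about p-values can be made concrete with a minimal sketch. It is not the paper's procedure: the two samplers (Erdos-Renyi G(n, p) and G(n, m)), the ring-lattice "observed" network, the sample count, and all function names are illustrative assumptions. It estimates an empirical p-value for the average clustering coefficient of one observed graph under two different generators of the same nominal random topology.

```python
import random
from itertools import combinations

def empty_graph(n):
    return {v: set() for v in range(n)}

def gnp(n, p, rng):
    """Erdos-Renyi G(n, p): each edge is present independently with probability p."""
    adj = empty_graph(n)
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def gnm(n, m, rng):
    """G(n, m): exactly m distinct edges chosen uniformly at random."""
    adj = empty_graph(n)
    for u, v in rng.sample(list(combinations(range(n), 2)), m):
        adj[u].add(v)
        adj[v].add(u)
    return adj

def ring_lattice(n, k):
    """Hypothetical 'observed' network: each node linked to its k nearest ring neighbors."""
    adj = empty_graph(n)
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj

def avg_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

def empirical_p_value(observed, null_values):
    """Share of null samples >= observed, with the usual +1 correction."""
    hits = sum(1 for x in null_values if x >= observed)
    return (hits + 1) / (len(null_values) + 1)

def compare_null_models(observed_adj, samples=200, seed=0):
    """Empirical p-value of the observed clustering under each sampler."""
    rng = random.Random(seed)
    n = len(observed_adj)
    m = sum(len(s) for s in observed_adj.values()) // 2
    p = m / (n * (n - 1) / 2)  # match the expected density of the observed graph
    observed = avg_clustering(observed_adj)
    null_gnp = [avg_clustering(gnp(n, p, rng)) for _ in range(samples)]
    null_gnm = [avg_clustering(gnm(n, m, rng)) for _ in range(samples)]
    return {
        "G(n,p)": empirical_p_value(observed, null_gnp),
        "G(n,m)": empirical_p_value(observed, null_gnm),
    }

print(compare_null_models(ring_lattice(30, 4)))
```

The two null distributions are matched only on density, so their p-values need not agree for a given observed network. Swapping in other generators for the same pure type is the kind of comparison the abstract cautions about.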