There is a growing wealth of data describing networks of various types, including social networks, physical networks such as transportation or communication networks, and biological networks. At the same time, there is growing interest in analyzing these networks to uncover general laws that govern their structure and evolution, and to find patterns and predictive models that can inform better policies and practices. However, a fundamental challenge in dealing with this newly available observational data is that it is often of dubious quality: it is noisy and incomplete, and before any analysis method can be applied, the data must be cleaned and missing information inferred. In this paper, we introduce the notion of graph identification, which explicitly models the inference of a "cleaned" output network from a noisy input graph. It is this output network that is appropriate for further analysis. We present an illustrative example and use it to explore the types of inferences involved in graph identification, as well as the challenges involved in combining those inferences. We then present a simple, general approach to combining the inferences in graph identification, and we show experimentally both the utility of the combined approach and the sensitivity of graph identification performance to the inter-dependencies among these inferences.