In analyzing data from social and communication networks, we encounter the problem of classifying objects that have explicit link structure among them. We study the problem of inferring the classification of all objects from a labeled subset, using only link-based information between objects. We abstract this as a labeling problem on multigraphs with weighted edges. We present two classes of algorithms, based on local and global similarities. We then focus on multigraphs induced by blog data and apply our general algorithms to infer labels such as age, gender, and location associated with each blog, based only on the link structure among the blogs. We perform a comprehensive set of experiments with real, large-scale blog data sets and show that significant accuracy is possible with little or no non-link information, and that our methods scale to millions of nodes and edges.
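To make the labeling problem concrete, the following is a minimal sketch of one generic local approach: iterative weighted neighbor voting on an undirected weighted multigraph. It is not the algorithm from the paper, whose details are not given in this abstract. The function name `propagate_labels`, the edge-tuple format, the `rounds` parameter, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate_labels(edges, seed_labels, rounds=10):
    """Hypothetical sketch: weighted neighbor voting on a multigraph.

    edges: iterable of (u, v, weight); parallel edges are allowed and
           their weights are summed.
    seed_labels: dict mapping node -> known label (never overwritten).
    Returns a dict mapping node -> label for every node that received one.
    """
    # Collapse parallel edges into a single summed weight per node pair.
    adj = defaultdict(lambda: defaultdict(float))
    for u, v, w in edges:
        adj[u][v] += w
        adj[v][u] += w

    labels = dict(seed_labels)
    for _ in range(rounds):
        updated = dict(labels)
        for node in adj:
            if node in seed_labels:
                continue  # assumption: seed labels are treated as ground truth
            votes = defaultdict(float)
            for nbr, w in adj[node].items():
                if nbr in labels:
                    votes[labels[nbr]] += w
            if votes:
                # Ties are broken by label order so the result is deterministic.
                updated[node] = max(sorted(votes), key=votes.get)
        if updated == labels:
            break  # no label changed in this round
        labels = updated
    return labels


# Toy example: two parallel links a-b, one link b-c, one heavy link c-d.
toy_edges = [("a", "b", 1.0), ("a", "b", 1.0), ("b", "c", 1.0), ("c", "d", 3.0)]
toy_seeds = {"a": "20s", "d": "30s"}
toy_result = propagate_labels(toy_edges, toy_seeds)
```

In this toy graph, node `b` takes the label of `a` because the two parallel links outweigh its single link to `c`, while `c` follows the heavier link to `d`. A global-similarity method would instead compare nodes across the whole graph rather than only their immediate neighbors.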