In analyzing data from social and communication networks, we encounter the problem of classifying objects that have an explicit link structure among them. We study the problem of inferring the classification of all the objects from a labeled subset, using only the link-based information among the objects. We abstract this as a labeling problem on multigraphs with weighted edges. We present two classes of algorithms, based on local and global similarities. We then focus on multigraphs induced by blog data and apply our general algorithms to infer labels such as age, gender, and location associated with each blog, based only on the link structure among the blogs. We perform a comprehensive set of experiments with real, large-scale blog data sets and show that significant accuracy is possible with little or no non-link information, and that our methods scale to millions of nodes and edges.
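
To make the labeling problem concrete, the following is a minimal sketch of one possible local, neighbor-based approach: each unlabeled node repeatedly adopts the label carrying the largest total edge weight among its already-labeled neighbors. It is an illustration under stated assumptions, not the algorithm from the paper. The function name `local_iterative_labeling`, the `(u, v, weight)` edge format, the treatment of links as undirected, and the `max_rounds` parameter are all hypothetical choices made for this example.

```python
from collections import defaultdict


def local_iterative_labeling(edges, seed_labels, max_rounds=10):
    """Spread labels over a weighted multigraph by weighted neighbor vote.

    edges: iterable of (u, v, weight); parallel edges add their weights.
    seed_labels: dict mapping node -> known label (never overwritten).
    Assumption for this sketch: links are treated as undirected.
    """
    # Collapse the multigraph into summed weights per neighbor pair.
    neighbors = defaultdict(lambda: defaultdict(float))
    for u, v, w in edges:
        neighbors[u][v] += w
        neighbors[v][u] += w

    labels = dict(seed_labels)
    for _ in range(max_rounds):
        updates = {}
        for node in neighbors:
            if node in seed_labels:
                continue  # labeled nodes keep their given label
            # Sum edge weight per label over currently labeled neighbors.
            votes = defaultdict(float)
            for nbr, w in neighbors[node].items():
                if nbr in labels:
                    votes[labels[nbr]] += w
            if votes:
                # Sorting first makes ties resolve to the smallest label.
                best = max(sorted(votes), key=lambda lab: votes[lab])
                if labels.get(node) != best:
                    updates[node] = best
        if not updates:
            break  # no label changed in this round
        # Apply all changes at once so each round uses the previous state.
        labels.update(updates)
    return labels
```

Nodes with no path to a labeled node stay unlabeled in this sketch. A global-similarity method would instead use the structure of the whole graph rather than only immediate neighbors.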