The analysis of large and complex networks, or graphs, is becoming increasingly important in many scientific areas, including machine learning, social network analysis and bioinformatics. One natural question in network analysis is: “Given two sets R and T of individuals in a graph, with complete and missing knowledge, respectively, about a property of interest, which individuals in T are closest to R with respect to this property?” To answer this question, we can rank the individuals in T so that those ranked highest are most likely to exhibit the property of interest. Several methods based on weighted paths in the graph and on Markov chain models have been proposed to solve this task. In this paper, we show that previously published approaches can be improved by rephrasing the problem as property prediction in graph-structured data from positive examples (the individuals in R) and unlabelled data (the individuals in T), and by applying an inexpensive iterative neighbourhood majority vote prediction algorithm (“iNMV”) to this task. We evaluate iNMV and two previously proposed Markov chain methods on three real-world graphs in terms of the ROC AUC statistic. iNMV obtains rankings that are either significantly better or not significantly worse than those obtained from the more complex Markov chain based algorithms, while reducing run time by one order of magnitude on large graphs.
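The abstract does not spell out the iNMV update rule, so the following Python sketch only illustrates the general idea of ranking T by an iterated neighbourhood vote; it should not be read as the authors' algorithm. The function name `rank_by_neighbourhood_vote`, the adjacency-dictionary graph representation, the averaging update, and the `n_iter` parameter are all assumptions made for illustration. Individuals in R keep a fixed score of 1, and each individual in T repeatedly takes the mean score of its neighbours.

```python
from typing import Dict, Hashable, Iterable, List, Set

Graph = Dict[Hashable, Set[Hashable]]


def rank_by_neighbourhood_vote(
    graph: Graph,
    positives: Iterable[Hashable],
    unlabelled: Iterable[Hashable],
    n_iter: int = 10,  # hypothetical parameter: number of voting rounds
) -> List[Hashable]:
    """Rank unlabelled nodes by an iterated average of neighbour scores.

    Illustrative sketch only: the update rule is an assumption, not the
    published iNMV procedure.
    """
    targets = list(unlabelled)
    scores = {v: 0.0 for v in graph}
    for v in positives:
        scores[v] = 1.0  # known positives (set R) stay fixed at 1

    for _ in range(n_iter):
        updated = dict(scores)
        for v in targets:
            neighbours = graph.get(v, set())
            if neighbours:
                # soft "vote": mean of the neighbours' current scores
                total = sum(scores.get(u, 0.0) for u in neighbours)
                updated[v] = total / len(neighbours)
        scores = updated  # synchronous update after each round

    # highest score first: most likely to share the property of interest
    return sorted(targets, key=lambda v: scores[v], reverse=True)


# Toy undirected graph: a - b, a - c, b - c, c - d, d - e
toy_graph: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
print(rank_by_neighbourhood_vote(toy_graph, {"a"}, ["b", "c", "d", "e"]))
# ['b', 'c', 'd', 'e']
```

Because this sketch has only positive examples as fixed points, the scores of all connected nodes drift towards 1 as the number of rounds grows, so a small, fixed `n_iter` is what keeps the ranking informative here.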