Statistical methods for speech recognition
Statistical methods for speech recognition
Diffusion Kernels on Graphs and Other Discrete Input Spaces
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Learning from Labeled and Unlabeled Data using Graph Mincuts
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Kernel Methods for Pattern Analysis
Kernel Methods for Pattern Analysis
Automatic multimedia cross-modal correlation discovery
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Application of kernels to link analysis
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Predictive low-rank decomposition for kernel methods
ICML '05 Proceedings of the 22nd international conference on Machine learning
Learning from labeled and unlabeled data on a directed graph
ICML '05 Proceedings of the 22nd international conference on Machine learning
An Experimental Investigation of Graph Kernels on a Collaborative Recommendation Task
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Fast Random Walk with Restart and Its Applications
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
The link-prediction problem for social networks
Journal of the American Society for Information Science and Technology
Classification in Networked Data: A Toolkit and a Univariate Case Study
The Journal of Machine Learning Research
IEEE Transactions on Knowledge and Data Engineering
Fast direction-aware proximity for graph mining
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Random walk with restart: fast solutions and applications
Knowledge and Information Systems
Fast incremental proximity search in large graphs
Proceedings of the 25th international conference on Machine learning
Graph transduction via alternating minimization
Proceedings of the 25th international conference on Machine learning
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Semi-supervised Classification from Discriminative Random Walks
ECML PKDD '08 Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part I
Graph nodes clustering with the sigmoid commute-time kernel: A comparative study
Data & Knowledge Engineering
Linear Neighborhood Propagation and Its Applications
IEEE Transactions on Pattern Analysis and Machine Intelligence
Randomized shortest-path problems: Two related models
Neural Computation
Affinity measures based on the graph Laplacian
TextGraphs-3 Proceedings of the 3rd Textgraphs Workshop on Graph-Based Algorithms for Natural Language Processing
Introduction to Semi-Supervised Learning
Introduction to Semi-Supervised Learning
Graph nodes clustering based on the commute-time kernel
PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
The Sum-over-Paths Covariance Kernel: A Novel Covariance Measure between Nodes of a Directed Graph
IEEE Transactions on Pattern Analysis and Machine Intelligence
Semi-Supervised Learning
Review of statistical network analysis: models, algorithms, and software
Statistical Analysis and Data Mining
A link-analysis-based discriminant analysis for exploring partially labeled graphs
Pattern Recognition Letters
Aggregation pheromone metaphor for semi-supervised classification
Pattern Recognition
A second order cone programming approach for semi-supervised learning
Pattern Recognition
This work addresses graph-based semi-supervised classification and betweenness computation in large, sparse networks (several million nodes). The objective of semi-supervised classification is to assign a label to each unlabeled node using the whole topology of the graph and the available labels. Two approaches are developed to avoid explicitly computing the pairwise proximities between the nodes of the graph, which would be impractical for graphs containing millions of nodes. The first approach directly computes, for each class, the sum of the similarities between the nodes to classify and the labeled nodes of the class, as initially suggested in [1,2]. Following this approach, two algorithms exploiting different state-of-the-art kernels on a graph are developed. The same strategy can also be used to compute a betweenness measure. The second approach works on a trellis structure built from biased random walks on the graph, extending an idea introduced in [3]. These random walks make it possible to define a biased bounded betweenness for the nodes of interest, defined separately for each class. All the proposed algorithms have a computing time that is linear in the number of edges while providing good results, and hence are applicable to large sparse networks. They are empirically validated on medium-size standard data sets and are shown to be competitive with state-of-the-art techniques. Finally, we processed a novel data set, which is made available for benchmarking, for multi-class classification in a large network: the U.S. patents citation network containing 3M nodes (of six different classes) and 38M edges. The three proposed algorithms achieve competitive results (around 85% classification rate) on this large network; they classify the unlabeled nodes within a few minutes on a standard workstation.
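As an illustration of the first approach (summing similarities per class without ever forming the pairwise similarity matrix), the following is a minimal sketch and not the authors' algorithm. It assumes a particular kernel, K = (I - alpha P)^(-1) with P the row-normalized adjacency matrix, which the abstract does not name; the function names `class_scores` and `classify` and the parameters `alpha` and `n_iter` are likewise hypothetical. The product K y_c, whose i-th entry is the sum of similarities between node i and the labeled nodes of class c, is approximated by the fixed-point iteration s <- y_c + alpha P s, so each pass costs time proportional to the number of edges.

```python
from collections import defaultdict


def class_scores(edges, labels, n_nodes, alpha=0.85, n_iter=50):
    """Approximate K y_c for each class c, with K = (I - alpha P)^(-1).

    Assumption (not from the source): P is the row-normalized adjacency
    matrix of an undirected graph. K itself is never built.
    """
    # Sparse adjacency lists for an undirected graph.
    neighbors = defaultdict(list)
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    scores = {}
    for c in sorted(set(labels.values())):
        # Indicator vector of the labeled nodes of class c.
        y = [1.0 if labels.get(i) == c else 0.0 for i in range(n_nodes)]
        s = y[:]
        for _ in range(n_iter):
            # One pass of s <- y + alpha * P s; cost is linear in the edges.
            new_s = y[:]
            for i in range(n_nodes):
                nbrs = neighbors.get(i, [])
                if nbrs:
                    new_s[i] += alpha * sum(s[j] for j in nbrs) / len(nbrs)
            s = new_s
        scores[c] = s
    return scores


def classify(edges, labels, n_nodes, **kwargs):
    """Assign each unlabeled node the class with the largest summed similarity."""
    scores = class_scores(edges, labels, n_nodes, **kwargs)
    return {
        i: max(scores, key=lambda c: scores[c][i])
        for i in range(n_nodes)
        if i not in labels
    }


# Toy graph: two triangles joined by the edge (2, 3), one labeled node each.
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
toy_labels = {0: "a", 5: "b"}
predictions = classify(toy_edges, toy_labels, n_nodes=6)
```

On this toy graph the unlabeled nodes of each triangle take the label of the labeled node in their own triangle. The per-class vectors are independent, so memory stays proportional to the number of nodes times the number of classes rather than to the square of the number of nodes.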