COLT '92 Proceedings of the fifth annual workshop on Computational learning theory
Selective Sampling Using the Query by Committee Algorithm
Machine Learning
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Support Vector Machine Active Learning with Application sto Text Classification
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data
PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery
Mining Molecular Fragments: Finding Relevant Substructures of Molecules
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
gSpan: Graph-Based Substructure Pattern Mining
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Efficient Mining of Frequent Subgraphs in the Presence of Isomorphism
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Cyclic pattern kernels for predictive graph mining
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
A quickstart in frequent structure mining can make a difference
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Active learning using pre-clustering
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Graph indexing based on discriminative frequent structure analysis
ACM Transactions on Database Systems (TODS) - Special Issue: SIGMOD/PODS 2004
Active learning via transductive experimental design
ICML '06 Proceedings of the 23rd international conference on Machine learning
Mining significant graph patterns by leap search
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Hierarchical sampling for active learning
Proceedings of the 25th international conference on Machine learning
Direct mining of discriminative and essential frequent patterns via model-based search tree
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
ECML '07 Proceedings of the 18th European conference on Machine Learning
Effective multi-label active learning for text classification
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Identifying bug signatures using discriminative graph mining
Proceedings of the eighteenth international symposium on Software testing and analysis
COLT'07 Proceedings of the 20th annual conference on Learning theory
GAIA: graph classification using evolutionary computation
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Semi-supervised feature selection for graph classification
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
LIBSVM: A library for support vector machines
ACM Transactions on Intelligent Systems and Technology (TIST)
Measuring statistical dependence with hilbert-schmidt norms
ALT'05 Proceedings of the 16th international conference on Algorithmic Learning Theory
Hi-index | 0.00 |
Graph classification has become an important and active research topic in the last decade. Current research on graph classification focuses on mining discriminative subgraph features under supervised settings. The basic assumption is that a large number of labeled graphs are available. However, labeling graph data is quite expensive and time consuming for many real-world applications. In order to reduce the labeling cost for graph data, we address the problem of how to select the most important graph to query for the label. This problem is challenging and different from conventional active learning problems because there is no predefined feature vector. Moreover, the subgraph enumeration problem is NP-hard. The active sample selection problem and the feature selection problem are correlated for graph data. Before we can solve the active sample selection problem, we need to find a set of optimal subgraph features. To address this challenge, we demonstrate how one can simultaneously estimate the usefulness of a query graph and a set of subgraph features. The idea is to maximize the dependency between subgraph features and graph labels using an active learning framework. We propose a branch-and-bound algorithm to search for the optimal query graph and optimal features simultaneously. Empirical studies on nine real-world tasks demonstrate that the proposed method can obtain better accuracy on graph data than alternative approaches.