It's who you know: graph mining using recursive structural features

Authors:
Keith Henderson; Brian Gallagher; Lei Li; Leman Akoglu; Tina Eliassi-Rad; Hanghang Tong; Christos Faloutsos
Affiliations:
Lawrence Livermore National Laboratory, Livermore, CA, USA; Lawrence Livermore National Laboratory, Livermore, CA, USA; Carnegie Mellon University, Pittsburgh, PA, USA; Carnegie Mellon University, Pittsburgh, PA, USA; Rutgers University, Piscataway, NJ, USA; IBM Watson, Hawthorne, NY, USA; Carnegie Mellon University, Pittsburgh, PA, USA
Venue:
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2011

Abstract

Given a graph, how can we extract good features for the nodes? For example, given two large graphs from the same domain, how can we use information in one to do classification in the other (i.e., perform across-network classification or transfer learning on graphs)? Also, if one of the graphs is anonymized, how can we use information in one to de-anonymize the other? The key step in all such graph mining tasks is to find effective node features. We propose ReFeX (Recursive Feature eXtraction), a novel algorithm that recursively combines local (node-based) features with neighborhood (egonet-based) features and outputs regional features that capture "behavioral" information. We demonstrate how these regional features can be used in within-network and across-network classification and de-anonymization tasks without relying on homophily or on the availability of class labels. The contributions of our work are as follows: (a) ReFeX is scalable, and (b) it is effective, capturing regional ("behavioral") information in large graphs. We report experiments on real graphs from various domains with over 1M edges, where ReFeX outperforms its competitors on typical graph mining tasks such as network classification and de-anonymization.
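
The recursive combination of local and egonet features described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify which base features or aggregation functions are used, so the choices here (degree, edges within the egonet, edges leaving the egonet, and sum/mean aggregation over neighbors) are assumptions made for illustration, and the function names `build_adjacency`, `base_features`, and `recursive_features` are hypothetical. Any pruning of redundant features is omitted.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, skipping self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def base_features(adj):
    """Assumed base features per node: [degree, edges inside egonet, edges leaving egonet]."""
    feats = {}
    for node, nbrs in adj.items():
        ego = nbrs | {node}  # the egonet: the node plus its direct neighbors
        within = 0
        leaving = 0
        for u in ego:
            for v in adj[u]:
                if v in ego:
                    within += 1  # each internal edge is seen from both endpoints
                else:
                    leaving += 1  # boundary edges are seen once, from the inside
        feats[node] = [float(len(nbrs)), within / 2.0, float(leaving)]
    return feats


def recursive_features(adj, feats, iterations=1):
    """Append sums and means of the neighbors' most recent features, repeatedly.

    Sum/mean aggregation is an assumption of this sketch. Each pass widens the
    region a feature summarizes by one hop.
    """
    current = {n: list(f) for n, f in feats.items()}
    result = {n: list(f) for n, f in feats.items()}
    for _ in range(iterations):
        width = len(next(iter(current.values())))
        new = {}
        for node, nbrs in adj.items():
            sums = [sum(current[v][i] for v in nbrs) for i in range(width)]
            means = [s / len(nbrs) for s in sums]
            new[node] = sums + means
        for node in adj:
            result[node].extend(new[node])
        current = new  # the next pass aggregates the features just created
    return result


# Toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
adj = build_adjacency(edges)
regional = recursive_features(adj, base_features(adj), iterations=2)
```

In this toy graph, nodes 0 and 1 occupy the same structural position and so receive identical feature vectors, while the pendant node 3 and the hub node 2 differ from them. The feature vectors depend only on structure, not on node identities or labels, which is the property the abstract relies on for across-network classification and de-anonymization.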