Learning relational probability trees

  • Authors:
  • Jennifer Neville, David Jensen, Lisa Friedland, Michael Hay

  • Affiliations:
  • University of Massachusetts, Amherst, MA (all authors)

  • Venue:
  • Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '03)
  • Year:
  • 2003

Abstract

Classification trees are widely used in the machine learning and data mining communities for modeling propositional data. Recent work has extended this basic paradigm to probability estimation trees. Traditional tree learning algorithms assume that instances in the training data are homogeneous and independently distributed. Relational probability trees (RPTs) extend standard probability estimation trees to a relational setting in which data instances are heterogeneous and interdependent. Our algorithm for learning the structure and parameters of an RPT searches over a space of relational features that use aggregation functions (e.g., AVERAGE, MODE, COUNT) to dynamically propositionalize relational data and create binary splits within the RPT. Previous work has identified a number of statistical biases due to characteristics of relational data such as autocorrelation and degree disparity. The RPT algorithm uses a novel form of randomization test to adjust for these biases. On a variety of relational learning tasks, RPTs built using randomization tests are significantly smaller than other models and achieve equivalent or better performance.
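
The abstract names two mechanisms: aggregation functions (AVERAGE, MODE, COUNT) that flatten a variable-size set of related objects into a single binary split candidate, and a randomization test that scores a candidate feature against chance. The Python sketch below illustrates both ideas; the names `aggregate_feature` and `randomization_p_value` are hypothetical, and the plain permutation test shown here only stands in for, and does not reproduce, the paper's bias-adjusted randomization procedure.

```python
import random
from statistics import mean, mode

def aggregate_feature(related_values, agg, threshold):
    """Collapse a variable-size bag of values from related objects into
    one binary split, in the spirit of aggregation-based
    propositionalization. (Illustrative only; not the authors' feature space.)"""
    if agg == "COUNT":
        value = len(related_values)
    elif agg == "AVERAGE":
        value = mean(related_values)
    elif agg == "MODE":
        value = mode(related_values)
    else:
        raise ValueError(f"unknown aggregator: {agg}")
    return value >= threshold

def randomization_p_value(bags, labels, feature, n_trials=1000):
    """Estimate how often a random relabeling scores at least as well as
    the observed labeling -- a generic permutation test standing in for
    the paper's randomization test (which additionally adjusts for biases
    such as autocorrelation and degree disparity)."""
    def score(lbls):
        pos = [y for bag, y in zip(bags, lbls) if feature(bag)]
        neg = [y for bag, y in zip(bags, lbls) if not feature(bag)]
        if not pos or not neg:
            return 0.0
        # Association score: gap in positive-class rate across the split.
        return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

    observed = score(labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_trials):
        random.shuffle(shuffled)
        if score(shuffled) >= observed:
            exceed += 1
    return exceed / n_trials

# Toy usage: does "COUNT of related objects >= 5" separate labels better than chance?
bags = [[1, 2], [1, 1, 1, 1, 1, 1], [2], [3, 3, 3, 3, 3]]
labels = [0, 1, 0, 1]
split = lambda bag: aggregate_feature(bag, "COUNT", 5)
print(randomization_p_value(bags, labels, split, n_trials=2000))
```

A low p-value keeps the split candidate; a high one flags it as likely due to chance. Rejecting such spurious candidates is how significance testing of this kind can yield the smaller trees the abstract reports.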