Identifier attributes (very high-dimensional categorical attributes such as particular product ids or people's names) are rarely incorporated in statistical modeling. However, they can play an important role in relational modeling: it may be informative to have communicated with a particular set of people or to have purchased a particular set of products. A key limitation of existing relational modeling techniques is how they aggregate bags (multisets) of values from related entities. The aggregations used by existing methods are simple summaries of the distributions of features of related entities: e.g., MEAN, MODE, SUM, or COUNT. This paper's main contribution is the introduction of aggregation operators that capture more information about the value distributions, by storing meta-data about value distributions and referencing this meta-data when aggregating, for example by computing class-conditional distributional distances. Such aggregations are particularly important for aggregating values from high-dimensional categorical attributes, for which the simple aggregates provide little information. In the first half of the paper we provide general guidelines for designing aggregation operators, introduce the new aggregators in the context of the relational learning system ACORA (Automated Construction of Relational Attributes), and provide theoretical justification. We also conjecture special properties of identifier attributes, e.g., that they proxy for unobserved attributes and for information deeper in the relationship network. In the second half of the paper we provide extensive empirical evidence that the distribution-based aggregators do facilitate modeling with high-dimensional categorical attributes, as well as evidence supporting the aforementioned conjectures.
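To make the idea of a class-conditional distributional distance concrete, the following is a minimal sketch and not ACORA's actual implementation. It assumes binary class labels, bags of identifier values (such as purchased product ids) attached to each target entity, and cosine distance as the distance measure; the abstract does not name a specific distance, so that choice is an assumption. All function names and the toy data are hypothetical.

```python
from collections import Counter
import math


def reference_distribution(bags):
    """Pool the bags of one class into a normalized value distribution.

    This is the stored "meta-data" that aggregation later refers to.
    """
    counts = Counter(value for bag in bags for value in bag)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def cosine_distance(bag, reference):
    """1 - cosine similarity between a bag's counts and a reference distribution.

    Cosine distance is an assumed choice made for this illustration.
    """
    counts = Counter(bag)
    dot = sum(c * reference.get(v, 0.0) for v, c in counts.items())
    norm_bag = math.sqrt(sum(c * c for c in counts.values()))
    norm_ref = math.sqrt(sum(p * p for p in reference.values()))
    if norm_bag == 0 or norm_ref == 0:
        return 1.0  # empty bag or empty reference: treat as maximally distant
    return 1.0 - dot / (norm_bag * norm_ref)


def distance_features(bag, ref_pos, ref_neg):
    """Turn a bag of identifiers into a few numeric features."""
    d_pos = cosine_distance(bag, ref_pos)
    d_neg = cosine_distance(bag, ref_neg)
    return {"dist_pos": d_pos, "dist_neg": d_neg, "diff": d_neg - d_pos}


# Hypothetical training data: (bag of product ids, class label).
train = [
    (["p1", "p2", "p2"], 1),
    (["p1", "p3"], 1),
    (["p7", "p8"], 0),
    (["p8", "p9", "p9"], 0),
]

# Estimate one reference distribution per class from the training bags.
ref_pos = reference_distribution([bag for bag, label in train if label == 1])
ref_neg = reference_distribution([bag for bag, label in train if label == 0])

# Aggregate a new entity's bag into distance-based features.
features = distance_features(["p2", "p1"], ref_pos, ref_neg)
print(features)
```

In contrast to COUNT or MODE, the resulting features depend on which identifiers appear in the bag and how those identifiers are distributed across the classes in the training data. In practice the reference distributions would need to be estimated only from training entities, so that the features do not leak test labels.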