We present the application of feature mining techniques to the Developmental Therapeutics Program's AIDS antiviral screen database. The database consists of 43576 compounds, which were measured for their capability to protect human cells from HIV-1 infection. According to these measurements, the compounds were classified as either active, moderately active or inactive. The distribution of classes is extremely skewed: only 1.3% of the molecules are known to be active, and 2.7% are known to be moderately active.

Given this database, we were interested in molecular substructures (i.e., features) that are frequent in the active molecules and infrequent in the inactive ones. In data mining terms, we focused on features with a minimum support in active compounds and a maximum support in inactive compounds. We analyzed the database using the levelwise version space algorithm that forms the basis of the inductive query and database system MOLFEA (Molecular Feature Miner). Within this framework, it is possible to specify the features of interest declaratively, through constraints on their frequency in (possibly different) datasets as well as on their generality and syntax. Assuming that the detected substructures are causally related to biochemical mechanisms, it should be possible to facilitate the development of new pharmaceuticals with improved activities.
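The pair of frequency constraints described above (a minimum support in the actives, a maximum support in the inactives) can be illustrated with a small sketch. This is not MOLFEA and not the levelwise version space algorithm: it is a brute-force filter over a fixed candidate list, and it assumes that molecules are plain strings and that a feature "occurs" in a molecule when it is a substring of it. The function names, the toy molecules and the thresholds are all hypothetical.

```python
from typing import List


def support(feature: str, molecules: List[str]) -> int:
    """Count the molecules containing the feature.

    Assumption: naive substring matching stands in for real
    substructure matching on molecular graphs.
    """
    return sum(1 for molecule in molecules if feature in molecule)


def select_features(
    candidates: List[str],
    actives: List[str],
    inactives: List[str],
    min_active: int,
    max_inactive: int,
) -> List[str]:
    """Keep candidates that are frequent in actives and rare in inactives."""
    return [
        feature
        for feature in candidates
        if support(feature, actives) >= min_active
        and support(feature, inactives) <= max_inactive
    ]


# Toy data, invented for illustration only.
actives = ["CCN=N", "OCN=N", "N=NCC"]
inactives = ["CCO", "CCC", "OCN"]
candidates = ["N=N", "CC", "OC", "CN"]

result = select_features(
    candidates, actives, inactives, min_active=2, max_inactive=0
)
print(result)  # ['N=N']
```

In this toy run only `N=N` satisfies both constraints: `CC` and `CN` are frequent in the actives but also occur in an inactive molecule, and `OC` occurs in just one active molecule. A levelwise search would reach the same set without enumerating every candidate, by exploiting the fact that the minimum-support constraint can only fail as features become more specific.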