Data mining can be viewed, in many instances, as the task of computing a representation of a theory of a model or a database, in particular by finding a set of maximally specific sentences satisfying some property. We prove some hardness results that rule out simple approaches to solving the problem.

The a priori algorithm has been successfully applied to many instances of the problem. We analyze this algorithm and prove that it is optimal when the maximally specific sentences are "small". We also point out its limitations.

We then present a new algorithm, the Dualize and Advance algorithm, and prove worst-case complexity bounds that are favorable in the general case. Our results use the concept of hypergraph transversals. Our analysis shows that the a priori algorithm can solve the problem of enumerating the transversals of a hypergraph, improving on previously known results in a special case. On the other hand, using results for the general case of the hypergraph transversal enumeration problem, we can show that the Dualize and Advance algorithm has worst-case running time that is sub-exponential in the output size (i.e., the number of maximally specific sentences).

We further show that the problem of finding maximally specific sentences is closely related to the problem of exact learning with membership queries studied in computational learning theory.
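To make the setting concrete, the following is a minimal sketch of a levelwise (a priori style) search on a toy frequent-itemset instance, together with a brute-force transversal computation. It is not the paper's algorithm or analysis. The names `levelwise_maximal`, `minimal_transversals`, and `is_frequent`, the toy transactions, and the support threshold of 2 are all assumptions made for illustration. The sketch assumes the property is monotone, meaning every subset of an interesting set is itself interesting.

```python
from itertools import combinations


def levelwise_maximal(items, is_interesting):
    """Return the maximal interesting subsets of `items`.

    Assumes every subset of an interesting set is interesting.
    """
    items = sorted(items)
    empty = frozenset()
    level = {empty} if is_interesting(empty) else set()
    interesting = set(level)
    while level:
        candidates = set()
        for s in level:
            for i in items:
                if i in s:
                    continue
                c = s | {i}
                # Keep c only if all subsets one element smaller survived.
                if all(c - {j} in level for j in c):
                    candidates.add(c)
        # One query to the interestingness predicate per candidate.
        level = {c for c in candidates if is_interesting(c)}
        interesting |= level
    return {s for s in interesting if not any(s < t for t in interesting)}


def minimal_transversals(universe, edges):
    """Brute-force minimal hitting sets of `edges` (exponential time)."""
    universe = sorted(universe)
    found = []
    # Enumerating by increasing size guarantees minimality of what is kept.
    for k in range(len(universe) + 1):
        for combo in combinations(universe, k):
            t = frozenset(combo)
            if all(t & e for e in edges) and not any(f <= t for f in found):
                found.append(t)
    return set(found)


# Hypothetical toy data: a set is "interesting" if at least 2 transactions contain it.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"d"}]
items = {"a", "b", "c", "d"}


def is_frequent(s):
    return sum(1 for t in transactions if s <= t) >= 2


maximal = levelwise_maximal(items, is_frequent)
complements = {frozenset(items) - m for m in maximal}
border = minimal_transversals(items, complements)
print(sorted(sorted(m) for m in maximal))  # [['a', 'b'], ['a', 'c']]
print(sorted(sorted(b) for b in border))   # [['b', 'c'], ['d']]
```

On this toy instance the minimal transversals of the complements of the maximal frequent sets are exactly the minimal infrequent sets, {d} and {b, c}, which illustrates the kind of connection to hypergraph transversals that the abstract refers to. The brute-force transversal routine is only a stand-in and says nothing about the complexity bounds proved in the paper.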