Solving inverse frequent itemset mining with infrequency constraints via large-scale linear programs
ACM Transactions on Knowledge Discovery from Data (TKDD)
The inverse frequent set mining problem is the problem of computing a database in which a given collection of itemsets is frequent. Earlier studies focused on the computational and approximability properties of this problem. In this paper, we approach it from the pragmatic perspective of defining heuristic solution approaches that are effective and scalable in real scenarios. In particular, we consider a general formulation of the problem in which minimum and maximum support constraints can be defined on each itemset, and in which no bound is given beforehand on the size of the resulting output database. Within this setting, we propose an algorithm that always satisfies the maximum support constraints but treats the minimum support constraints as soft ones, enforcing them as far as possible. Thorough experimentation shows that minimum support constraints are rarely violated in practice, and that this negligible degradation in accuracy (which is unavoidable due to the theoretical intractability of the problem) is compensated by very good scaling performance.
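
To make the constraint setting concrete, the following minimal Python sketch checks a candidate database against per-itemset minimum and maximum support bounds. It is an illustration only, not the algorithm proposed in the paper: the names `support` and `check_constraints`, the use of absolute transaction counts as supports, and the toy data are all assumptions made for this example.

```python
from typing import Dict, FrozenSet, List, Tuple

Itemset = FrozenSet[str]


def support(database: List[Itemset], itemset: Itemset) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for transaction in database if itemset <= transaction)


def check_constraints(
    database: List[Itemset],
    constraints: Dict[Itemset, Tuple[int, int]],
) -> Dict[Itemset, Tuple[int, bool, bool]]:
    """Map each itemset to (support, min_satisfied, max_satisfied).

    `constraints` maps an itemset to a hypothetical (min_support, max_support)
    pair, expressed here as absolute transaction counts (an assumption).
    """
    report = {}
    for itemset, (min_sup, max_sup) in constraints.items():
        s = support(database, itemset)
        report[itemset] = (s, s >= min_sup, s <= max_sup)
    return report


# Toy candidate database: four transactions over the items a, b, c.
database = [frozenset("ab"), frozenset("abc"), frozenset("bc"), frozenset("a")]

# Hypothetical constraints: {a, b} must occur 2 to 3 times, {c} at most once.
constraints = {
    frozenset("ab"): (2, 3),
    frozenset("c"): (0, 1),
}

report = check_constraints(database, constraints)
for itemset, (s, min_ok, max_ok) in report.items():
    print(sorted(itemset), s, min_ok, max_ok)
```

In this toy instance the itemset {c} exceeds its maximum support. Under the setting described above, such a database would not be an acceptable output, since maximum support constraints are always satisfied, whereas a shortfall on a minimum support bound would be tolerated as a soft violation.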