In this paper, we address the problem of classifying positive and negative data with the technique known as box clustering. A box is homogeneous if it contains only positive points or only negative points. Box clustering means finding a family of homogeneous boxes that jointly contain all the positive (respectively, negative) points and no points of the other class. We first consider the problem of finding such a family with the minimum number of boxes. We then refine this problem: among the families with the minimum number of boxes, we seek one whose boxes cover the points as many times as possible. We call this the maximum redundancy problem. We model both problems as set covering problems solved by column generation. The pricing problem is a Maximum Box problem. Although this problem is NP-hard, a combinatorial algorithm that performs well is available in the literature. Since pricing must also be carried out within the branch-and-bound search for the set covering problem, we also consider how the pricing must be modified to account for the branching constraints. The computational results show that the set covering approach performs well.
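
The definitions above can be illustrated on a toy instance. The following is a minimal sketch, not the column generation method of the paper: it enumerates candidate boxes by brute force (exponential in the number of points) and then searches for a smallest covering family with the highest redundancy. All names (`contains`, `is_homogeneous`, `candidate_boxes`, `best_cover`) and the sample data are hypothetical and chosen only for illustration; boxes are assumed to be axis-parallel and closed.

```python
from itertools import combinations

# A box is a pair (lower corner, upper corner); points are coordinate tuples.

def contains(box, p):
    """True if point p lies inside the closed axis-parallel box."""
    lo, hi = box
    return all(l <= x <= h for l, x, h in zip(lo, p, hi))

def is_homogeneous(box, negatives):
    """A box built from positive points is homogeneous if it holds no negative point."""
    return not any(contains(box, q) for q in negatives)

def bounding_box(points):
    """Smallest axis-parallel box containing the given points."""
    lo = tuple(min(c) for c in zip(*points))
    hi = tuple(max(c) for c in zip(*points))
    return (lo, hi)

def candidate_boxes(positives, negatives):
    """Homogeneous bounding boxes of all nonempty subsets of positives (toy sizes only)."""
    boxes = set()
    for r in range(1, len(positives) + 1):
        for subset in combinations(positives, r):
            box = bounding_box(subset)
            if is_homogeneous(box, negatives):
                boxes.add(box)
    return sorted(boxes)

def redundancy(positives, family):
    """Total number of (box, point) incidences, i.e. how often points are covered."""
    return sum(contains(b, p) for b in family for p in positives)

def best_cover(positives, boxes):
    """Among the smallest families covering every positive point, pick one of maximum redundancy."""
    for k in range(1, len(boxes) + 1):
        covers = [
            f for f in combinations(boxes, k)
            if all(any(contains(b, p) for b in f) for p in positives)
        ]
        if covers:
            return list(max(covers, key=lambda f: redundancy(positives, f)))
    return []

# Hypothetical sample data: one negative point separates two groups of positives.
positives = [(0, 0), (1, 1), (3, 3), (4, 4)]
negatives = [(2, 2)]

boxes = candidate_boxes(positives, negatives)
family = best_cover(positives, boxes)
```

On this instance no single homogeneous box can contain all positive points, because any box spanning both groups also contains the negative point `(2, 2)`, so the smallest family has two boxes. For realistic data sizes the enumeration is infeasible, which is why the abstract turns to set covering with column generation instead.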