Computational approaches to a combinatorial optimization problem arising from text classification

  • Authors:
  • Sandro Bosio;Giovanni Righini

  • Affiliations:
  • Dipartimento di Matematica "Francesco Brioschi", Politecnico di Milano, Via Bonardi 9, 20133 Milano, Italy;Dipartimento di Tecnologie dell'Informazione, Università degli Studi di Milano, Via Bramante 65, 26013 Crema, Italy

  • Venue:
  • Computers and Operations Research
  • Year:
  • 2007

Abstract

We present a combinatorial optimization problem with a particular cost structure: a set of elements, subject to constraints, must be chosen from a ground set, and the ground set is partitioned into subsets corresponding to types of elements. The constraints concern the elements, whereas the solution cost depends not on the elements themselves but only on their types. The motivation for this study comes from text categorization, but we believe that the same combinatorial structure may emerge in many different contexts. We prove that the problem is NP-hard. We give a 0-1 linear programming formulation and report computational experience on very large instances, using branch-and-bound algorithms based on two different Lagrangean relaxations and heuristic algorithms based on Threshold Accepting and Simulated Annealing.
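
The cost structure described in the abstract can be illustrated with a small sketch. The abstract does not state the exact constraints or cost function, so everything below is an assumption made for illustration only: the constraints are taken to be covering requirements (each requirement must be met by at least one chosen element), the cost of a solution is taken to be the sum of the costs of the distinct types that appear in it, and the names `type_cost`, `is_feasible` and `brute_force` are hypothetical. The exhaustive search is only viable on toy instances and is not one of the algorithms studied in the paper.

```python
from itertools import combinations

def type_cost(selected, type_of, cost_of_type):
    """Cost depends only on which types appear, not on which elements (assumed form)."""
    used_types = {type_of[e] for e in selected}
    return sum(cost_of_type[t] for t in used_types)

def is_feasible(selected, requirements):
    """Assumed covering constraints: every requirement needs one chosen element."""
    chosen = set(selected)
    return all(chosen & covering for covering in requirements.values())

def brute_force(elements, type_of, cost_of_type, requirements):
    """Enumerate all subsets; exponential, for toy instances only."""
    best, best_cost = None, float("inf")
    for size in range(len(elements) + 1):
        for subset in combinations(elements, size):
            if not is_feasible(subset, requirements):
                continue
            cost = type_cost(subset, type_of, cost_of_type)
            if cost < best_cost:
                best, best_cost = set(subset), cost
    return best, best_cost

# Toy instance (invented data): four elements partitioned into three types.
elements = ["a", "b", "c", "d"]
type_of = {"a": "T1", "b": "T1", "c": "T2", "d": "T3"}
cost_of_type = {"T1": 3, "T2": 2, "T3": 4}
requirements = {"r1": {"a", "c"}, "r2": {"b", "d"}}

best, best_cost = brute_force(elements, type_of, cost_of_type, requirements)
print(best, best_cost)
```

In this toy instance, choosing two elements of the same type is paid for only once, which is why a solution with more elements can be cheaper than one that mixes types.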