Parent assignment is hard for the MDL, AIC, and NML costs

Authors:
Mikko Koivisto
Affiliations:
HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland
Venue:
COLT'06 Proceedings of the 19th annual conference on Learning Theory
Year:
2006

Abstract

Several hardness results are presented for the parent assignment problem: Given m observations of n attributes x_1, ..., x_n, find the best parents for x_n, that is, a subset of the preceding attributes so as to minimize a fixed cost function. This attribute or feature selection task plays an important role, e.g., in structure learning in Bayesian networks, yet little is known about its computational complexity. In this paper we prove that, under the commonly adopted full-multinomial likelihood model, the MDL, BIC, or AIC cost cannot be approximated in polynomial time to a ratio less than 2 unless there exists a polynomial-time algorithm for determining whether a directed graph with n nodes has a dominating set of size log n, a LOGSNP-complete problem for which no polynomial-time algorithm is known; as we also show, it is unlikely that these penalized maximum likelihood costs can be approximated to within any constant ratio. For the NML (normalized maximum likelihood) cost we prove an NP-completeness result. These results both justify the application of existing methods and motivate research on heuristic and super-polynomial-time algorithms.
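
To make the problem statement concrete, the following is a minimal Python sketch of parent assignment by exhaustive search. It is an illustration under stated assumptions, not the paper's construction: the abstract does not spell out the cost, so the sketch assumes the usual penalized maximum likelihood form (negative maximized log-likelihood plus a per-parameter penalty of (log m)/2 for MDL/BIC and 1 for AIC), with the free-parameter count of a full-multinomial conditional distribution. The function names `penalized_cost` and `best_parents` and the `arities` argument are hypothetical. The search enumerates all subsets of the candidate attributes, so its running time is exponential in their number, in line with the abstract's remark about super-polynomial-time algorithms.

```python
import math
from collections import Counter
from itertools import combinations


def penalized_cost(data, child, parents, arities, penalty="mdl"):
    """Assumed cost: -(maximized log-likelihood) + per-parameter penalty * k.

    data    : list of rows, each a sequence of discrete attribute values
    child   : column index of the attribute whose parents are sought
    parents : tuple of column indices forming the candidate parent set
    arities : number of possible values of each attribute (assumed known)
    """
    m = len(data)
    # Counts of (parent configuration, child value) and of parent configuration.
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    marginal = Counter(tuple(row[p] for p in parents) for row in data)
    # Maximized log-likelihood under the full-multinomial model:
    # each conditional probability is estimated by its relative frequency.
    loglik = sum(c * math.log(c / marginal[cfg]) for (cfg, _), c in joint.items())
    # Free parameters: (child arity - 1) per parent configuration.
    k = (arities[child] - 1) * math.prod(arities[p] for p in parents)
    per_param = 0.5 * math.log(m) if penalty == "mdl" else 1.0  # else: AIC
    return -loglik + per_param * k


def best_parents(data, child, candidates, arities, penalty="mdl"):
    """Exhaustive search over all subsets of `candidates` (exponential time)."""
    best = None
    for size in range(len(candidates) + 1):
        for parents in combinations(candidates, size):
            cost = penalized_cost(data, child, parents, arities, penalty)
            if best is None or cost < best[0]:
                best = (cost, parents)
    return best
```

For example, calling `best_parents(data, child=2, candidates=[0, 1], arities=[2, 2, 2])` on data in which the third attribute always copies the first returns the single-parent set `(0,)`, since adding the second attribute only increases the penalty.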