Automating the evaluation of planning systems

  • Authors:
  • Carlos Linares López; Sergio Jiménez; Malte Helmert

  • Affiliations:
  • Computer Science Department, Universidad Carlos III de Madrid, Madrid, Spain. E-mails: {carlos.linares, sergio.jimenez}@uc3m.es; Department of Mathematics and Computer Science, Universität Basel, Basel, Switzerland. E-mail: malte.helmert@unibas.ch

  • Venue:
  • AI Communications
  • Year:
  • 2013

Abstract

Research in automated planning is increasingly focused on empirical evaluation, and the need for methodologies and benchmarks that support solid evaluations of planners is growing accordingly. In 1998 the planning community moved to address this need and initiated the International Planning Competition, or IPC for short. The competition has typically been held every two years in the context of the International Conference on Automated Planning and Scheduling (ICAPS) and seeks to define standard metrics and benchmarks for reliably evaluating planners. In the sixth edition of the competition, IPC 2008, there was an attempt to automate the evaluation of all entries; this approach was followed to a large extent, and extended in several ways, in the seventh edition, IPC 2011. As a result, software for automatically running planning experiments and inspecting the results is now available, and researchers are encouraged to use it for their own research interests. The software allows researchers to reproduce and inspect the results of IPC 2011, but also to generate and analyze new experiments with private sets of planners and problems. In this paper we provide a gentle introduction to this software and examine the main difficulties, from both a scientific and an engineering point of view, in assessing the performance of automated planners.
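
To make the idea of automatically running planning experiments concrete, the following is a minimal, generic sketch and not the IPC software described in the paper: each planner is run on each benchmark problem under a time limit, and solvability and runtime are recorded. The planner commands, the `benchmarks/` directory layout, and the `domain.pddl` naming convention are assumptions for illustration only.

```python
import csv
import subprocess
import time
from pathlib import Path

# Hypothetical planner invocations; in a real competition setting these would
# be the competitors' executables with their required command-line arguments.
PLANNERS = {
    "planner-a": ["./planner-a", "--domain", "{domain}", "--problem", "{problem}"],
    "planner-b": ["./planner-b", "{domain}", "{problem}"],
}
BENCHMARKS = sorted(Path("benchmarks").glob("*/p*.pddl"))  # assumed layout
TIME_LIMIT = 1800  # seconds per run, a limit used in recent IPC editions

def run(cmd, timeout):
    """Run one planner on one problem; return (solved, wall-clock time)."""
    start = time.time()
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout)
        return result.returncode == 0, time.time() - start
    except subprocess.TimeoutExpired:
        return False, timeout

with open("results.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["planner", "problem", "solved", "time"])
    for problem in BENCHMARKS:
        domain = problem.parent / "domain.pddl"  # assumed naming convention
        for name, template in PLANNERS.items():
            cmd = [arg.format(domain=domain, problem=problem) for arg in template]
            solved, wall_time = run(cmd, TIME_LIMIT)
            writer.writerow([name, problem, solved, f"{wall_time:.1f}"])
```

A table such as results.csv is the kind of raw data from which competition-style metrics (coverage, runtime scores, plan quality) can then be computed and inspected.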