Abstract

The Partially Observable Markov Decision Process (POMDP) is a fundamental model for probabilistic planning in stochastic domains. More recent extensions, constrained and chance-constrained POMDPs, allow constraints to be specified on aspects of the policy in addition to the objective function. Despite their expressive power, these models assume all actions take a fixed duration, which limits their ability to model real-world planning problems. In this work, we propose a unified model for durative POMDPs and their constrained extensions. First, we convert these extensions into an Integer Linear Programming (ILP) formulation, which can be solved using existing solvers from the ILP literature. Second, we provide a heuristic search approach that efficiently prunes the search space, guided by solving successive partial ILP programs. Finally, evaluation results show that our approach empirically outperforms the state-of-the-art fixed-horizon chance-constrained POMDP solver.
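
To make the ILP step concrete, below is a minimal sketch of handing a small integer program to an off-the-shelf solver (here PuLP with its bundled CBC backend). The variables, rewards, and duration budget are illustrative toys standing in for the paper's actual POMDP encoding, which is not reproduced here.

```python
# Toy ILP solved with an existing solver, as the abstract describes.
# All names (actions, reward, duration, budget) are hypothetical examples,
# not the paper's real formulation.
import pulp

actions = ["a0", "a1", "a2"]
# Binary variables: whether each action is selected.
x = pulp.LpVariable.dicts("x", actions, cat=pulp.LpBinary)

prob = pulp.LpProblem("toy_durative_ilp", pulp.LpMaximize)

# Objective: maximize illustrative expected reward.
reward = {"a0": 3, "a1": 5, "a2": 4}
prob += pulp.lpSum(reward[a] * x[a] for a in actions)

# Constraint: illustrative action durations must fit a time budget,
# a stand-in for constraints arising from durative actions.
duration = {"a0": 2, "a1": 4, "a2": 3}
prob += pulp.lpSum(duration[a] * x[a] for a in actions) <= 5

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print({a: int(x[a].value()) for a in actions})  # e.g. {'a0': 1, 'a1': 0, 'a2': 1}
```

In the paper's heuristic search, successive partial programs of this form would be solved to bound and prune the search space; the sketch only shows the solver interface such an approach builds on.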

Majid Khonji, Duoaa Khalifa (2023). “A Unified Framework for POMDPs with Constraints and Durative Actions.” The 37th AAAI Conference on Artificial Intelligence (AAAI), Washington, DC, US. Acceptance rate 19%.