Finding Memoryless Policies in Partially Observable MDPs is ETR-complete (YR-OWLS)

Sebastian Junges (UC Berkeley)

28-Oct-2020, 14:00-15:00 (3 years ago)

Abstract: Partially observable Markov Decision Processes (POMDPs) are a prime model in sequential decision making. They are heavily studied in operations research, artificial intelligence and verification, to name a few. For POMDPs, a good policy reaches the target with probability at some threshold, and makes its decisions solely based on the previously made observations. Deciding whether a good policy exists is undecidable. One of the methods to solve instances of this reachability problem is to restrict the memory that the strategy may use. This approach is commonly taking, e.g., in reinforcement learning for POMDPs. In the first part of the talk, we consider memoryless strategies in POMDPs. and show that this problem is as hard as the feasibility problem in parametric Markov Chain (pMC) analysis.

In the second part of the talk, we consider this feasibility problem for pMCs. Roughly, a pMC is a Markov chain with symbolic transitions. The feasibility problem asks: Do values for the symbols exist such that in the induced parameter-free Markov chain, a target state is reached with probability at least a half. We discuss the complexity landscape for variants of this decision problem. In particular, we establish that feasibility in pMCs is complete for the complexity class "existential theory of the reals” (ETR). Another example of an ETR-complete problem is deciding whether a multivariate polynomial has a real root. Together with the results of the first half of the talk, this establishes that deciding whether there exists a good memoryless policy in a POMDP is ETR-complete.

formal languages and automata theorylogic in computer sciencelogic

Audience: researchers in the discipline


Online Worldwide Seminar on Logic and Semantics (OWLS)

Series comments: The Online Worldwide Seminar on Logic and Semantics is a new series of research talks, highlighting the most exciting recent work in the international computer science logic community. The scope of the seminar series is roughly that of the major computer science logic conferences such as LICS, ICALP and FSCD. It takes place on most Wednesdays, with a focus every other week on the work of young researchers.

In this time of restricted international travel, a key aim of this series is to provide a forum for the informal discussion and social interaction that is so important for the progress of science. To facilitate this, the seminar incorporates in virtual form a number of features more normally associated with physical meetings.

For more details, please visit the homepage. There is no need to register for the talks, just show up.

Organizers: Alexandra Silva, Pawel Sobocinski, Jamie Vicary, Nathanaël Fijalkow, Charles Grellois, S. Krishna, Koko Muroya
Curator: Alakh Dhruv Chopra*
*contact for this listing

Export talk to