Programmatic reinforcement learning

Verma, Abhinav

Access full-text files

VERMA-DISSERTATION-2021.pdf (4.18 MB)

Date

2021-08

Authors

Verma, Abhinav

Abstract

Programmatic Reinforcement Learning is the study of learning algorithms that can leverage partial symbolic knowledge provided in expressive high-level domain-specific languages. The aim of such algorithms is to learn agents that are reliable, secure, and transparent. That is, such agents can be expected to learn desirable behaviors from limited data while provably maintaining some essential correctness invariant and providing insights into their decision mechanisms that humans can understand. In contrast to the popular Deep Reinforcement Learning paradigm, where the learnt policy is represented by a neural network, programmatic representations are more easily interpreted and more amenable to verification by scalable symbolic methods. The interpretability and verifiability of these policies provide the opportunity to deploy reinforcement-learning-based solutions in safety-critical environments. In this dissertation, we formalize the concept of Programmatic Reinforcement Learning and introduce algorithms that integrate policy learning with principled mechanisms for incorporating domain knowledge. An analysis of the presented algorithms demonstrates that they possess robust theoretical guarantees and are capable of impressive performance in challenging reinforcement learning environments.