Theoretical Foundations of Safety-Critical Reinforcement Learning

Zoom link: https://cuboulder.zoom.us/j/190280621

ABSTRACT: Reinforcement learning (RL) is an optimization-based approach to problem-solving in unknown and uncertain environments, where learning agents rely on scalar reward signals to discover optimal solutions. Frequently, non-experts must develop the requirements and translate them into rewards under significant time pressure, even though manual translation is time-consuming and error-prone. Safety-critical applications of reinforcement learning need a rigorous design methodology and, in particular, a principled approach to requirement specification and to the translation of objectives into the form required by reinforcement learning algorithms.

Formal logic provides a foundation for the rigorous and unambiguous specification of learning objectives. However, reinforcement learning algorithms need requirements to be expressed as scalar reward signals. In this talk, I will present a recent technique, called limit-reachability, that bridges this gap by faithfully translating logic-based requirements into the scalar reward form needed in model-free reinforcement learning. This technique enables the synthesis of controllers that maximize the probability of satisfying given logical requirements using off-the-shelf, model-free reinforcement learning algorithms. I will also summarize the progress made by my group on developing principled methodologies and powerful tools to help programmers design and analyze safety- and security-critical software systems.

BIO: Ashutosh Trivedi is an assistant professor of computer science at the University of Colorado Boulder. His research interests lie at the intersection of theoretical computer science, machine learning, and control theory. His current research focuses on developing and applying rigorous mathematical reasoning techniques to design and analyze learning-enabled safety-critical systems. He received his doctorate in computer science with a focus on optimization and game theory from the University of Warwick. Before joining the University of Colorado Boulder, Ashutosh spent two years as an assistant professor of computer science at the Indian Institute of Technology Bombay. Earlier, he was a research fellow at the University of Pennsylvania and the University of Oxford, developing algorithmic foundations of verification and synthesis of probabilistic, real-time, and hybrid dynamical systems.

