
Abstract: Reinforcement Learning (RL) is an optimization-based approach to problem-solving in which learning agents rely on scalar reward signals to discover optimal solutions. The recent successes of RL have demonstrated its potential as a viable alternative to "human" programming. Observing these success stories closely, however, it is evident that deploying RL requires significant expertise: designing a suitable approximation architecture, encoding the environment in the "flat" representation that architecture demands, and specifying objectives in the language of scalar rewards. This rigid interface between programmers and RL algorithms, built from feature construction, manual approximation, and reward engineering, is cumbersome and error-prone. The resulting lack of usability and trust raises barriers to entry in this promising field. My group is working towards democratizing reinforcement learning by developing principled methodologies and powerful tools to improve the usability and trustworthiness of RL-based programming at scale.

These low-level interactions between programmers and RL algorithms are akin to programming systems in a low-level assembly language. I envision a programmatic approach to RL in which programmers interact with RL algorithms by writing programs in a high-level programming language that expresses the simulation environment, the choices available to the learning agent, and the learning objectives, while an underlying “interpreter” frees the programmer from the burden of feature construction and approximation heuristics demanded by state-of-the-art RL algorithms. We dub this setting high-level programmatic reinforcement learning (programmatic RL for short).

To realize the promise of improved usability of programmatic RL, we need RL algorithms capable of efficiently handling rich programmatic features (functional recursion and recursive data structures) and complex dynamical models (governed by ordinary differential equations) while guaranteeing convergence to the optimal value. To enable transparent and trustworthy RL, we need translation schemes to compile learning requirements expressed in high-level languages to scalar reward signals. In this talk, I will summarize our efforts and breakthroughs towards a framework for programmatic RL capable of reasoning with formal requirements, real-time constraints, and recursive environments.

Bio: Ashutosh Trivedi received his B.Eng. in Computer Science from NIT Nagpur in 2000, his M.Tech. in Electrical Engineering from IIT Bombay in 2003, and his Ph.D. in Computer Science from the University of Warwick in 2009. He was a postdoctoral researcher at the University of Oxford and the University of Pennsylvania between 2009 and 2012, and an assistant professor of Computer Science at IIT Bombay from 2013 to 2015. He joined the University of Colorado Boulder in 2015, where he is currently an assistant professor of Computer Science and a member of the Programming Languages and Verification (CUPLV) group. His research interests include formal methods, optimization, and game theory, with applications in trustworthy AI, cyber-physical systems, software security, and fairness in AI. He is a recipient of an NSF CAREER award, two AFRL fellowships, and a Liverpool-India fellowship.

 

https://cuboulder.zoom.us/j/190280621

 

