Exploration at Scale

By Benjamin Van Roy (Stanford University)

Talk Abstract: In producing today’s generative AIs, interactive learning primarily serves to filter what is learned from massive data corpora, encouraging responses that imitate more desirable content. Though AIs are pretrained on trillions of data bytes, even a hundred thousand bits of human feedback greatly improve behavior. And this is in spite of the interactive data being gathered without the benefit of sophisticated exploration algorithms. It is conceivable that more efficient exploration coupled with the rapidly growing volume of human interaction will enable superhuman creativity. I will discuss recent results from applying uncertainty estimation and exploration algorithms in training generative AIs, as well as foundational work that supports these methodologies.

Speaker Bio: Benjamin Van Roy is a Professor at Stanford University, where he has served on the faculty since 1998. His current research focuses on reinforcement learning. Beyond academia, he leads a DeepMind Research team in Mountain View, and has also led research programs at Unica (acquired by IBM), Enuvis (acquired by SiRF), and Morgan Stanley. He received the SB in Computer Science and Engineering and the SM and PhD in Electrical Engineering and Computer Science, all from MIT, where his doctoral research was advised by John N. Tsitsiklis. He is a Fellow of INFORMS and IEEE and has served on the editorial boards of Machine Learning; Mathematics of Operations Research, for which he edited the Learning Theory Area; Operations Research, for which he edited the Financial Engineering Area; and the INFORMS Journal on Optimization. He has been a recipient of the MIT George C. Newton Undergraduate Laboratory Project Award, the MIT Morris J. Levin Memorial Master’s Thesis Award, the MIT George M. Sprowls Doctoral Dissertation Award, the National Science Foundation CAREER Award, the Stanford Tau Beta Pi Award for Excellence in Undergraduate Teaching, the Management Science and Engineering Department’s Graduate Teaching Award, and the Frederick W. Lanchester Prize.