Peter Stone

Profile

Peter is the Executive Director of Sony AI America. He is also the founder and director of the Learning Agents Research Group (LARG) within the Artificial Intelligence Laboratory in the Department of Computer Science at The University of Texas at Austin, as well as associate department chair and Director of Texas Robotics. In 2013 he was awarded the University of Texas System Regents' Outstanding Teaching Award and in 2014 he was inducted into the UT Austin Academy of Distinguished Teachers, earning him the title of University Distinguished Teaching Professor. Professor Stone's research interests in Artificial Intelligence include machine learning (especially reinforcement learning), multiagent systems, and robotics.

Professor Stone received his Ph.D. in Computer Science in 1998 from Carnegie Mellon University. From 1999 to 2002 he was a Senior Technical Staff Member in the Artificial Intelligence Principles Research Department at AT&T Labs - Research. He is an Alfred P. Sloan Research Fellow, Guggenheim Fellow, AAAI Fellow, IEEE Fellow, AAAS Fellow, ACM Fellow, Fulbright Scholar, and 2004 ONR Young Investigator. In 2007 he received the prestigious IJCAI Computers and Thought Award, given biennially to the top AI researcher under the age of 35, and in 2016 he was awarded the ACM/SIGAI Autonomous Agents Research Award.

Publications

A Domain-Agnostic Approach for Characterization of Lifelong Learning Systems

Neural Networks, 2023
Megan M. Baker*, Alexander New*, Mario Aguilar-Simon*, Ziad Al-Halah*, Sébastien M. R. Arnold*, Ese Ben-Iwhiwhu*, Andrew P. Brna*, Ethan Brooks*, Ryan C. Brown*, Zachary Daniels*, Anurag Daram*, Fabien Delattre*, Ryan Dellana*, Eric Eaton*, Haotian Fu*, Kristen Grauman*, Jesse Hostetler*, Shariq Iqbal*, Cassandra Kent*, Nicholas Ketz*, Soheil Kolouri*, George Konidaris*, Dhireesha Kudithipudi*, Seungwon Lee*, Michael L. Littman*, Sandeep Madireddy*, Jorge A. Mendez*, Eric Q. Nguyen*, Christine D. Piatko*, Praveen K. Pilly*, Aswin Raghavan*, Abrar Rahman*, Santhosh Kumar Ramakrishnan*, Neale Ratzlaff*, Andrea Soltoggio*, Peter Stone, Indranil Sur*, Zhipeng Tang*, Saket Tiwari*, Kyle Vedder*, Felix Wang*, Zifan Xu*, Angel Yanguas-Gil*, Harel Yedidsion*, Shangqun Yu*, Gautam K. Vallabha*

Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to “real world” events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original training context, and syst…

Reward (Mis)design for Autonomous Driving

Artificial Intelligence, 2023
W. Bradley Knox*, Alessandro Allievi*, Holger Banzhaf*, Felix Schmitt*, Peter Stone

This article considers the problem of diagnosing certain common errors in reward design. Its insights are also applicable to the design of cost functions and performance metrics more generally. To diagnose common errors, we develop 8 simple sanity checks for identifying flaw…

Metric Residual Networks for Sample Efficient Goal-Conditioned Reinforcement Learning

AAAI, 2023
Bo Liu*, Yihao Feng*, Qiang Liu*, Peter Stone

Goal-conditioned reinforcement learning (GCRL) has a wide range of potential real-world applications, including manipulation and navigation problems in robotics. Especially in such robotics tasks, sample efficiency is of the utmost importance for GCRL since, by default, the …

The Perils of Trial-and-Error Reward Design: Misdesign through Overfitting and Invalid Task Specifications

AAAI, 2023
Serena Booth*, W. Bradley Knox*, Julie Shah*, Scott Niekum*, Peter Stone, Alessandro Allievi*

In reinforcement learning (RL), a reward function that aligns exactly with a task's true performance metric is often sparse. For example, a true task metric might encode a reward of 1 upon success and 0 otherwise. These sparse task metrics can be hard to learn from, so in pr…

DM2: Distributed Multi-Agent Reinforcement Learning via Distribution Matching

AAAI, 2023
Caroline Wang*, Ishan Durugkar*, Elad Liebman*, Peter Stone

Current approaches to multi-agent cooperation rely heavily on centralized mechanisms or explicit communication protocols to ensure convergence. This paper studies the problem of distributed multi-agent learning without resorting to centralized components or explicit communic…

BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach

NeurIPS, 2022
Bo Liu*, Mao Ye*, Stephen Wright*, Peter Stone, Qiang Liu*

Bilevel optimization (BO) is useful for solving a variety of important machine learning problems including but not limited to hyperparameter optimization, meta-learning, continual learning, and reinforcement learning. Conventional BO methods need to differentiate through the…

Value Function Decomposition for Iterative Design of Reinforcement Learning Agents

NeurIPS, 2022
James MacGlashan, Evan Archer, Alisa Devlic, Takuma Seno, Craig Sherstan, Peter R. Wurman, Peter Stone

Designing reinforcement learning (RL) agents is typically a difficult process that requires numerous design iterations. Learning can fail for a multitude of reasons and standard RL methods provide too few tools to provide insight into the exact cause. In this paper, we show …

Quantifying Changes in Kinematic Behavior of a Human-Exoskeleton Interactive System

IROS, 2022
Keya Ghonasgi*, Reuth Mirsky*, Adrian M Haith*, Peter Stone, Ashish D Deshpande*

While human-robot interaction studies are becoming more common, quantification of the effects of repeated interaction with an exoskeleton remains unexplored. We draw upon existing literature in human skill assessment and present extrinsic and intrinsic performance metrics t…

Dynamic Sparse Training for Deep Reinforcement Learning

IJCAI, 2022
Ghada Sokar, Elena Mocanu, Decebal Constantin Mocanu, Mykola Pechenizkiy, Peter Stone

Deep reinforcement learning (DRL) agents are trained through trial-and-error interactions with the environment. This leads to a long training time for dense neural networks to achieve good performance. Hence, prohibitive computation and memory resources are consumed. Recentl…

Outracing Champion Gran Turismo Drivers with Deep Reinforcement Learning

Nature, 2022
Pete Wurman, Samuel Barrett, Kenta Kawamoto, James MacGlashan, Kaushik Subramanian, Thomas J. Walsh, Roberto Capobianco, Alisa Devlic, Franziska Eckert, Florian Fuchs, Leilani Gilpin, Piyush Khandelwal, Varun Kompella, Hao Chih Lin, Patrick MacAlpine, Declan Oller, Takuma Seno, Craig Sherstan, Michael D. Thomure, Houmehr Aghabozorgi, Leon Barrett, Rory Douglas, Dion Whitehead Amago, Peter Dürr, Peter Stone, Michael Spranger, Hiroaki Kitano

Many potential applications of artificial intelligence involve making real-time decisions in physical systems while interacting with humans. Automobile racing represents an extreme example of these conditions; drivers must execute complex tactical manoeuvres to pass or block…

Jointly Improving Parsing and Perception for Natural Language Commands through Human-Robot Dialog

IJCAI / JAIR, 2021
Jesse Thomason*, Aishwarya Padmakumar*, Jivko Sinapov*, Nick Walker*, Yuqian Jiang*, Harel Yedidsion*, Justin Hart*, Peter Stone, Raymond J. Mooney*

In this work, we present methods for using human-robot dialog to improve language understanding for a mobile robot agent. The agent parses natural language to underlying semantic meanings and uses robotic sensors to create multi-modal models of perceptual concepts like red a…

Agent-Based Markov Modeling for Improved COVID-19 Mitigation Policies

JAIR, 2021
Roberto Capobianco, Varun Kompella, James Ault*, Guni Sharon*, Stacy Jong*, Spencer Fox*, Lauren Meyers*, Pete Wurman, Peter Stone

The year 2020 saw the COVID-19 virus lead to one of the worst global pandemics in history. As a result, governments around the world have been faced with the challenge of protecting public health while keeping the economy running to the greatest extent possible. Epidemiologi…

Efficient Real-Time Inference in Temporal Convolution Networks

ICRA, 2021
Piyush Khandelwal, James MacGlashan, Pete Wurman, Peter Stone

It has been recently demonstrated that Temporal Convolution Networks (TCNs) provide state-of-the-art results in many problem domains where the input data is a time-series. TCNs typically incorporate information from a long history of inputs (the receptive field) into a singl…

Multiagent Epidemiologic Inference through Realtime Contact Tracing

AAMAS, 2021
Guni Sharon*, James Ault*, Peter Stone, Varun Kompella, Roberto Capobianco

This paper addresses an epidemiologic inference problem where, given realtime observation of test results, presence of symptoms, and physical contacts, the most likely infected individuals need to be inferred. The inference problem is modeled as a hidden Markov model where inf…

Expected Value of Communication for Planning in Ad Hoc Teamwork

AAAI, 2021
William Macke*, Reuth Mirsky*, Peter Stone

A desirable goal for autonomous agents is to be able to coordinate on the fly with previously unknown teammates. Known as "ad hoc teamwork", enabling such a capability has been receiving increasing attention in the research community. One of the central challenges in ad hoc …

Temporal-Logic-Based Reward Shaping for Continuing Reinforcement Learning Tasks

AAAI, 2021
Yuqian Jiang*, Sudarshanan Bharadwaj*, Bo Wu*, Rishi Shah*, Ufuk Topcu*, Peter Stone

In continuing tasks, average-reward reinforcement learning may be a more appropriate problem formulation than the more common discounted reward formulation. As usual, learning an optimal policy in this setting typically requires a large amount of training experiences. Reward…

Goal Blending for Responsive Shared Autonomy in a Navigating Vehicle

AAAI, 2021
Yu-Sian Jiang*, Garrett Warnell*, Peter Stone

Human-robot shared autonomy techniques for vehicle navigation hold promise for reducing a human driver's workload, ensuring safety, and improving navigation efficiency. However, because typical techniques achieve these improvements by effectively removing human control at cr…

A Penny for Your Thoughts: The Value of Communication in Ad Hoc Teamwork

IJCAI, 2021
Reuth Mirsky*, William Macke*, Andy Wang*, Harel Yedidsion*, Peter Stone

In ad hoc teamwork, multiple agents need to collaborate without having knowledge about their teammates or their plans a priori. A common assumption in this research area is that the agents cannot communicate. However, just as two random people may speak the same language, au…

Balancing Individual Preferences and Shared Objectives in Multiagent Reinforcement Learning

IJCAI, 2021
Ishan Durugkar*, Elad Liebman*, Peter Stone

In multiagent reinforcement learning scenarios, it is often the case that independent agents must jointly learn to perform a cooperative task. This paper focuses on such a scenario in which agents have individual preferences regarding how to accomplish the shared task. We co…

Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks

NeurIPS, 2020
Lemeng Wu*, Bo Liu*, Peter Stone, Qiang Liu*

We propose firefly neural architecture descent, a general framework for progressively and dynamically growing neural networks to jointly optimize the networks' parameters and architectures. Our method works in a steepest descent fashion, which iteratively finds the best netw…

An Imitation from Observation Approach to Transfer Learning with Dynamics Mismatch

NeurIPS, 2020
Siddharth Desai*, Ishan Durugkar*, Haresh Karnan*, Garrett Warnell*, Josiah Hanna*, Peter Stone

We examine the problem of transferring a policy learned in a source environment to a target environment with different dynamics, particularly in the case where it is critical to reduce the amount of interaction with the target environment during learning. This problem is par…

Reinforcement Learning for Optimization of COVID-19 Mitigation Policies

AAAI AI for Social Good, 2020
Varun Kompella, Roberto Capobianco, Stacy Jong*, Jonathan Browne*, Spencer Fox*, Lauren Meyers*, Pete Wurman, Peter Stone

The year 2020 has seen the COVID-19 virus lead to one of the worst global pandemics in history. As a result, governments around the world are faced with the challenge of protecting public health, while keeping the economy running to the greatest extent possible. Epidemiologi…

Blog

June 17, 2021 | Sony AI

RoboCup and Its Role in the History and Future of AI

As I write this blog post, we're a few days away from the opening of the 2021 RoboCup Competitions and Symposium. Running from June 22nd-28th, this event brings together AI and robotics researchers and learners from around the wo…

March 3, 2021 | Sony AI

The Challenge to Create a Pandemic Simulator

The thing I like most about working at Sony AI is the quality of the projects we're working on, both for their scientific challenges and for their potential for improving the world. What could be more exciting than magnifying hu…

News

February 4, 2021 | Press Release

Sony AI’s Dr. Peter Stone Named Fellow by the Association for Computing Machiner…

Tokyo, Japan – February 9, 2021 – Dr. Peter Stone, Executive Director, Sony AI America Inc., has been named a Fellow by the Association for Computing Machinery (ACM), the premier …

JOIN US

Shape the Future of AI with Sony AI

We want to hear from those of you who have a strong desire to shape the future of AI.