* External authors




Expected Value of Communication for Planning in Ad Hoc Teamwork

William Macke*

Reuth Mirsky*

Peter Stone

* External authors




A desirable goal for autonomous agents is to be able to coordinate on the fly with previously unknown teammates. Known as "ad hoc teamwork", enabling such a capability has been receiving increasing attention in the research community. One of the central challenges in ad hoc teamwork is quickly recognizing the current plans of other agents and planning accordingly. In this paper, we focus on the scenario in which teammates can communicate with one another, but only at a cost. Thus, they must carefully balance plan recognition based on observations vs. that based on communication. This paper proposes a new metric for evaluating how similar are two policies that a teammate may be following - the Expected Divergence Point (EDP). We then present a novel planning algorithm for ad hoc teamwork, determining which query to ask and planning accordingly. We demonstrate the effectiveness of this algorithm in a range of increasingly general communication in ad hoc teamwork problems.

Related Publications

Metric Residual Networks for Sample Efficient Goal-Conditioned Reinforcement Learning

AAAI, 2023
Bo Liu*, Yihao Feng*, Qiang Liu*, Peter Stone

Goal-conditioned reinforcement learning (GCRL) has a wide range of potential real-world applications, including manipulation and navigation problems in robotics. Especially in such robotics tasks, sample efficiency is of the utmost importance for GCRL since, by default, the …

The Perils of Trial-and-Error Reward Design: Misdesign through Overfitting and Invalid Task Specifications

AAAI, 2023
Serena Booth*, W. Bradley Knox*, Julie Shah*, Scott Niekum*, Peter Stone, Alessandro Allievi*

In reinforcement learning (RL), a reward function that aligns exactly with a task's true performance metric is often sparse. For example, a true task metric might encode a reward of 1 upon success and 0 otherwise. These sparse task metrics can be hard to learn from, so in pr…

DM2: Distributed Multi-Agent Reinforcement Learning via Distribution Matching

AAAI, 2023
Caroline Wang*, Ishan Durugkar*, Elad Liebman*, Peter Stone

Current approaches to multi-agent cooperation rely heavily on centralized mechanisms or explicit communication protocols to ensure convergence. This paper studies the problem of distributed multi-agent learning without resorting to centralized components or explicit communic…

  • HOME
  • Publications
  • Expected Value of Communication for Planning in Ad Hoc Teamwork


Shape the Future of AI with Sony AI

We want to hear from those of you who have a strong desire
to shape the future of AI.