Peeking Under the Hood of GT Sophy
A technical blog series on how Sony AI trained an AI agent to outrace the world’s best Gran Turismo players
June 14, 2022
In early 2020, the incredible research and engineering team at Sony AI set out to do something that had never been done before: create an AI agent that could beat the best drivers in the world at the PlayStation® 4 game Gran Turismo™ Sport, the real driving simulator developed by Polyphony Digital. In 2021, we succeeded with Gran Turismo Sophy™ (GT Sophy). This post is the introduction to a series exploring the technical accomplishments that made GT Sophy possible.
Background

But first, a little history. The Gran Turismo (GT) Game AI project actually began around 2016 when Kenta Kawamoto, then a researcher in Sony’s R&D center, started exploring the idea of using this particular game as a testbed for investigating reinforcement learning. The project began small, with much of the early work involving coordination with Polyphony Digital to build an API between GT and an agent. Two projects with interns and researchers at the University of Zurich furthered the exploration and produced the first papers, one achieving superhuman time trial performance on one race track, and a second demonstrating rudimentary passing skills. These early explorations were done using a handful of PlayStations on local networks.
With this groundwork in place, when Sony AI was formed in early 2020, we set our sights on beating expert human drivers in head-to-head racing with our initial team of a dozen world-class engineers and researchers. Our first task was to build the infrastructure that would allow us to execute the research at a much larger scale than those earlier projects. The system we built, called Dart (Distributed Asynchronous Rollouts and Training), allowed us to access Sony Interactive Entertainment’s PlayStation in the Cloud resources. This unlocked access to more than 1,000 PlayStation 4 consoles, allowing us to explore multiple ideas in parallel. Once Dart came online in January 2021, the research started in earnest—and by then our team had doubled in size.
When Sony AI was formed in early 2020, we set our sights on beating expert human drivers in head-to-head racing.
Why Is GT Sophy Significant?

Our mission to build an agent for Gran Turismo was inspired by the success of other AI researchers in board games like chess and Go, and in video games. Although computers have only recently started beating the best humans in these games, the chain of research in computer game playing goes all the way back to the invention of computers; in 1948, Alan Turing and David Champernowne proposed one of the first chess programs, though they didn’t have enough computing power to make it practical.
However, there are some notable differences between the GT Sophy project and other recent AI milestones. Gran Turismo is billed as the “real driving simulator” and has an extremely realistic physics engine that captures a lot of the dynamics of real vehicles. Thus, as a domain, simulated racing is closer to a real-world application than other recent AI achievements. Unlike playing Go, driving is a skill most adults have mastered. Automobile racing is the extreme version of that common task, but we can all appreciate the skill it must take to drive at 200 mph (about 300 km/h) wheel-to-wheel with an opponent.
Perhaps the most distinguishing feature of racing compared to these other AI accomplishments is sportsmanship. Unlike these other domains, the rules of racing are not fully enforced by the game itself; in elite racing, including Gran Turismo, there is a judge—called a steward—who evaluates the on-track action and has the power to hand out penalties when someone violates racing etiquette. Capturing this loosely defined concept was particularly challenging for our research team. The fact that competitors physically interact with each other in ways that require judges is one of the distinguishing characteristics that separates sports from games. It was not easy to find a configuration in which GT Sophy was confident enough to hold its driving line and also be respectful of its opponents.
GT Sophy is the first real AI that we could call a good sport.
Lessons Learned

While the team knew a lot about reinforcement learning, we were mostly novices when it came to automobile racing. Frequently in machine learning, researchers prefer to treat the task as a black box and let the learning algorithm figure out how to solve it. That’s fine for toy domains. But one lesson I’ve learned in my career applying AI to real-world problems is the importance of having a deep understanding of the problem domain. In the case of controlling robots in a warehouse, you really need to understand that your goal is to make an efficient warehouse, not a cool robot. In the case of racing, we needed to internalize that there is a lot more to winning a race than being the fastest (though that certainly helps!). Thus, we spent a lot of time learning about automobile racing by playing Gran Turismo, reading scientific articles on other research into autonomous racing, and watching videos of top-level esports players competing in events like the FIA GT Championships Nations Cup and Manufacturer Series events.
A second important lesson was that, although this was a research project, we couldn’t ignore the engineering. Gran Turismo runs in real time independently of the agent. We spent considerable effort making sure that the communication between the game and the agent was as reliable and as fast as possible. Even so, it was a challenge to build a distributed system that could achieve a 10 Hz communication cycle 99.9 percent of the time. In addition, because communication was asynchronous and not perfectly reliable, we had to adapt the algorithms to handle streams of data that may include missing observations or unexpected delays between the agent’s actions.
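This post doesn’t show the agent’s actual loop, but the core idea of a fixed-rate control cycle that tolerates missing observations can be sketched. Everything below is hypothetical illustration (the names, the fallback-to-last-observation strategy, the queue-based transport), not Sony AI’s implementation:

```python
import queue

def control_loop(obs_queue, policy, steps, period=0.1):
    """Run a fixed-rate control cycle (period=0.1 s ~ 10 Hz).

    If no fresh observation arrives within one period, the agent falls
    back to the most recent observation instead of stalling, so it keeps
    issuing actions even when the stream drops a frame.
    """
    last_obs = None
    actions = []
    for _ in range(steps):
        try:
            # Wait at most one cycle for a new observation from the game.
            last_obs = obs_queue.get(timeout=period)
        except queue.Empty:
            # Missing observation: reuse the last one we saw.
            pass
        if last_obs is not None:
            actions.append(policy(last_obs))
    return actions

# Toy usage: only 4 observations arrive, but the loop runs 6 cycles.
q = queue.Queue()
for i in range(4):
    q.put({"speed": 100 + i})
acts = control_loop(q, policy=lambda obs: obs["speed"], steps=6)
# The final two cycles time out and repeat the last observation's action.
```

In a real system the fallback would more likely extrapolate the state forward rather than freeze it, but the structure is the same: the loop's rate is fixed by the game, not by the arrival of data.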
Finally, applying good science is critical, particularly with black-box techniques like deep reinforcement learning. I often remind my team that just because something seems to work doesn’t mean that it is correct. Similarly, just because something isn’t working doesn’t mean it can’t be made to work. Understanding why something is or isn’t working is just as important as, or more important than, getting a good result. For a researcher, being their own worst critic is an invaluable skill. As for the GT Sophy project, we are still on this journey. We’ve tried many different approaches over the course of the project, and while there are some things we feel we understand well, there are still others that we are continuing to investigate.
Blog Series Overview

As we continue our journey, we wanted to use this opportunity to share what we have learned so far with the wider community in the format of a technical blog series. In these posts, we will cover four topics in detail. First, we will explain how GT Sophy achieved superhuman time trial performance. The second post will focus on the task of teaching GT Sophy tactical skills. The third will tackle the tricky topic of sportsmanship. And finally, in the last post, we will describe what we learned in the first race in July and how we addressed GT Sophy’s weaknesses to build a winning agent for the October rematch.
We hope you enjoy these posts as much as we enjoyed preparing them!