There is something almost poetic about imagining an artificial intelligence learning to pilot a racing car. Not in a “turn left, accelerate, brake” kind of way, but for real. As if it were enrolled in some invisible racing academy, repeating corners, making mistakes, closing the door under braking, being too clever in a chicane, and receiving a mathematical slap on the wrist every time it does something reckless.
For years, that was precisely the central problem facing GT Sophy, Sony AI’s artificial intelligence for Gran Turismo. It was not enough to tell it: “be fast, but don’t be reckless.” That simple, perfectly reasonable human phrase had to be translated into a mathematical function packed with rewards, penalties, coefficients, and conditions. Something like explaining to a child that they can have a cookie if they behave, but defining “behaving” with a physics formula, three penalties for side contact, a punishment for going off-track, and a bonus for holding the racing line as if a trackside engineer were whispering in their ear.
The difference is that now, according to Sony AI’s research on Automated Reward Design for Gran Turismo, that work no longer always has to be done by hand. A language model can now receive a plain instruction, written the way we actually talk, and convert it into reward code for training a virtual racing driver.
Put simply: ChatGPT can now help teach GT Sophy to race.
And no, this does not mean you can open Gran Turismo 7 tomorrow, type “Sophy, drive like Fernando Alonso in angry Sunday mode,” and watch the AI obey in real time. Sadly not. But it does mean we are witnessing the beginning of something significant.
The Old Problem
In racing games, AI has always had a very human problem: it is either too dumb, too perfect, or apparently driving with the help of some dark power.
Sometimes it brakes at the wrong moment. Sometimes it hits you as if it has a personal grudge. Sometimes it catches you on a straight with suspicious speed, as if the car were hiding a rocket engine in the boot. For decades, many games solved this with tricks. The notorious rubber-banding, for example: if you pull too far ahead, the AI magically closes the gap; if you fall far behind, it gets a little clumsy so you do not turn off the console in tears. It works, yes, but it is also noticeable. And when it becomes noticeable, the illusion breaks.
GT Sophy was built for something else entirely. Sony AI wanted an AI that did not cheat, did not pretend, did not rubber-band its way to your bumper on cue. It wanted an AI that genuinely learned to race within a complex simulation. One that understood racing lines, braking zones, overtaking moves, defensive driving, respect for rivals, and race pace.

But this is where the monster under the bed appears: reward design. In reinforcement learning, an AI learns because it receives signals. Do something good, get a reward. Do something bad, take a penalty. That sounds simple enough, until you try to define “good” and “bad” in a race. Winning is good, obviously. But winning by smashing a rival into a wall should not be. Overtaking is good. But launching from three postal codes back and using the car ahead as a brake should not count as a brilliant move. Going fast is good. But running off-track, cutting a corner, and rejoining as if nothing happened should not be the path either.
There lies the dilemma. The AI does not learn what we want. It learns what we reward.
And if the reward is poorly designed, the AI will find the loophole. Always. Because a reinforcement learning AI has no shame. It does not think: “that looks bad.” It thinks: “that maximizes my reward.” If the formula lets it win by crashing, it will crash. If it can save time by doing something absurd, it will do it. If it could lift a wheel, look at the camera, and say “you never technically forbade this,” it probably would.
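A toy example makes the loophole concrete. Both reward functions below are invented for illustration, not taken from GT Sophy; the point is only that the same dirty overtake scores as a triumph under one formula and as a disaster under the other.

```python
# Toy illustration of reward hacking. Both reward functions and all
# numbers are invented; the point is that the same dirty overtake is
# a triumph under one formula and a disaster under the other.

def naive_reward(progress_m, contact):
    # Pays for progress, never charges for contact.
    return progress_m

def guarded_reward(progress_m, contact):
    # Same progress term, but contact costs more than it can ever gain.
    return progress_m - (50.0 if contact else 0.0)

dirty = {"progress_m": 12.0, "contact": True}   # punt the rival, gain ground
clean = {"progress_m": 8.0, "contact": False}   # overtake properly, gain less

# Under the naive reward the dirty move is the "better" one; the agent
# will find this loophole, because it optimizes the formula, not the intent.
assert naive_reward(**dirty) > naive_reward(**clean)
assert guarded_reward(**dirty) < guarded_reward(**clean)
```

The agent never sees our intent, only the numbers; whichever move the formula scores higher is, by definition, the move it will learn.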
Humans Tweaking Numbers for Hours on End
The original GT Sophy was a remarkable achievement. But behind that achievement lay an enormous amount of human labor.
Engineers and designers had to build the reward function by hand. That means writing the full set of mathematical rules that told the AI what to value. More speed, good. Less contact, good. Going off-track, bad. Holding a clean line, good. Being aggressive but not suicidal, complicated. The problem was not writing a function. The problem was writing the right function.
A reward that was too lenient on contact and Sophy could turn into that friend at the karting track who says “I’ve got this” right before sending you into the barriers. A reward that was too strict and Sophy might end up driving as if it were transporting a priceless vase on the passenger seat. The balance had to be found. Fast, but clean. Competitive, but sporting. Aggressive, but not unhinged.
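To make the tuning problem concrete, here is a minimal sketch of what a hand-built per-step reward can look like. Every signal name and coefficient below is invented; the real GT Sophy reward is far richer, but the shape of the problem is the same: a handful of knobs whose balance decides whether the agent becomes a racer or a vase courier.

```python
# Hypothetical hand-built per-step reward. All signals and coefficients
# are invented for illustration; tuning these knobs is exactly the slow,
# artisanal work the article describes.
from dataclasses import dataclass

@dataclass
class StepSignals:
    speed_kmh: float      # current car speed
    off_track: bool       # any wheel beyond track limits this step
    contact: bool         # touched another car this step
    line_error_m: float   # distance from the target racing line

def handmade_reward(s: StepSignals,
                    w_speed=0.01, p_off=2.0, p_contact=5.0, w_line=0.5):
    """Per-step reward: every coefficient is a knob a human must tune."""
    r = w_speed * s.speed_kmh          # faster is better...
    r -= w_line * s.line_error_m       # ...but stay near the line
    if s.off_track:
        r -= p_off                     # punished for leaving the track
    if s.contact:
        r -= p_contact                 # punished harder for contact
    return r

# Fast and clean must beat fast and dirty, or the agent learns to ram:
clean = handmade_reward(StepSignals(250, False, False, 0.3))
dirty = handmade_reward(StepSignals(250, False, True, 0.3))
assert clean > dirty
```

Shrink `p_contact` too far and the ramming line flips profitable; inflate it and the agent drives like it is guarding porcelain. Each change of a knob traditionally meant another multi-day training run to find out which.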
And finding that balance was enormously time-consuming. Coefficients were changed, training ran for days, results were analyzed, the code was adjusted, and the whole process repeated. Again and again. That is why this new approach matters so much. Sony AI is attacking the bottleneck that most limits this type of agent: translating human intent into a useful mathematical reward.
The idea behind Sony AI’s system is so compelling precisely because it starts with something very familiar: a sentence.
A person writes an instruction in natural language. Something like: “Win races while respecting motorsport rules and maintaining good sportsmanship.”
That phrase, which makes immediate sense to us, means nothing by itself to a driving AI. It has to be converted into numbers. Into conditions. Into code. That is where the language model comes in. The system uses an LLM capable of generating Python code to create a reward function. Instead of an engineer manually writing every component, the model proposes a structure: a reward for making progress, a penalty for collisions, sanctions for rule infractions, incentives for staying on track and racing cleanly.

That function is then tested by training an agent inside Gran Turismo. If the agent drives well, great. If it starts doing strange things, the system detects it. And here comes one of the most interesting parts: Sony AI does not limit itself to reviewing telemetry. It does not just ask “was it fast?” or “did it make contact with anyone?” It also uses models capable of visually analyzing behavior. In other words, an AI can watch the replay and evaluate whether what it sees resembles what was asked for.
It is a bit like having a race steward, an engineer, and a driving instructor all packed into a single machine. One watches the race, another reviews the behavior, and the third says: “right, that was quick, but that braking maneuver had less sportsmanship than an argument over the last slice of pizza.”
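The loop those three roles form can be sketched roughly as follows. Everything here is a toy stand-in, not Sony AI's actual code: in the real system the proposal step is an LLM generating Python reward code, training happens inside Gran Turismo, and evaluation combines telemetry with a vision model that watches the replays.

```python
# Toy sketch of an automated reward-design loop: propose reward code,
# train, evaluate, feed the critique back, repeat. All functions are
# invented stand-ins for the LLM, the simulator, and the evaluators.

def propose_reward_code(instruction, feedback, attempt):
    # Stand-in for the LLM call: would return Python source for a reward.
    return f"reward_v{attempt} for {instruction!r} (feedback: {feedback})"

def train_and_evaluate(reward_code, attempt):
    # Stand-in for RL training plus telemetry/vision scoring;
    # here we simply pretend each round improves the result.
    score = min(10, 3 + 2 * attempt)            # 5, 7, 9, ...
    critique = None if score >= 9 else "too many contacts"
    return score, critique

def design_reward(instruction, rounds=5, target=9):
    feedback, code = None, None
    for attempt in range(1, rounds + 1):
        code = propose_reward_code(instruction, feedback, attempt)
        score, feedback = train_and_evaluate(code, attempt)
        if score >= target:                     # good enough: keep it
            break
    return code

code = design_reward("win cleanly, respect motorsport rules")
assert "reward_v3" in code   # this toy scorer converges on round three
```

The expensive part in reality is the middle of the loop: each candidate reward means a full training run, which is why the iteration cannot yet happen live during a race.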
The most beautiful thing about this advance is not simply that the system can produce a competitive driver. That alone would be impressive. What is truly fascinating is that it can respond to unconventional instructions. For example, you can tell it to race in reverse. And the system generates a reward that pushes the agent to learn exactly that: moving quickly in reverse, controlling the car, and completing the circuit in a way no driving instructor would recommend unless they wanted to lose their license.
You can also tell it to drift as much as possible without breaking the rules. And the agent then learns to induce oversteer, sustain slides, and turn a clean race into something closer to a lateral control exhibition. This matters because it tells us that the system is not just searching for “the perfect driver.” It can generate behaviors. It can create styles. It can shape driving personalities. And this is where Gran Turismo could become much more interesting.
Because until now, even a very advanced AI tends to feel homogeneous. Very fast, very clean, very efficient. But real drivers are not like that. There are drivers who brake late. Drivers who manage their tyres. Drivers who crumble under pressure. Drivers who seem calm until you show them the inside of a corner and they suddenly become a nightclub bouncer.
Imagine a grid where each rival had its own identity. One aggressive at the start. Another conservative in the rain. One blindingly fast in clean air but clumsy in traffic. Another with impeccable defending, but losing pace when pressured over several laps.
That would no longer be simply “increasing the difficulty.” That would be racing against characters with recognizable behavior. The leap is not to a faster AI. It is to an AI with personality.
Sophy 3.0, the Power Pack, and Commercial Reality
While Sony AI investigates how to generate behaviors through natural language, Gran Turismo 7 continues to evolve on the commercial side.
Update 1.69 and the content associated with the Power Pack point in a clear direction: Polyphony Digital wants Sophy to be central to the single-player experience. Not as a curious extra, but as a genuinely new way to compete. The Power Pack leans into more complete race weekends, with practice, qualifying, and more realistic grids. Less artificial chasing from the back of the pack and more real racing. More tension. More of that feeling of a proper competition weekend.
In that context, GT Sophy 3.0 cannot just be “a fast bot.” It needs to be a credible rival. One that applies pressure without ruining your race. One that defends without becoming a wall on wheels. One that makes mistakes, but not so many as to seem like a human on a Monday morning before their first coffee.

The update also touches on physics and rival behavior, including adjustments to downshift protection, preventing players from using certain unrealistic braking techniques to gain an advantage. This is more relevant than it might seem. Because if you are going to put humans up against an AI trained with surgical precision, both need to be playing by similar physical rules. If the human can brake by hammering down through the gears as if jabbing at a broken lift button, the comparison loses all meaning.
The AI forces the game to take itself seriously. And for a series like Gran Turismo, that makes a great deal of sense.
The Ghost of B-Spec Enters the Room
There is an inevitable question: if an AI can learn to drive from plain instructions, could B-Spec return? B-Spec was the mode where you did not drive directly. You were more of a team manager. You gave orders, managed pace, watched the race from the outside, and trusted your virtual driver not to lose all dignity in the second corner. The idea of combining B-Spec with Sophy is genuinely tempting. Imagine it: “manage your tyres,” “push the leader,” “don’t risk it until the final lap,” “defend the inside,” “come into the pits if it starts raining.” That would be a far more natural team management mode. But there is an important technical obstacle here.
Sophy was not originally designed to follow whims in real time. It is trained to maximize a learned reward. It learns a way of driving, internalizes it, and acts accordingly. Telling it mid-race “now change your personality, please” is not that simple. That would be like telling a chess player: “from this point on, play aggressively, but also cautiously, and by the way, think about fuel saving.” They would look at you strangely. And they would be right.
The Automated Reward Design research brings that future much closer, but there is still distance to cover. Today the system can translate an instruction into a reward function and train an agent. But training still takes time. We are not yet talking about instant adaptation during a race. So yes, this advance smells of a modern B-Spec. But it is not yet real-time magic. The bridge already exists. It just needs to be fast enough to feel invisible.

It would be easy to think all of this only matters to people who play Gran Turismo. But the implications are larger.
Gran Turismo is a laboratory. A safe place where an AI can learn to make fast, physical, and strategic decisions without putting anyone in danger. It can brake late, fail, touch a wall, lose a race, and try again millions of times. Nobody gets hurt. At most, an algorithm’s ego takes a dent. That kind of environment is invaluable for artificial intelligence research.

And the problem Sony AI is tackling is not unique to racing. In robotics and autonomous vehicles, a similar challenge exists: how do you explain human concepts to a machine? Saying “drive safely” is not enough. What does safe mean? Being extremely cautious? Never pulling out at a roundabout if there is the slightest doubt? Waiting ten minutes while every human behind you begins developing philosophical theories about your existence?
Driving well is not just following rules. It is negotiating. It is interpreting. It is adapting to context. It is understanding that in some places driving is orderly, and in others it resembles a shouted conversation between honking, intuition, and survival.
That is why it matters that a model can visually evaluate behaviors and align them with human instructions. Because perhaps tomorrow we will not want to tell an autonomous car simply to “minimize risk,” but something far more subtle: “drive safely, but fluidly, in keeping with how people drive in this city.”
That is enormously difficult to program by hand. But it begins to be imaginable if we can translate human language into learning objectives.
Teaching Is Not Programming
The most interesting thing about all of this is that it changes our relationship with AI. Before, programming meant giving a machine exact instructions. Do this. If that happens, respond this way. If the value exceeds this threshold, apply this rule.
But with systems like GT Sophy, the process feels less like programming and more like teaching. You say what you want. You observe what it does. You correct. You adjust. You repeat. The AI learns within an environment. And now, with language and vision models, that process becomes even more like a conversation.
You are not just giving it code. You are giving it intent. You do not just tell it “maximize X.” You say: “I want you to win, but I don’t want you to be a villain in a helmet.” And that, however it sounds, is a profound shift. Because the future of many autonomous systems will depend on precisely this: our ability to express human values in a form that a machine can practice, fail at, correct, and ultimately understand in a way it can actually act on.
The short answer would be: it can help design the way it learns. But the more interesting answer is this. It can translate a human idea of driving into a reward that an AI can train on. And that is enormous.

It does not mean ChatGPT sits at the Nürburgring with a stopwatch, shouting at Sophy: “brake later, champion!” It means a language model can participate in one of the most delicate parts of reinforcement learning: defining what the agent should value. Until now, that task was slow, technical, and artisanal. Now it is beginning to be automated. And if it can be automated, the door opens to more varied rivals, more dynamic game modes, personalized training sessions, and perhaps, one day, a B-Spec where you can genuinely direct your driver as if you were standing on the pit wall.
The Future Is Not a Perfect AI, It Is an Interesting One
For a long time, video games have chased the perfect AI. One that never fails. One that calculates everything. One that makes the best decision in every millisecond.
But playing against perfection can be boring. Even frustrating. Nobody wants to always compete against a silicon god that nails every apex as if it were born inside the racing line. What we want is not just a fast AI. We want a believable AI. One that surprises us. That has style. That makes mistakes in a human way. That pressures us without ruining our race. That makes us think.
In that sense, Sony AI’s advance is not just about making Sophy win more races. It is about something deeper: making it possible for virtual rivals to be designed with human language. And that changes a great deal.
Because perhaps the future of Gran Turismo is not choosing between easy, medium, or hard difficulty. Perhaps it is choosing what kind of driver you want to race against. An aggressive one. An elegant one. A calculating one. One that manages tyres. One that drifts. One that races in reverse, because apparently someone in a laboratory decided it was a good idea and everyone went along with it.
And honestly, that sounds like a lot of fun.
The real news is that we are beginning to teach artificial intelligence something far harder than going fast: racing the way we want it to race.
See you on the track!