
Dec 24, 2025

The Overwhelming Autonomous Driving Leader Proven by Numbers: Waymo’s Full-Scale Takeoff

Ryunsu Sung

A rendering of the 6th-generation Waymo Driver on Hyundai’s all-electric IONIQ 5 SUV.

Misconceptions About End-to-End (E2E) AI Models

A representative misconception among those who claim that Tesla’s autonomous driving technology is superior to that of Waymo, Alphabet’s subsidiary, centers on the supposed presence or absence of an “end-to-end” (E2E) AI model.

Starting with FSD v12, Tesla has been gradually removing “rule-based” designs (e.g., “if the light is red, stop”) and shifting to a neural-net approach similar to language models like GPT or Gemini. Where the claim that Waymo still relies on “tens of thousands of lines of hard-coded rules” comes from is unclear, but it is wrong. Like Tesla, Waymo has adopted an E2E approach, and it is applying a more advanced, higher-level architecture.

Limits of Tesla’s Monolithic E2E Model: Autocomplete on the Road

From v12 onward, Tesla’s FSD model processes “video input → control output” within a single neural network. This is similar to human reflexes (“System 1”). The FSD model learns from Tesla’s vast trove of driving data and imitates the intuition of “this is usually what human drivers do in this situation.” This is exactly the basic principle of GPT-like LLMs: next-token prediction.

FSD is essentially the smartphone keyboard’s “autocomplete” feature transplanted onto the road. After training on the driving data of millions of drivers, it acts on statistical patterns such as “after this kind of curved-road image, the most common human behavior is to turn the steering wheel 15 degrees to the left.”
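To make the analogy concrete, here is a minimal sketch of a monolithic “video in, control out” policy in PyTorch. The architecture and layer sizes are invented for illustration and are not Tesla’s actual network; the point is that the mapping from pixels to control is one single, opaque function.

```python
# Minimal, hypothetical sketch of a monolithic end-to-end driving policy:
# camera frames go in, control values come out, nothing in between is exposed.
import torch
import torch.nn as nn

class MonolithicDrivingPolicy(nn.Module):
    """Single network mapping raw video directly to control values."""
    def __init__(self):
        super().__init__()
        # 3D-convolutional encoder compresses a short clip of camera frames.
        self.encoder = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=(3, 5, 5), stride=(1, 2, 2)),
            nn.ReLU(),
            nn.Conv3d(32, 64, kernel_size=(3, 5, 5), stride=(1, 2, 2)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
            nn.Flatten(),
        )
        # Regression head outputs [steering angle, acceleration] directly.
        self.head = nn.Linear(64, 2)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, channels=3, frames, height, width)
        return self.head(self.encoder(clip))

policy = MonolithicDrivingPolicy()
clip = torch.randn(1, 3, 8, 96, 96)        # one 8-frame camera clip
steering, accel = policy(clip)[0]          # the single "most probable" action
# Trained purely by imitation (e.g., MSE against logged human controls),
# the model offers no account of *why* it chose these values.
```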

The problem is that the “why” is missing. Just as a keyboard’s autocomplete doesn’t truly understand the meaning of the words when it suggests “nice to meet you” after “hello,” a single AI model does not understand the causal relationship of “a child is crossing, so I must stop.” It simply repeats the pattern of “when that kind of pixel cluster appears, stop.”

When ChatGPT hallucinates and produces strange sentences or lies, users have no way of knowing why. In the same way, even if Tesla FSD causes an accident due to incorrect control in the “video input → control output” process, it is difficult for developers to figure out the reason.

“The Great Divergence: How Tesla's FSD v12 is Reshaping the Autonomous Driving Landscape” (Skywork AI, 2024). Link

Waymo’s Structured E2E Model: An Agentic Driving AI

Source: Waymo

Waymo’s EMMA (End-to-End Multimodal Model for Autonomous Driving) architecture is also E2E in the sense that the entire “data input → control output” pipeline is handled within a neural network. However, it structures the reasoning and VLM (Vision Language Model) components so that it can explain the decision-making process behind the control outputs.

Waymo explains the encoder and VLM at the front of the stack using the System 1 / System 2 framework popularized by psychologist Daniel Kahneman in Thinking, Fast and Slow:

  • Thinking Fast (System 1): The sensor-fusion encoder integrates data from cameras, lidar, radar, and other sensors to process information about objects and the environment in real time, enabling emergency responses to sudden events. It can be thought of as a “reflex” design similar to Tesla FSD and is used in situations that require immediate reactions.
  • Thinking Slow (System 2): The Driving VLM, which combines vision and language models, handles logical decision-making in complex situations or in scenarios where training data is sparse. In its blog, Waymo gives the example of a burning car ahead: even if the road is physically clear, Gemini’s reasoning capabilities let the system decide to take an alternative route because there is a burning car in its path.

“EMMA: End-to-End Multimodal Model for Autonomous Driving” (Waymo Research, 2024). Link

In other words, if Tesla is a “foundation model” like GPT, Waymo has gone a step further and built an “AI agent” that combines that with reasoning capabilities.
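As a rough illustration of that agentic split, the dispatch logic can be sketched as follows. The function names, thresholds, and rule-based stubs below are hypothetical; Waymo has not published its implementation, and in EMMA the “slow” path is a Gemini-based driving VLM rather than a hand-written rule.

```python
# Illustrative sketch (not Waymo's code) of the fast/slow split described above:
# a reactive System-1 policy handles routine frames, and a slower VLM-backed
# System-2 planner is consulted when the scene is novel or ambiguous.
from dataclasses import dataclass

@dataclass
class Scene:
    objects: list[str]      # detected object labels from sensor fusion
    novelty_score: float    # how far the scene is from the training distribution

def system1_reflex(scene: Scene) -> str:
    """Fast path: immediate, pattern-based reaction (the 'reflex' policy)."""
    if "pedestrian" in scene.objects:
        return "brake"
    return "follow_planned_trajectory"

def system2_reason(scene: Scene) -> str:
    """Slow path: deliberate reasoning over a description of the scene.
    In EMMA this role is played by a Gemini-based driving VLM; here it is
    stubbed with a simple rule so the sketch stays self-contained."""
    if "burning_car" in scene.objects:
        # The road may be physically clear, but the situation is unsafe.
        return "reroute_around_hazard"
    return "follow_planned_trajectory"

def decide(scene: Scene, novelty_threshold: float = 0.7) -> str:
    # Routine scenes take the fast path; unusual ones trigger deliberation.
    if scene.novelty_score < novelty_threshold:
        return system1_reflex(scene)
    return system2_reason(scene)

print(decide(Scene(objects=["burning_car"], novelty_score=0.95)))
# -> "reroute_around_hazard"
```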

The structural edge of the Generative World Decoder

The Generative World Decoder that sits behind Waymo’s self-driving model goes beyond simple control and builds a world model. While this world model is similar to FSD in that it simulates the physical real world, it differs decisively from Tesla’s approach in the scope and method of its predictions.

If Tesla’s FSD intuitively derives “a single most probable trajectory” in the current situation based on massive driving data, Waymo’s Generative World Decoder performs counterfactual reasoning. Before the AI decides on an action, it effectively generates and evaluates a movie-like set of alternative futures: “If I change lanes now, will the car next to me yield, or will it accelerate?” and so on.

Rather than merely finding the optimal control values for the ego vehicle in the current situation, this approach simulates in advance how the vehicle’s actions will interact causally with surrounding vehicles and pedestrians, then chooses the safest outcome. That makes it structurally far more stable than an intuition-driven black-box model and provides “provable safety.”

“MotionLM: Multi-Agent Motion Forecasting as Language Modeling” (Waymo Research, 2023). Link
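A toy version of this counterfactual loop, with an invented stand-in for the learned world model, might look like the sketch below: sample several possible futures for each candidate action, score each future for safety, and pick the action whose worst imagined outcome is still acceptable. The numbers and the scoring rule are illustrative only, not Waymo’s decoder.

```python
# Toy counterfactual evaluation with a world-model stand-in (illustrative only).
import random

CANDIDATE_ACTIONS = ["keep_lane", "change_lane_left", "slow_down"]

def sample_future(action: str) -> float:
    """Stand-in for a learned world model: returns the minimum gap (in metres)
    to surrounding agents over one simulated rollout of this action."""
    base = {"keep_lane": 6.0, "change_lane_left": 3.0, "slow_down": 8.0}[action]
    # Other agents may yield or accelerate, so sampled futures vary.
    return max(0.0, random.gauss(base, 2.0))

def worst_case_gap(action: str, n_rollouts: int = 32) -> float:
    # Counterfactual question: "if I take this action, how bad can it get?"
    return min(sample_future(action) for _ in range(n_rollouts))

best_action = max(CANDIDATE_ACTIONS, key=worst_case_gap)
print(best_action)   # usually "slow_down": the safest across imagined futures
```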

Flywheel acceleration has already begun, and Waymo is seizing the market

Source: Waymo

The claim that Tesla enjoys an unbeatable data advantage in self-driving simply because of the number of its cars already on the road has not held water for quite some time. Not all of the data those cars collect is video (if it were, the upload and storage costs would have bankrupted Tesla), and because the ultimate goal of autonomous-driving R&D is not to replicate the “average” human driver, preprocessing messy real-world driver data is extremely labor-intensive and therefore very costly. Last year Tesla laid off the entire team responsible for preprocessing driving data, because that data had already lost its value. In practice, Tesla uses a “trigger” method: it discards ordinary driving footage and uploads only short clips cut around moments when the driver intervenes or when specific pre-set conditions (e.g., construction zones, unstructured intersections) are met. In other words, “more cars on the road = more data” is a false equation, because most of that data is thrown away.
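A minimal sketch of that kind of trigger-based clip selection, with hypothetical trigger names and window lengths, looks roughly like this:

```python
# Sketch of trigger-based clip selection: routine footage is dropped on-vehicle,
# and only short windows around interesting events are kept for upload.
from dataclasses import dataclass

@dataclass
class Frame:
    timestamp: float
    driver_intervened: bool
    scene_tags: set[str]     # e.g. {"construction_zone", "unstructured_intersection"}

UPLOAD_TRIGGERS = {"construction_zone", "unstructured_intersection", "debris_on_road"}
CLIP_SECONDS = 10.0          # keep only a short window around each trigger

def select_clips(frames: list[Frame]) -> list[tuple[float, float]]:
    """Return (start, end) windows worth uploading; everything else is dropped."""
    windows = []
    for f in frames:
        if f.driver_intervened or (f.scene_tags & UPLOAD_TRIGGERS):
            windows.append((f.timestamp - CLIP_SECONDS / 2,
                            f.timestamp + CLIP_SECONDS / 2))
    return windows

# A fleet of millions of cars therefore contributes only the thin slice of
# driving that fires a trigger, not "all the data" in any useful sense.
```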

The single-network architecture currently used by FSD also makes it exponentially more expensive to handle edge cases like the scenario Waymo describes, where the road ahead is physically clear but the car in front is on fire. Simply dumping more driving data into the system is no solution for these long-tail situations. A monolithic black-box model is highly vulnerable to catastrophic forgetting, where training it to solve a rare edge case causes it to lose general driving capabilities that previously worked well. In the end, the billions of miles Tesla touts are largely “success data” in which humans intervened to prevent accidents; they are not “failure and recovery data” in which the AI itself experienced and overcame a crisis. When a human driver takes over, all the AI learns is that “a person drives here,” not what it misjudged or where the physical limits lay in that situation. My view is that Tesla FSD will ultimately have to overhaul its autonomy stack, as Waymo has done, and rebuild from scratch a driving dataset that fits the new architecture.
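The standard mitigation for catastrophic forgetting is to avoid fine-tuning on the rare examples alone and instead replay general driving data alongside them, which is part of why “just add more edge-case data” is neither cheap nor simple. A minimal sketch of the idea, with invented names and ratios:

```python
# Replay-based fine-tuning sketch: mix rare edge cases with general driving
# data so the model learns the new case without losing old skills.
import random

def build_finetune_batches(edge_cases, general_pool, replay_ratio=0.8, batch_size=64):
    """Yield mixed batches: mostly replayed general driving, plus a few rare
    edge-case examples. Fine-tuning on the edge cases alone tends to erase
    previously learned behaviour (catastrophic forgetting)."""
    n_replay = int(batch_size * replay_ratio)
    while True:
        batch = random.sample(general_pool, n_replay)                  # keep old skills
        batch += random.choices(edge_cases, k=batch_size - n_replay)   # learn the rare case
        random.shuffle(batch)
        yield batch
```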

This is where the value of “pure driverless data,” which Waymo emphasizes, becomes clear. Waymo flatly states that “there is a domain that can never be replaced by simulation or test-driver data.” The experience accumulated when the driver’s seat is empty—when Waymo’s self-driving model independently perceives, judges, and responds to sudden events on the road—is a core asset that cannot be swapped for any amount of human driving data. When this high-purity data, generated solely by AI decisions without human help as it navigates real-world complexity, is fed back into Waymo’s training pipeline, autonomous driving finally surpasses the human average and completes a genuine flywheel that enters the stage of provable safety.

  • Fully autonomous miles: Waymo (Alphabet) >127 million (rider-only, through Sept 2025) vs. Tesla <250,000 (Austin pilot, estimated)
  • Supervised / training miles: Waymo ~100 million+ (simulation/testing) vs. Tesla >4.1 billion (FSD Supervised)
  • Commercial status: Waymo live robotaxi service (24/7 public access) vs. Tesla pilot/testing (employee/invite-only in Austin)
  • Active locations: Waymo Phoenix, SF, LA, Austin, Atlanta (plus testing in FL and TX) vs. Tesla global (Supervised FSD) and Austin, TX (unsupervised pilot)
  • Safety benchmark: Waymo 0.74 injury-reported crashes per million miles (vs. 3.97 human average) vs. Tesla 1 crash per ~6.36 million miles (Supervised, Level 2 only)

The dawn of commercial autonomy: Waymo is ready for launch

On December 17, Bloomberg reported that Waymo is in the process of raising a massive $15 billion round at a valuation north of $100 billion. The fact that its valuation has more than doubled from $45 billion just a year ago is evidence that the market is convinced of Waymo’s technical maturity and commercial potential. The capital raised will be used to dramatically expand its roughly 2,500-vehicle robotaxi fleet and aggressively roll out service to more cities. As of December, Waymo is the only company in the United States offering paid, safety-driver-free commercial service at scale.

Waymo Is A Trillion-Dollar Opportunity. Google Just Needs To Seize It.
The self-driving tech unit is expanding, but if Alphabet pushed harder, Waymo could dominate a new market with the potential to generate more revenue than its ad business.
Forbes - Alan Ohnsman

Forbes estimates that Waymo’s fare revenue will reach at least $300 million this year, calling the business “a trillion-dollar opportunity that could surpass Google’s advertising business.” In contrast, Tesla’s robotaxi service, which Tesla has been promising to launch “next year” every year since 2016, is still confined to a limited area of Austin, Texas, and even there operates only as a pilot program for employees and invited riders. Elon Musk recently posted on X that change will come “slowly, then all at once,” but given the complete lack of fully driverless operating data and the clear structural limitations of Tesla’s current model, this is closer to wishful thinking than reality. Waymo, armed with a record of proven safety, is poised for explosive growth starting around 2026.
