What Is Project Fetch? How Claude Opus 4.7 Controlled a Robot 20x Faster Than Humans

What Is Project Fetch — An AI Autonomous Robot Control Experiment

Figure: Project Fetch, progress from Phase 1 to Phase 2

Phase 1 (August 2025): Opus 4.1 got stuck connecting to the robot and could not operate independently.
Phase 2 (June 2026): Opus 4.7 autonomously connected to the robot, controlled it, and detected objects (about 20x faster than the fastest human team).

Project Fetch is an experiment conducted by Anthropic's Frontier Red Team (safety research team) to measure AI's ability to autonomously control robots. Using an off-the-shelf robotic quadruped (robot dog), the experiment tests whether an AI model can operate the robot without human assistance, in a staged evaluation across multiple phases.

The experiment was designed not as a marketing exercise, but as a red team effort to assess AI capabilities and limitations from a safety perspective. The goal is to map what AI can and cannot do, including its failures, rather than to showcase success.

Phase 1 (2025) — AI Served Only as an Assistant

Phase 1 was conducted in August 2025 using Claude Opus 4.1. The result was clear: Opus 4.1 could not complete tasks independently. It got stuck at the very first step — figuring out how to connect to the robot.

Unquestionably, it could not. Much like our team without Claude, it got hung up on the preliminary task of figuring out how to connect to the robot. — From the Phase 1 results section

However, when human teams used Claude Opus 4.1 as a coding assistant, they significantly outperformed the team working without AI. At the Phase 1 stage, Claude was an effective assistant but not an independent operator.

Phase 2 (2026) — Opus 4.7 Achieved Fully Autonomous Robot Control

Phase 2, published on June 18, 2026, showed a dramatically different picture. Claude Opus 4.7, operating through Claude Code, autonomously controlled the robot dog and completed most of the 7 tasks without human intervention.

The robot used was a commercially available quadruped equipped with the manufacturer's controller, a video camera, and lidar sensors. The researcher's role was strictly limited.

The role of our researcher was limited to plugging a laptop running Claude Code into the robodog, entering the initial prompt, approving commands, and approving the model to go to the next task. — From the experimental conditions section

A model that couldn't even connect to the robot in Phase 1 was, just 10 months later, autonomously handling sensor connections, writing control programs, and detecting objects. The pace of generational model improvement is striking.

Experimental Results — Speed, Code Efficiency, and Outcomes

Figure: 4-task comparison of completion time in minutes (shorter is faster). Data from Anthropic Research.

Team Claude-less: 361 min
Team Claude: 181 min
Opus 4.7 (autonomous): 9.6 min

Phase 2 set 7 tasks: operating the robot via the controller, connecting to video and lidar sensors, writing and running a manual control program, monitoring the robot's path, detecting a beach ball, and autonomously retrieving the beach ball. Below, we examine the 4-task comparison across all participants, code efficiency, and what failed.

About 20x Faster Than Human Teams (9 min 35 sec vs. 181 min)

The core data from this experiment comes from the 4 tasks that all participants completed. The team without AI took 361 minutes, the AI-assisted team took 181 minutes, and Opus 4.7 completed the same tasks in 9 minutes 35 seconds.

Claude Opus 4.7—operating without human assistance—was about 20 times faster than the fastest human team at all tasks completed by participants less than a year ago. — From the speed comparison conclusion

In ratio terms, that's about 37.7x faster than the team without AI and about 18.9x faster than the AI-assisted team. Expanding to 5 tasks, the AI-assisted team took 264 minutes versus Opus 4.7's 12 minutes 7 seconds, a gap of about 21.8x.
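The ratios quoted above follow directly from the reported times. As a quick check, this short Python sketch recomputes them. The function and variable names are illustrative and do not come from Anthropic's post.

```python
def to_minutes(minutes: int, seconds: int = 0) -> float:
    """Convert a minutes-and-seconds duration to fractional minutes."""
    return minutes + seconds / 60


def speedup(human_minutes: float, model_minutes: float) -> float:
    """How many times faster the model was than the human team."""
    return human_minutes / model_minutes


# Times reported in the article (names are hypothetical labels).
opus_four_tasks = to_minutes(9, 35)
opus_five_tasks = to_minutes(12, 7)

ratios = {
    "claude_less_4_tasks": speedup(361, opus_four_tasks),
    "team_claude_4_tasks": speedup(181, opus_four_tasks),
    "team_claude_5_tasks": speedup(264, opus_five_tasks),
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
# claude_less_4_tasks: 37.7x
# team_claude_4_tasks: 18.9x
# team_claude_5_tasks: 21.8x
```

The 18.9x figure against the fastest human team is what the source rounds to "about 20 times faster."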

On every task that at least one human team completed, Opus 4.7 was at least 10 times faster. The advantage was not confined to one particular task; it held at an order of magnitude or more across the board.

Participant | 4-Task Completion Time | Speed Ratio vs. Opus 4.7
Team Claude-less | 361 min | ~37.7x slower
Team Claude | 181 min | ~18.9x slower
Opus 4.7 (autonomous) | 9 min 35 sec | (baseline)

Across three trials, Opus 4.7 showed low variance between runs, and most of its code worked correctly on the first attempt.

Code Volume Was One-Tenth of the AI-Assisted Team's (1,045 Lines vs. 10,309 Lines)

The difference extended beyond speed to the amount of code produced.

it was as or more successful than both human teams while producing almost ten times less code than Team Claude. — From the code efficiency section

Team | Lines of Code
Team Claude (AI-assisted) | 10,309
Team Claude-less | 1,136
Opus 4.7 | 1,045

Notably, the code volumes of the team without AI (1,136 lines) and Opus 4.7 (1,045 lines) are nearly identical. The AI-assisted team incorporated Claude's suggestions and expanded its codebase, while Opus 4.7 working alone wrote only what was necessary. Less code does not automatically mean better code, but achieving equal or better outcomes without accumulating redundant code demonstrates practical efficiency in AI code generation.
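For reference, the "almost ten times less code" figure and the near parity with the team without AI can be checked with simple division. The sketch below is illustrative only: the dictionary keys mirror the team names in the table, and nothing here comes from Anthropic's own tooling.

```python
# Line counts as reported in the article.
lines_of_code = {
    "Team Claude": 10_309,
    "Team Claude-less": 1_136,
    "Opus 4.7": 1_045,
}

# Express every count relative to Opus 4.7's output.
baseline = lines_of_code["Opus 4.7"]
relative = {team: count / baseline for team, count in lines_of_code.items()}

for team, factor in relative.items():
    print(f"{team}: {factor:.2f}x the lines Opus 4.7 wrote")
# Team Claude: 9.87x the lines Opus 4.7 wrote
# Team Claude-less: 1.09x the lines Opus 4.7 wrote
# Opus 4.7: 1.00x the lines Opus 4.7 wrote
```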

Beach Ball Retrieval Failed — The Closed-Loop Control Barrier

The final task was to detect a beach ball and autonomously retrieve it to the starting turf. This task was not fully achieved.

Opus 4.7 handled sensor connections, ball detection, and positioning (maneuvering behind the ball). However, its attempts to move the ball precisely were unstable.

But the efforts to do so were poorly controlled and (again, like our human participants) not successful. — From the beach ball retrieval results

The root cause lies in the difficulty of closed-loop control — a control approach that requires continuously adjusting movements based on real-time visual feedback. While LLMs excel at open-loop tasks like reading sensor data and generating code, continuously fine-tuning physical actions in the real world remains an unsolved challenge for current large language models.
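To make the open-loop versus closed-loop distinction concrete, here is a toy one-dimensional simulation in Python. It is a sketch under loud assumptions: the `simulate` function, the proportional gain, and the constant `drift` disturbance are all hypothetical and have nothing to do with the actual robot, the code Claude wrote, or Anthropic's setup. It only illustrates why a plan fixed in advance misses the target when the world pushes back, while a loop that keeps measuring can correct itself.

```python
def simulate(closed_loop: bool, steps: int = 50, target: float = 1.0,
             gain: float = 0.5, drift: float = -0.01) -> float:
    """Return the final distance from the target for a 1-D toy robot.

    All parameters are illustrative assumptions, not values from the experiment.
    """
    position = 0.0
    # Open-loop plan: equal-sized steps decided before moving.
    planned_step = target / steps
    for _ in range(steps):
        if closed_loop:
            # Closed loop: react to the currently measured error.
            command = gain * (target - position)
        else:
            # Open loop: replay the plan and ignore measurements.
            command = planned_step
        # The drift term stands in for any unmodelled disturbance.
        position += command + drift
    return abs(target - position)


open_loop_error = simulate(closed_loop=False)
closed_loop_error = simulate(closed_loop=True)
print(f"open loop:   {open_loop_error:.2f}")
print(f"closed loop: {closed_loop_error:.2f}")
```

With these default values the open-loop run ends about 0.5 away from the target, while the closed-loop run settles about 0.02 away. A real robot adds sensing noise, latency, and contact dynamics that this toy ignores.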

The researchers noted that a human with robotics experience could achieve this task by adding scaffolding. The challenge is therefore not impossible; it sits at a level where human-AI collaboration can bridge the gap.

Possibilities and Limitations of AI Robotics

What Opus 4.7 Could and Could Not Do in Robot Control

Could do:
Operating the robot via the manufacturer's controller
Connecting to video and lidar sensors
Writing and executing a manual control program
Monitoring the robot's path
Detecting and positioning relative to the beach ball

Could not do:
Precise beach ball retrieval (closed-loop control)
Real-time precision motor adjustment based on feedback

Project Fetch Phase 2 maps out where AI stands today with measured data. Software-side tasks (connections, program creation, data processing) showed overwhelming speed, while continuous physical-world control hit clear limits. Here's what the results tell us about what's possible and what isn't.

What It Means That No Robotics-Specific Training Was Needed

A key detail of this experiment: Opus 4.7 received no robotics-specific fine-tuning. It wasn't trained on robot control datasets. General improvements in model capability transferred directly to physical tasks.

It is worth underscoring (as we did in our previous post) that this progress is not the result of a concerted effort to improve the robotics capabilities of our models. These improvements, like so many others in the history of LLM development, have emerged from much more general scaling. — From the generalization section

This result supports the idea that once an AI model's general capabilities cross a certain threshold, it can handle new domains without specialized training. The same principle could apply beyond robotics to manufacturing equipment operation, IoT device management, and remote physical tool control.

Opus 4.7 quickly handled decisions that humans found tricky (such as selecting the right sensor interface approach), and most of its code worked on the first try. Programming's inherent fast feedback loop — write, run, observe, fix — aligns well with what AI models excel at.

The Challenge of Real-Time Feedback Control

The beach ball retrieval failure revealed the contours of what current LLMs struggle with. Low-level robot control — specifically, formulating actuation policies (detailed motor movement plans) — remains outside LLM capabilities.

In programming, the cycle of writing code, checking results, and making corrections is clearly separated into discrete steps. Robot control, however, requires continuously adjusting motors while simultaneously processing visual input. This "adjust while watching" real-time feedback processing is structurally difficult for LLMs, which think sequentially through text.
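One way to picture why slow, stepwise feedback is a problem is to feed a controller observations that are several steps old. The following Python toy loop is an assumption-laden illustration, not a model of how Claude or the robodog actually behaves: `peak_late_error`, the gain of 0.5, and the delay values are invented for this example. It shows a simple proportional controller that converges with fresh measurements but oscillates with growing amplitude when it acts on stale ones.

```python
from collections import deque


def peak_late_error(delay_steps: int, gain: float = 0.5, steps: int = 200,
                    target: float = 1.0) -> float:
    """Largest |target - position| over the final 20 steps of a toy 1-D loop.

    The controller only sees the position from `delay_steps` steps ago.
    All values are hypothetical and chosen purely for illustration.
    """
    position = 0.0
    # Fixed-size buffer of past positions; index 0 is the oldest entry.
    history = deque([position] * (delay_steps + 1), maxlen=delay_steps + 1)
    errors = []
    for _ in range(steps):
        observed = history[0]  # measurement that is delay_steps old
        position += gain * (target - observed)
        history.append(position)  # the oldest entry drops out automatically
        errors.append(abs(target - position))
    return max(errors[-20:])


fresh = peak_late_error(delay_steps=0)
stale = peak_late_error(delay_steps=5)
print(f"no delay:     {fresh:.3g}")
print(f"5-step delay: {stale:.3g}")
```

With no delay the error shrinks to essentially zero. With a five-step delay and the same gain, each correction arrives too late and overshoots, so the error keeps growing. Lowering the gain would restore stability at the cost of a slower response, which is the usual trade-off in delayed feedback loops.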

The experiment doesn't conclude that LLMs have "solved robotics" or that they "can't." Since the beach ball retrieval was achievable with human scaffolding, gradual automation through human-AI collaboration emerges as the realistic path forward.

Practical Implications and Future Outlook

Looking at Project Fetch's results from a practical standpoint, two points stand out.

we now seem much closer to a world where models will be able to use off-the-shelf physical tools with relative ease — From the overall discussion

First, the pace of generational model evolution. A model that couldn't even connect to the robot in August 2025 was, by June 2026, completing the shared tasks autonomously about 20x faster than the fastest human team. Since this progress came from general model scaling rather than robotics-specific training, next-generation models may handle even more complex physical tasks.

Second, the ease of pairing with off-the-shelf hardware. The experiment used a commercially available robot dog with no custom drivers or middleware — and the AI could still operate it. This lowers the barrier to adoption in industrial applications.

From the perspective of someone who uses Claude Code extensively in daily work, the speed and first-try accuracy of code generation matches practical experience. The finding that this capability is extending into physical-world control suggests that AI's role in business is beginning to move beyond the boundaries of software.

Summary — Where AI Robotics Stands Today

Project Fetch Phase 2 is an experiment that maps AI's autonomous control capabilities with hard numbers. Claude Opus 4.7 operated an off-the-shelf robot dog about 20 times faster than the fastest human team, with roughly one-tenth the code of the AI-assisted team. However, it did not achieve precise physical control for beach ball retrieval.

What this experiment shows is not that "AI can fully control robots," but that "AI can dramatically accelerate software-side tasks." Continuous physical-world control still depends on human expertise, though where that boundary moves with the next generation of models is impossible to predict.

Detailed methodology and data are available on Anthropic's Research page.

Source: Anthropic Research, "Project Fetch: Phase two"

FAQ

Q. What is Project Fetch?
Project Fetch is an experiment by Anthropic that measures whether Claude (an AI model) can autonomously operate an off-the-shelf robot dog. Phase 1 was conducted in August 2025, and Phase 2 was published in June 2026. In Phase 2, Claude Opus 4.7 completed the shared tasks about 20 times faster than the fastest human team.
"Claude Opus 4.7—operating without human assistance—was about 20 times faster than the fastest human team" (Anthropic Research, Project Fetch: Phase two)

Q. What robot was used in Project Fetch?
An off-the-shelf robotic quadruped (robot dog) equipped with the manufacturer's controller, a video camera, and lidar sensors. The AI model operating it had no robotics-specific training.
"an off-the-shelf robotic quadruped" (Anthropic Research, Project Fetch: Phase two)

Q. How fast was Opus 4.7 compared to human teams?
On the four tasks completed by all participants, Opus 4.7 finished in 9 minutes 35 seconds, about 37.7 times faster than the team without AI (361 minutes) and about 18.9 times faster than the AI-assisted team (181 minutes). It was at least 10 times faster on every task that at least one human team completed.
"Very simply: on every task that was completed by at least one human team in August, Opus 4.7 completed the same task at least ten times faster." (Anthropic Research, Project Fetch: Phase two)

Q. How much code did Opus 4.7 produce?
Opus 4.7 wrote 1,045 lines of code, compared to 10,309 lines by the AI-assisted human team, roughly one-tenth the amount, while achieving equal or better results.
"it was as or more successful than both human teams while producing almost ten times less code than Team Claude." (Anthropic Research, Project Fetch: Phase two)

Q. Did Opus 4.7 successfully retrieve the beach ball?
Not fully. It detected the ball and positioned itself behind it, but precisely moving the ball through closed-loop control proved difficult. The researchers noted that a human with robotics experience could achieve this task with additional scaffolding.
"But the efforts to do so were poorly controlled and (again, like our human participants) not successful." (Anthropic Research, Project Fetch: Phase two)

Q. How much progress was made from Phase 1 to Phase 2?
In Phase 1 (August 2025), Claude Opus 4.1 could not complete tasks independently; it got stuck trying to connect to the robot. In Phase 2, Opus 4.7 autonomously handled everything from sensor connections to writing control programs and detecting objects.
"Unquestionably, it could not. Much like our team without Claude, it got hung up on the preliminary task of figuring out how to connect to the robot." (Anthropic Research, Project Fetch: Phase two)

Q. Was Opus 4.7 specifically trained for robotics?
No. Opus 4.7 received no robotics-specific fine-tuning. Its ability to control the robot emerged from general model scaling improvements.
"These improvements, like so many others in the history of LLM development, have emerged from much more general scaling." (Anthropic Research, Project Fetch: Phase two)

Q. What was the human researcher's role during the experiment?
The researcher's role was limited to plugging in the laptop, entering the initial prompt, approving commands, and approving progression to the next task. They did not operate the robot or modify code.
"The role of our researcher was limited to plugging a laptop running Claude Code into the robodog, entering the initial prompt, approving commands, and approving the model to go to the next task." (Anthropic Research, Project Fetch: Phase two)

Q. What does this experiment mean for the AI industry?
It suggests we are closer to a world where general-purpose AI models can operate off-the-shelf physical hardware with relative ease. This has implications for robotics, IoT, and the broader application of AI beyond software tasks.
"we now seem much closer to a world where models will be able to use off-the-shelf physical tools with relative ease" (Anthropic Research, Project Fetch: Phase two)
