
OpenRouter Launches Subagent Tool: Supports Task Offloading from Large Model to Small Model During Inference

According to Dynamic Beating monitoring, OpenRouter has launched a server-side subagent tool, `openrouter:subagent`, and begun testing it. The tool lets a large model dispatch independent subtasks to a smaller, cheaper, and faster model partway through generating a response. When the main model encounters a self-contained task that does not need its full capabilities (such as document summarization, structured data extraction, template drafting, or text formatting), it can invoke the tool by providing a task name and a task description. A worker model executes the dispatched subtask and returns the result to the main model for integration.
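The dispatch flow described above can be pictured with a small local simulation. This is only a sketch: `SubagentTask`, `dispatch`, and `toy_worker` are hypothetical names invented for illustration, and no OpenRouter API is called. The one detail taken from the article is that a subtask is identified by a task name and a task description.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SubagentTask:
    """Hypothetical container for the two inputs the article mentions."""
    name: str
    description: str


def dispatch(task: SubagentTask, worker: Callable[[str], str]) -> str:
    """Run a self-contained subtask on a worker and return its result.

    `worker` stands in for the smaller model: any callable that maps a
    prompt string to an output string.
    """
    prompt = f"Task: {task.name}\n\n{task.description}"
    return worker(prompt)


def toy_worker(prompt: str) -> str:
    # Placeholder for a cheap model: upper-cases the last line of the prompt.
    return prompt.splitlines()[-1].upper()


task = SubagentTask(
    name="format_title",
    description="Convert the following line to upper case:\nsubagent tool enters testing",
)
result = dispatch(task, toy_worker)

# The main model folds the worker's result back into its own output.
main_model_draft = f"Headline: {result}"
```

In the real service the worker would be another model rather than a Python function, but the shape is the same: the main model hands over a named, described task and receives text back.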

The worker model can be any model supported by OpenRouter. It is specified by `parameters.model` in the tool definition; if that field is not set, the worker uses the same model as the main model. To extend what it can do, the worker model can also be given other OpenRouter server-side tools (such as web search, `openrouter:web_search`, or web page retrieval, `openrouter:web_fetch`), allowing it to carry out multi-step reasoning and data retrieval in a sandbox environment before generating its final text. Because the worker model runs on the server side, it does not support custom function tools that require client-side execution.
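A tool definition along these lines might be assembled as follows. This is a hedged sketch, not the documented request schema: the article confirms only the tool name `openrouter:subagent`, the `parameters.model` field, and the two server-side tool names. The `type` key, the `parameters.tools` key, the helper `build_subagent_tool`, and the model ID `example/small-model` are all assumptions made for illustration.

```python
from typing import Any, Optional

# Server-side tool names quoted in the article share this prefix.
SERVER_SIDE_PREFIX = "openrouter:"


def build_subagent_tool(
    worker_model: Optional[str] = None,
    worker_tools: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Build a hypothetical subagent tool definition as a plain dict.

    If `worker_model` is None, `parameters.model` is omitted, which the
    article says makes the worker inherit the main model.
    """
    parameters: dict[str, Any] = {}
    if worker_model is not None:
        parameters["model"] = worker_model

    tools = worker_tools or []
    for name in tools:
        # The worker runs server-side, so client-executed function tools
        # cannot be attached; only server-side tool names are accepted here.
        if not name.startswith(SERVER_SIDE_PREFIX):
            raise ValueError(f"client-side tool not supported for worker: {name}")
    if tools:
        # Assumed key and shape; not confirmed by the article.
        parameters["tools"] = [{"type": name} for name in tools]

    return {"type": "openrouter:subagent", "parameters": parameters}


tool = build_subagent_tool(
    worker_model="example/small-model",  # hypothetical model ID
    worker_tools=["openrouter:web_search", "openrouter:web_fetch"],
)
inherited = build_subagent_tool()  # no model set: worker inherits the main model
```

Validating tool names on the client is a design choice of this sketch; the service presumably enforces the same restriction itself.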

Because the worker model cannot access the main model's conversation context and does not share memory across tasks, the main model must include complete background information and output format requirements in the task description. To prevent infinite recursion and runaway costs from nested model calls, OpenRouter has implemented two safeguards: self-reference is prohibited in the tool definition, and nesting depth is limited through a request header, which removes the subagent tool from subtask calls. In addition, there is a hard limit on the total number of tasks that can be executed in a single API request.
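The safeguards described above can be modeled locally to see how they interact. This is a minimal sketch under stated assumptions: the article does not give the header name, the depth limit, or the per-request task cap, so `MAX_DEPTH`, `MAX_TASKS_PER_REQUEST`, and every function and class name below are illustrative, not OpenRouter's implementation.

```python
SUBAGENT = "openrouter:subagent"
MAX_DEPTH = 1              # illustrative value, not from the article
MAX_TASKS_PER_REQUEST = 3  # illustrative value, not from the article


class SubagentLimitError(RuntimeError):
    """Raised when a simulated safeguard rejects a call."""


def validate_definition(worker_tools: list[str]) -> None:
    # Safeguard 1: the subagent tool may not list itself as a worker tool.
    if SUBAGENT in worker_tools:
        raise SubagentLimitError("subagent tool cannot reference itself")


def tools_for_depth(tools: list[str], depth: int, max_depth: int = MAX_DEPTH) -> list[str]:
    # Safeguard 2: once the nesting depth limit is reached, the subagent
    # tool is removed so the subtask cannot spawn further subtasks.
    if depth >= max_depth:
        return [t for t in tools if t != SUBAGENT]
    return list(tools)


class RequestBudget:
    """Hard cap on the total number of subtasks in one simulated request."""

    def __init__(self, max_tasks: int = MAX_TASKS_PER_REQUEST) -> None:
        self.max_tasks = max_tasks
        self.used = 0

    def charge(self) -> None:
        if self.used >= self.max_tasks:
            raise SubagentLimitError("task limit for this request reached")
        self.used += 1


main_tools = [SUBAGENT, "openrouter:web_search"]
top_level = tools_for_depth(main_tools, depth=0)  # subagent tool still available
subtask = tools_for_depth(main_tools, depth=1)    # subagent tool removed

budget = RequestBudget()
for _ in range(MAX_TASKS_PER_REQUEST):
    budget.charge()  # the first three subtasks are allowed
```

The depth check bounds how deep calls can nest, while the budget bounds how many calls happen in total; either one alone would still allow cost to grow in the other dimension.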
