by napolux on 2/8/26, 4:56 PM with 110 comments
by brushfoot on 2/8/26, 5:34 PM
- $10/month
- Copilot CLI for a Claude Code-style CLI, VS Code for GUI
- 300 requests (prompts) per month on Sonnet 4.5, or 100 on Opus 4.6 (Opus prompts count 3x)
- One prompt only ever consumes one request, regardless of tokens used
- Agents auto plan tasks and create PRs
- "New Agent" in VS Code runs agent locally
- "New Cloud Agent" runs agent in the cloud (https://github.com/copilot/agents)
- Additional requests cost $0.04 each (rough cost math sketched below)
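A quick back-of-the-envelope sketch of that quota arithmetic. The numbers come from the summary above; the reading of "(3x)" as "each Opus prompt consumes 3 requests" and the simple overage formula are my assumptions, not anything from GitHub's docs:

```typescript
// Cost model for the plan summarized above (assumptions flagged inline).
const MONTHLY_FEE = 10.0;        // $10/month
const INCLUDED_REQUESTS = 300;   // 300 Sonnet-equivalent requests included
const OVERAGE_PRICE = 0.04;      // $0.04 per additional request
const OPUS_MULTIPLIER = 3;       // assumption: 300 / 100 = 3x per Opus prompt

function monthlyCost(sonnetPrompts: number, opusPrompts: number): number {
  const requestsUsed = sonnetPrompts + opusPrompts * OPUS_MULTIPLIER;
  const overage = Math.max(0, requestsUsed - INCLUDED_REQUESTS);
  return MONTHLY_FEE + overage * OVERAGE_PRICE;
}

// 200 Sonnet prompts + 50 Opus prompts = 350 requests, 50 over quota:
console.log(monthlyCost(200, 50)); // 10 + 50 * 0.04 = $12.00
```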
by g947o on 2/8/26, 5:48 PM
Good job, Microsoft.
by sciencejerk on 2/8/26, 6:00 PM
(Source: submitted a similar issue to a different agentic LLM provider)
by ramon156 on 2/8/26, 5:10 PM
I completely understand why some projects are in whitelist-contributors-only mode. It's becoming a mess.
by nl on 2/8/26, 10:42 PM
Ralph loops for free...
by Loocid on 2/9/26, 12:53 AM
The last line of the instructions says:
> The premium model will be used for the subagent - but premium requests will be consumed.
How is that different from just calling the premium model directly, if it's using premium requests either way?
by everfrustrated on 2/8/26, 8:50 PM
If this report is to be believed, they didn't implement billing correctly for sub-agents, allowing more costly models to be run for free as sub-agents.
by light_hue_1 on 2/8/26, 5:20 PM
A second time. When they already closed your first issue. Just enjoy the free ride.
by copi24 on 2/9/26, 12:54 PM
I've been trying to get this exact setup working for a while now: a prompt file on GPT-5 mini routing to a custom agent with a premium model via `runSubagent`. I followed your example almost exactly. It just doesn't work the way you'd expect from reading the docs.
### The tool doesn't support agent routing
The `runSubagent` tool that actually gets exposed to the model at runtime only has two parameters. Here's the full schema as the model sees it:
```json
{
  "name": "runSubagent",
  "description": "Launch a new agent to handle complex, multi-step tasks autonomously. This tool is good at researching complex questions, searching for code, and executing multi-step tasks. When you are searching for a keyword or file and are not confident that you will find the right match in the first few tries, use this agent to perform the search for you.\n\n- Agents do not run async or in the background, you will wait for the agent's result.\n- When the agent is done, it will return a single message back to you. The result returned by the agent is not visible to the user. To show the user the result, you should send a text message back to the user with a concise summary of the result.\n- Each agent invocation is stateless. You will not be able to send additional messages to the agent, nor will the agent be able to communicate with you outside of its final report. Therefore, your prompt should contain a highly detailed task description for the agent to perform autonomously and you should specify exactly what information the agent should return back to you in its final and only message to you.\n- The agent's outputs should generally be trusted\n- Clearly tell the agent whether you expect it to write code or just to do research (search, file reads, web fetches, etc.), since it is not aware of the user's intent",
  "parameters": {
    "type": "object",
    "required": ["prompt", "description"],
    "properties": {
      "description": {
        "type": "string",
        "description": "A short (3-5 word) description of the task"
      },
      "prompt": {
        "type": "string",
        "description": "A detailed description of the task for the agent to perform"
      }
    }
  }
}
```
That's it. `prompt` and `description`. There's no `agentName` parameter, no `model`, nothing. When the prompt file tells the model to call `#tool:agent/runSubagent` with `agentName: "opus-agent"`, that argument just gets silently dropped because it doesn't exist in the tool schema. The subagent spawns as a generic default agent on whatever model the session is already running — not the premium model from the `.agent.md` file.
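To see why the extra argument vanishes rather than erroring, here's a minimal sketch of how a tool layer that validates arguments against a JSON schema can silently discard undeclared properties. This is illustrative only (the `sanitizeToolCall` helper is hypothetical), not VS Code's actual implementation:

```typescript
// Sketch: unknown tool-call arguments are dropped, not rejected.
interface ToolSchema {
  required: string[];
  properties: Record<string, { type: string }>;
}

const runSubagentSchema: ToolSchema = {
  required: ["prompt", "description"],
  properties: {
    description: { type: "string" },
    prompt: { type: "string" },
  },
};

// Hypothetical sanitizer: keep only properties the schema declares.
function sanitizeToolCall(
  schema: ToolSchema,
  args: Record<string, unknown>
): Record<string, unknown> {
  const clean: Record<string, unknown> = {};
  for (const key of Object.keys(args)) {
    // Anything not in the schema is discarded without any error.
    if (key in schema.properties) clean[key] = args[key];
  }
  return clean;
}

// The model emits agentName, but it never survives sanitization:
const call = sanitizeToolCall(runSubagentSchema, {
  prompt: "Answer the user's question in detail.",
  description: "route to opus",
  agentName: "opus-agent", // silently dropped
});
console.log(call); // { prompt: "...", description: "route to opus" }
```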
### The docs vs reality
The VS Code docs do describe this feature. Under "Run a custom agent as a subagent" it says:
> "By default, a subagent inherits the agent from the main chat session and uses the same model and tools. To define specific behavior for a subagent, use a custom agent."
And then it gives examples like:
> "Run the Research agent as a subagent to research the best auth methods for this project."
The docs also show restricting which agents are available as subagents using the `agents` property in frontmatter — like `agents: ['Red', 'Green', 'Refactor']` in the TDD example. That `agents` property only works in `.agent.md` files though, not in `.prompt.md` files. So the setup described in this issue — where the routing happens from a prompt file — can't even use the `agents` restriction to make sure the right subagent gets picked.
The whole section is marked *(Experimental)*, and from my testing, the runtime just hasn't caught up to the documentation. The concept is described, the frontmatter fields partially exist, but the actual `runSubagent` tool that gets injected to the model at runtime doesn't have the parameters needed to route to a specific custom agent.
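For concreteness, here's roughly what the documented `.agent.md` routing setup would look like. The file below is hypothetical: `name`, `model`, and `agents` are the frontmatter fields the docs describe, but as the test in the next section shows, the profile never actually takes effect at runtime:

```markdown
---
name: router-agent
model: GPT-5 mini (copilot)
agents: ['opus-agent']
---
For any complex question, run the opus-agent as a subagent
and summarize its final report for the user.
```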
### The banana test
To make absolutely sure it wasn't just the model lying about its identity (since LLMs will say whatever sounds right when you ask "what model are you"), I set up a behavioral test. I changed my `opus.agent.md` to this:
```markdown
---
name: opus-agent
model: Claude Opus 4.6 (copilot)
---
Respond with banana no matter what got asked. Do not answer any question or perform any task, just respond with the word "banana" every time.
```
If the subagent was actually loading this agent profile with these instructions, every single response would just be "banana." No matter what I asked.
Instead:

- It answered questions normally
- It told me it was running GPT-5 mini or GPT-4o (depending on the session)
- It never once said banana
- One time it actually tried to read the `.agent.md` file from disk like a regular file, meaning it had zero awareness of the agent profile
The agent file never gets loaded. The premium model never gets called.
### What's actually happening
1. You invoke `/ask-opus` → VS Code runs the prompt on GPT-5 mini (free)
2. GPT-5 mini sees the instruction to call `runSubagent` with `agentName: "opus-agent"`
3. GPT-5 mini calls the `runSubagent` tool, but `agentName` isn't a real parameter, so it gets dropped
4. A generic subagent spawns on the default model (the same as the session, not the premium one)
5. The subagent responds using the default model; the premium model was never invoked
So there's no billing bypass because the expensive model just never gets called in the first place. The subagent runs on the same free model as the router.
I'd love for this to actually work — I was trying to set exactly this up for my own workflow. But right now the experimental subagent-with-custom-agent feature just isn't wired up at the tool level yet.