Best cheap models for Hermes

Cheap

Hermes runs all day, so the model price is your real subscription fee. These are the best cheap models for Hermes right now: everything costs at most about $0.75 per million input tokens, ranked by how much agent capability survives the price cut.

Which model should you run?

As of June 2026, the cheap-model question for Hermes splits on how you pay:

Have a ChatGPT Codex subscription? Run GPT-5.4 Mini on your plan and pay nothing extra per token
Paying per token? Route DeepSeek V4 or Gemini 3 Flash through OpenRouter with one key
Running Hermes always-on or with sub-agents? DeepSeek V4 Flash or GLM 4.7 Flash keep a full day of runs at cents

Whatever you pick, keep one flagship configured for the hard tasks. The cheap picks handle the everyday 90% and slip on long multi-step planning.

The best cheap model for Hermes is DeepSeek V4: near-frontier quality, a 1M token context window, and open weights at $0.435 per million input tokens, all through one OpenRouter key.

Gemini 3 Flash at $0.50 per million is the fastest cheap pick for high-volume runs, and GPT-5.4 Mini at $0.75 makes sense if you already have a ChatGPT Codex subscription, since it runs on your plan instead of per token.

For always-on background work and sub-agents, drop to DeepSeek V4 Flash ($0.098) or GLM 4.7 Flash ($0.06). At those prices a full day of routine Hermes runs costs cents, not dollars.

We also have more in-depth Hermes rankings:Best free models for HermesBest models for HermesBest open-source models for Hermes

The ranking

Updated June 2026

Best cheap daily drivers

Strong enough to be your main Hermes model, at a fraction of flagship prices.

#ModelContextInput

1
1
DeepSeek V4DeepSeekThe best cheap default: near-frontier quality, 1M context, $0.435 per million.
Context1.049M
Input$0.435/M
2
2
Gemini 3 FlashGoogle DeepMindThe fastest cheap pick for high-volume runs, $0.50 per million.
Context1.049M
Input$0.5/M
3
3
GPT-5.4 MiniOpenAIThe cheapest GPT that still handles real agent work. Codex subscription or API.
Context400K
Input$0.75/M
4
4
Kimi K2.5Moonshot AIOpen-weight agentic coder at $0.40 per million input tokens.
Context262K
Input$0.375/M
5
5
MiniMax-M3MiniMaxNear-frontier intelligence at $0.30 per million input tokens.
Context205K
Input$0.3/M

Best for always-on and sub-agents

When Hermes runs continuously or fans out to sub-agents, these prices make it painless.

#ModelContextInput

1
1
DeepSeek V4 FlashDeepSeekA 1M context window at $0.098 per million, for always-on background work.
Context1.049M
Input$0.09/M
2
2
GLM 4.7 FlashZ.AIThe floor: $0.06 per million for sub-agents and routine steps.
Context203K
Input$0.06/M

Intelligence vs. price

Each model's Artificial Analysis Intelligence Index score against its blended price per 1M tokens. Toward the top right is more intelligence per dollar.

Full interactive leaderboard on our Intelligence Index page.

Frequently asked questions

What is the cheapest good model for Hermes?

DeepSeek V4 is the best cheap model for Hermes: near-frontier quality with a 1M token context window at $0.435 per million input tokens via OpenRouter. If you want the absolute floor that still works, GLM 4.7 Flash at $0.06 per million handles routine steps and sub-agents for cents per day.

How much does it cost to run Hermes on a cheap model?

A typical active day of Hermes use runs tens of millions of tokens. On DeepSeek V4 at $0.435 per million input tokens, that lands around a dollar or two per day; on DeepSeek V4 Flash or GLM 4.7 Flash it drops to cents. The same day on a flagship like Claude Opus 4.8 or GPT-5.5 can cost ten times more, which is why most Hermes users run a cheap default and escalate selectively.

Should I use a cheap model or a free model for Hermes?

Free models on OpenRouter work for trying Hermes out, but rate limits bite quickly on an agent that runs all day. The cheap picks here remove that ceiling for a few dollars a month of typical use. A common setup: a cheap model as your default, free models for experiments, and one flagship for hard tasks. See our best free models for Hermes ranking for the $0 options.

Do cheap models work with Hermes skills and tools?

Yes. DeepSeek V4, Gemini 3 Flash, GPT-5.4 Mini, Kimi K2.5, and MiniMax-M3 all handle tool calling and skill use reliably for everyday tasks. The gap versus flagships shows up in long multi-step plans, so route those to a stronger model rather than abandoning the cheap default.

More rankings for Hermes

Share:

Details:

Agent
Hermes
Models
7
Filter
Cheap
Updated
June 2026

Best cheap models for Hermes

Cheap

Hermes runs all day, so the model price is your real subscription fee. These are the best cheap models for Hermes right now: everything costs at most about $0.75 per million input tokens, ranked by how much agent capability survives the price cut.

Which model should you run?

As of June 2026, the cheap-model question for Hermes splits on how you pay:

Have a ChatGPT Codex subscription? Run GPT-5.4 Mini on your plan and pay nothing extra per token
Paying per token? Route DeepSeek V4 or Gemini 3 Flash through OpenRouter with one key
Running Hermes always-on or with sub-agents? DeepSeek V4 Flash or GLM 4.7 Flash keep a full day of runs at cents

Whatever you pick, keep one flagship configured for the hard tasks. The cheap picks handle the everyday 90% and slip on long multi-step planning.

The best cheap model for Hermes is DeepSeek V4: near-frontier quality, a 1M token context window, and open weights at $0.435 per million input tokens, all through one OpenRouter key.

For always-on background work and sub-agents, drop to DeepSeek V4 Flash ($0.098) or GLM 4.7 Flash ($0.06). At those prices a full day of routine Hermes runs costs cents, not dollars.

We also have more in-depth Hermes rankings:Best free models for HermesBest models for HermesBest open-source models for Hermes

The ranking

Updated June 2026

Best cheap daily drivers

Strong enough to be your main Hermes model, at a fraction of flagship prices.

#ModelContextInput

1
1
DeepSeek V4DeepSeekThe best cheap default: near-frontier quality, 1M context, $0.435 per million.
Context1.049M
Input$0.435/M
2
2
Gemini 3 FlashGoogle DeepMindThe fastest cheap pick for high-volume runs, $0.50 per million.
Context1.049M
Input$0.5/M
3
3
GPT-5.4 MiniOpenAIThe cheapest GPT that still handles real agent work. Codex subscription or API.
Context400K
Input$0.75/M
4
4
Kimi K2.5Moonshot AIOpen-weight agentic coder at $0.40 per million input tokens.
Context262K
Input$0.375/M
5
5
MiniMax-M3MiniMaxNear-frontier intelligence at $0.30 per million input tokens.
Context205K
Input$0.3/M

Best for always-on and sub-agents

When Hermes runs continuously or fans out to sub-agents, these prices make it painless.

#ModelContextInput

1
1
DeepSeek V4 FlashDeepSeekA 1M context window at $0.098 per million, for always-on background work.
Context1.049M
Input$0.09/M
2
2
GLM 4.7 FlashZ.AIThe floor: $0.06 per million for sub-agents and routine steps.
Context203K
Input$0.06/M

Intelligence vs. price

Each model's Artificial Analysis Intelligence Index score against its blended price per 1M tokens. Toward the top right is more intelligence per dollar.

Full interactive leaderboard on our Intelligence Index page.

Frequently asked questions

What is the cheapest good model for Hermes?

How much does it cost to run Hermes on a cheap model?

Should I use a cheap model or a free model for Hermes?

Do cheap models work with Hermes skills and tools?

More rankings for Hermes

Best free models for Hermes

Best models for Hermes

Best open-source models for Hermes

Share:

Details:

Agent
Hermes
Models
7
Filter
Cheap
Updated
June 2026