Best cheap models for Hermes
Hermes runs all day, so the model price is your real subscription fee. These are the best cheap models for Hermes right now: everything costs at most about $0.75 per million input tokens, ranked by how much agent capability survives the price cut.
As of June 2026, the cheap-model question for Hermes splits on how you pay:
- Have a ChatGPT Codex subscription? Run GPT-5.4 Mini on your plan and pay nothing extra per token.
- Paying per token? Route DeepSeek V4 or Gemini 3 Flash through OpenRouter with one key.
- Running Hermes always-on or with sub-agents? DeepSeek V4 Flash or GLM 4.7 Flash keep a full day of runs down to cents.
Whatever you pick, keep one flagship configured for the hard tasks. The cheap picks handle the everyday 90% and slip on long multi-step planning.
The best cheap model for Hermes is DeepSeek V4: near-frontier quality, a 1M token context window, and open weights at $0.435 per million input tokens, all through one OpenRouter key.
Gemini 3 Flash at $0.50 per million is the fastest cheap pick for high-volume runs, and GPT-5.4 Mini at $0.75 makes sense if you already have a ChatGPT Codex subscription, since it runs on your plan instead of per token.
For always-on background work and sub-agents, drop to DeepSeek V4 Flash ($0.098) or GLM 4.7 Flash ($0.06). At those prices a full day of routine Hermes runs costs cents, not dollars.
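The "cents, not dollars" claim is simple arithmetic: price per million input tokens times the day's volume. The sketch below uses the two prices quoted above; the 10 million token daily volume and the function name `daily_input_cost` are assumptions for illustration, and output tokens and any caching discounts are ignored.

```python
# Input prices in USD per million tokens, as quoted in the text above.
INPUT_PRICE_PER_M = {
    "DeepSeek V4 Flash": 0.098,
    "GLM 4.7 Flash": 0.06,
}


def daily_input_cost(model: str, input_tokens: int) -> float:
    """Return the input-token cost in USD for one day of usage."""
    return INPUT_PRICE_PER_M[model] * input_tokens / 1_000_000


# Hypothetical volume (an assumption, not a measured figure):
# 10 million input tokens in one day.
DAILY_TOKENS = 10_000_000

for model in INPUT_PRICE_PER_M:
    cost = daily_input_cost(model, DAILY_TOKENS)
    print(f"{model}: ${cost:.2f} per day")
```

Under that assumed volume, both models stay below one dollar per day on input tokens. Swap in your own token count to estimate your usage.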
Strong enough to be your main Hermes model, at a fraction of flagship prices.
- 1. DeepSeek V4 (DeepSeek). The best cheap default: near-frontier quality, 1M context, $0.435 per million. Context: 1.049M. Input: $0.435/M.
- 2. Gemini 3 Flash (Google DeepMind). The fastest cheap pick for high-volume runs, $0.50 per million. Context: 1.049M. Input: $0.5/M.
- 3. GPT-5.4 Mini (OpenAI). The cheapest GPT that still handles real agent work. Codex subscription or API. Context: 400K. Input: $0.75/M.
- 4. Kimi K2.5 (Moonshot AI). Open-weight agentic coder at $0.40 per million input tokens. Context: 262K. Input: $0.375/M.
- 5. MiniMax-M3 (MiniMax). Near-frontier intelligence at $0.30 per million input tokens. Context: 205K. Input: $0.3/M.
When Hermes runs continuously or fans out to sub-agents, these prices make it painless.
- 1. DeepSeek V4 Flash (DeepSeek). A 1M context window at $0.098 per million, for always-on background work. Context: 1.049M. Input: $0.098/M.
- 2. GLM 4.7 Flash. Context: 203K. Input: $0.06/M.
Each model's Artificial Analysis Intelligence Index score against its blended price per 1M tokens. Toward the top right is more intelligence per dollar.
What is the cheapest good model for Hermes?
DeepSeek V4 is the best cheap model for Hermes: near-frontier quality with a 1M token context window at $0.435 per million input tokens via OpenRouter. If you want the absolute floor that still works, GLM 4.7 Flash at $0.06 per million handles routine steps and sub-agents for cents per day.
How much does it cost to run Hermes on a cheap model?
A typical active day of Hermes use runs tens of millions of tokens. On DeepSeek V4 at $0.435 per million input tokens, that lands around a dollar or two per day; on DeepSeek V4 Flash or GLM 4.7 Flash it drops to cents. The same day on a flagship like Claude Opus 4.8 or GPT-5.5 can cost ten times more, which is why most Hermes users run a cheap default and escalate selectively.
Should I use a cheap model or a free model for Hermes?
Free models on OpenRouter work for trying Hermes out, but rate limits bite quickly on an agent that runs all day. The cheap picks here remove that ceiling for a few dollars a month of typical use. A common setup: a cheap model as your default, free models for experiments, and one flagship for hard tasks. See our best free models for Hermes ranking for the $0 options.
Do cheap models work with Hermes skills and tools?
Yes. DeepSeek V4, Gemini 3 Flash, GPT-5.4 Mini, Kimi K2.5, and MiniMax-M3 all handle tool calling and skill use reliably for everyday tasks. The gap versus flagships shows up in long multi-step plans, so route those to a stronger model rather than abandoning the cheap default.
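The "cheap default, escalate selectively" pattern can be sketched as a small routing function. This is a minimal illustration, not Hermes configuration: the `Task` type, the `pick_model` function, the step threshold, and the model identifier strings are all hypothetical placeholders.

```python
from dataclasses import dataclass

# Placeholder identifiers (hypothetical, not verified provider model IDs).
CHEAP_DEFAULT = "cheap-default-model"
FLAGSHIP = "flagship-model"

# Hypothetical cutoff: plans at or above this many steps count as "long".
LONG_PLAN_STEPS = 8


@dataclass
class Task:
    description: str
    planned_steps: int  # rough estimate of how many steps the plan needs


def pick_model(task: Task) -> str:
    """Send long multi-step plans to the flagship, everything else to the cheap default."""
    if task.planned_steps >= LONG_PLAN_STEPS:
        return FLAGSHIP
    return CHEAP_DEFAULT


print(pick_model(Task("rename a batch of files", planned_steps=2)))
print(pick_model(Task("plan and run a multi-service migration", planned_steps=12)))
```

Step count is only one possible signal. The point is that escalation is decided per task, so the cheap model stays the default.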
Best free models for Hermes
Best models for Hermes
Best open-source models for Hermes
Agent: Hermes
Models: 7
Filter: Cheap
Updated: June 2026