Model quality evolves quickly, and we tune the CLI defaults as the ecosystem shifts. Use this guide as a snapshot of how the major options compare today, and expect to revisit it as we publish updates. This guide was last updated on Wednesday, September 24th, 2025.

1 · Current stack rank (September 2025)

| Rank | Model | Why we reach for it |
| --- | --- | --- |
| 1 | Claude Opus 4.1 | Highest reliability on complex planning, architecture decisions, and long edits. Slight lead in reasoning and code quality, but also the highest cost. |
| 2 | GPT‑5 Codex | Nearly Opus-level output with noticeably lower latency and ~5× lower cost. Great daily driver for implementation work. |
| 3 | GPT‑5 / Claude Sonnet 4 | Solid generalists. We see similar behavior between them; pick based on preference, latency, or cost. |

We ship model updates regularly. When a new release overtakes the list above, we update this page and the CLI defaults.

2 · Match the model to the job

| Scenario | Recommended model |
| --- | --- |
| Deep planning, architecture reviews, ambiguous product specs | Start with Opus 4.1. Switch down if you only need execution after the plan is locked. |
| Full-feature development, large refactors | GPT‑5 Codex balances quality and speed; Opus is a good fallback if Codex struggles. |
| Repeatable edits, summarization, boilerplate generation | GPT‑5 or Sonnet 4 keep costs low while staying accurate. |
| CI/CD or automation loops | Favor GPT‑5 Codex or Sonnet 4 for predictable throughput. Promote the critical planning steps to Opus 4.1 when correctness matters most (see the sketch after this table). |
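For automation loops, the routing decision itself is just a small policy. Here is a minimal sketch of that idea in Python; the task categories and model names are illustrative placeholders, not Factory CLI identifiers or flags.

```python
# Hypothetical routing policy for an automation loop. Task kinds and model
# names below are illustrative placeholders, not Factory CLI identifiers.
PLANNING_TASKS = {"architecture_review", "spec_drafting"}

def pick_model(task_kind: str) -> str:
    """Promote planning-heavy steps to Opus; run everything else on a cheaper model."""
    if task_kind in PLANNING_TASKS:
        return "claude-opus-4.1"  # correctness matters most here
    return "gpt-5-codex"          # predictable throughput for execution steps

assert pick_model("architecture_review") == "claude-opus-4.1"
assert pick_model("lint_fixes") == "gpt-5-codex"
```
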
Tip: you can swap models mid-session with /model or by toggling in the settings panel (Shift+Tab → Settings).

3 · Switching models mid-session

  • Use /model (or Shift+Tab → Settings → Model) to swap without losing your chat history.
  • If you change providers (e.g., Anthropic to OpenAI), the CLI converts the session transcript between the two formats. The translation is lossy (provider-specific metadata is dropped), but we have not seen accuracy regressions in practice; the sketch after this list shows roughly what gets dropped.
  • For the best context continuity, switch models at natural milestones: after a commit, once a PR lands, or when you abandon a failed approach and reset the plan.
  • If you flip back and forth rapidly, expect the assistant to spend a turn re-grounding itself; consider summarizing recent progress when you switch.
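The conversion itself happens inside the CLI, but as a rough mental model, the sketch below shows the kind of lossy flattening involved, assuming Anthropic's content-block message format and OpenAI's flat chat format. The helper and the specific fields dropped are illustrative, not the CLI's actual implementation.

```python
def anthropic_to_openai(messages: list[dict]) -> list[dict]:
    """Flatten Anthropic content blocks into OpenAI-style chat messages.

    Illustrative only: real transcripts also carry tool calls, images, and
    provider metadata (e.g. cache_control), which a pass like this drops.
    """
    converted = []
    for msg in messages:
        blocks = msg["content"]
        if isinstance(blocks, str):  # Anthropic also accepts plain-string content
            text = blocks
        else:
            # Keep only text blocks; tool_use/thinking blocks are lost here.
            text = "\n".join(b["text"] for b in blocks if b.get("type") == "text")
        converted.append({"role": msg["role"], "content": text})
    return converted
```
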

4 · Reasoning effort settings

  • Anthropic models (Opus/Sonnet) show modest gains between Low and High.
  • GPT models respond much more to higher reasoning effort—bumping GPT‑5 or GPT‑5 Codex to High can materially improve planning and debugging.
  • Reasoning effort increases latency and cost, so start Low for simple work and escalate when you need more depth.
Change reasoning effort from /model → Reasoning effort, or via the settings menu.
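
The CLI manages these knobs for you; for intuition, here is a sketch of how the underlying provider APIs expose them, using the public Anthropic and OpenAI Python SDKs. The exact parameters and model IDs the CLI sends are internal, so treat the values below as illustrative.

```python
from anthropic import Anthropic
from openai import OpenAI

anthropic_client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
openai_client = OpenAI()        # reads OPENAI_API_KEY from the environment

# Anthropic models take a thinking-token budget; a larger budget ≈ higher effort.
anthropic_resp = anthropic_client.messages.create(
    model="claude-opus-4-1",
    max_tokens=8192,  # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 4096},
    messages=[{"role": "user", "content": "Plan the refactor."}],
)

# OpenAI reasoning models take a discrete effort level: "low" | "medium" | "high".
openai_resp = openai_client.responses.create(
    model="gpt-5",
    reasoning={"effort": "high"},
    input="Plan the refactor.",
)
```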

5 · Bring Your Own Keys (BYOK)

Factory ships with managed Anthropic and OpenAI access. If you prefer to run against your own accounts, BYOK is opt-in—see Bring Your Own Keys for setup steps, supported providers, and billing notes.
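
At the API level, BYOK boils down to requests being authorized with keys you control. Below is a minimal sketch with the public provider SDKs, assuming the standard environment variable names; how the CLI actually discovers your keys is covered in the BYOK doc.

```python
import os
from anthropic import Anthropic
from openai import OpenAI

# Both SDKs read these environment variables by default, so exporting your own
# keys is typically all that changes: requests are then billed to your accounts.
anthropic_client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
```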

6 · Keep notes on what works

  • Track high-impact workflows (e.g., spec generation vs. quick edits) and which combinations of model + reasoning effort feel best.
  • Ping the community or your Factory contact when you notice a model regression so we can benchmark and update this guidance quickly.