> ## Documentation Index
> Fetch the complete documentation index at: https://docs.factory.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Choosing Your Model

> Balance accuracy, speed, and cost by picking the right model and reasoning level for each CLI task.

Model quality evolves quickly, and we tune the CLI defaults as the ecosystem shifts. Use this guide as a snapshot of how the major options compare today, and expect to revisit it as we publish updates. This guide was last updated on Wednesday, March 11th 2026.

***

## 1 · Current stack rank (March 2026)

| Rank | Model                         | Why we reach for it                                                                                                                              |
| ---- | ----------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| 1    | **Claude Opus 4.7**           | Newest Anthropic flagship with **Max** reasoning; top pick for the hardest work. Promotional 1× multiplier through April 30 (2× afterward).      |
| 2    | **Claude Opus 4.6**           | Previous Anthropic flagship with **Max** reasoning; still excellent depth and safety for complex work.                                           |
| 3    | **Claude Opus 4.6 Fast**      | Opus 4.6 tuned for faster response times; 12× multiplier.                                                                                        |
| 4    | **Claude Opus 4.5**           | Proven quality-and-safety balance; strong default for TUI and exec.                                                                              |
| 5    | **Claude Sonnet 4.6**         | **Max** reasoning at the Sonnet price point (1.2×); strong daily driver for planning and implementation.                                         |
| 6    | **GPT-5.4**                   | Latest OpenAI model with 922K context, 128K output, verbosity support, and **Extra High** reasoning; excellent for large-context tasks.          |
| 7    | **Claude Sonnet 4.5**         | Strong daily driver with balanced cost/quality; great general-purpose choice when you don't need Opus-level depth.                               |
| 8    | **GPT-5.3-Codex**             | Newest OpenAI coding model with **Extra High** reasoning and verbosity support; strong for implementation-heavy tasks.                           |
| 9    | **GPT-5.2-Codex**             | Proven OpenAI coding model with **Extra High** reasoning; solid for implementation-heavy tasks.                                                  |
| 10   | **GPT-5.2**                   | OpenAI model with verbosity support and reasoning up to **Extra High**.                                                                          |
| 11   | **Claude Haiku 4.5**          | Fast, cost-efficient for routine tasks and high-volume automation.                                                                               |
| 12   | **Gemini 3.1 Pro**            | Newer Gemini Pro generation with strong structured outputs and mixed reasoning controls for research-heavy tasks.                                |
| 13   | **Gemini 3 Flash**            | Fast, cheap (0.2× multiplier) with full reasoning support; great for high-volume tasks where speed matters.                                      |
| 14   | **Droid Core (MiniMax M2.7)** | Open-source, 0.12× multiplier with reasoning support (Low/Medium/High) and image support; cheapest model available.                              |
| 15   | **Droid Core (GLM-5.1)**      | Open-source, 0.55× multiplier, newer GLM option for bulk automation and air-gapped environments; no image support.                               |
| 16   | **Droid Core (GLM-5)**        | Open-source, 0.4× multiplier, stable choice for bulk automation and air-gapped environments; no image support.                                   |
| 17   | **Droid Core (Kimi K2.6)**    | Open-source, 0.4× multiplier with image support and optional High reasoning; good for cost-sensitive work when you still want a thinking toggle. |
| 18   | **Droid Core (Kimi K2.5)**    | Open-source, 0.25× multiplier with image support; older Kimi option for cost-sensitive work.                                                     |

<Note>
  We ship model updates regularly. When a new release overtakes the list above, we update this page and the
  CLI defaults.
</Note>

***

## 2 · Match the model to the job

| Scenario                                                         | Recommended model                                                                                                                                                                                                                                                                             |
| ---------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Deep planning, architecture reviews, ambiguous product specs** | Start with **Opus 4.7** for best depth and safety (1× promotional multiplier through April 30), or fall back to **Opus 4.6** / **Opus 4.6 Fast** for faster turnaround. Use **Sonnet 4.6** or **Sonnet 4.5** when you want balanced cost/quality, or **GPT-5.4** for large-context reasoning. |
| **Full-feature development, large refactors**                    | **Opus 4.7** or **Opus 4.6** for depth and safety. **GPT-5.4**, **GPT-5.3-Codex**, or **GPT-5.2-Codex** when you need speed plus **Extra High** reasoning; **Sonnet 4.6** or **Sonnet 4.5** for balanced loops.                                                                               |
| **Repeatable edits, summarization, boilerplate generation**      | **Haiku 4.5** or **Droid Core** (including **MiniMax M2.7** at 0.12×) for speed and cost. **GPT-5.2** when you need higher quality or structured outputs.                                                                                                                                     |
| **CI/CD or automation loops**                                    | Favor **Haiku 4.5** or **Droid Core** for predictable, low-cost throughput. Use **GPT-5.3-Codex** or **GPT-5.4** when automation needs stronger reasoning.                                                                                                                                    |
| **High-volume automation, frequent quick turns**                 | **Haiku 4.5** for speedy feedback. **Droid Core** (especially **MiniMax M2.7** at 0.12× with reasoning) when cost is critical or you need air-gapped deployment.                                                                                                                              |

<Tip>
  **Claude Opus 4.7** is the newest top-tier option for extremely complex architecture decisions or critical
  work where you need maximum reasoning capability—and it runs at a promotional 1× multiplier through April 30
  (2× afterward). **Claude Opus 4.6** remains an excellent alternative, and **Opus 4.6 Fast** is tuned for
  faster responses at a higher cost. Most tasks don't require Opus-level power—start with Sonnet 4.6 or Sonnet
  4.5 and escalate only if needed.
</Tip>

Tip: you can swap models mid-session with `/model` or by toggling in the settings panel (`Shift+Tab` → **Settings**).

***

## 3 · Switching models mid-session

* Use `/model` (or **Shift+Tab → Settings → Model**) to swap without losing your chat history.
* If you change providers (e.g. Anthropic to OpenAI), the CLI converts the session transcript between Anthropic and OpenAI formats. The translation is lossy—provider-specific metadata is dropped—but we have not seen accuracy regressions in practice.
* For the best context continuity, switch models at natural milestones: after a commit, once a PR lands, or when you abandon a failed approach and reset the plan.
* If you flip back and forth rapidly, expect the assistant to spend a turn re-grounding itself; consider summarizing recent progress when you switch.

***

## 4 · Reasoning effort settings

* **Opus 4.7**: Off / Low / Medium / High / **Max** (default: High)
* **Opus 4.6 / Opus 4.6 Fast**: Off / Low / Medium / High / **Max** (default: High)
* **Sonnet 4.6**: Off / Low / Medium / High / **Max** (default: High)
* **Opus 4.5 / Sonnet 4.5 / Haiku 4.5**: Off / Low / Medium / High (default: Off)
* **GPT-5.4**: None / Low / Medium / High / **Extra High** (default: Medium)
* **GPT-5.2**: Off / Low / Medium / High / **Extra High** (default: Low)
* **GPT-5.2-Codex**: None / Low / Medium / High / **Extra High** (default: Medium)
* **GPT-5.3-Codex**: None / Low / Medium / High / **Extra High** (default: Medium)
* **Gemini 3.1 Pro**: Low / Medium / High (default: High)
* **Gemini 3 Flash**: Minimal / Low / Medium / High (default: High)
* **Droid Core (GLM-5)**: None only (default: None; no image support)
* **Droid Core (GLM-5.1)**: None only (default: None; no image support)
* **Droid Core (Kimi K2.6)**: Off / High (default: High)
* **Droid Core (Kimi K2.5)**: None only (default: None)
* **Droid Core (MiniMax M2.7)**: Low / Medium / High (default: High)

Reasoning effort increases latency and cost—start low for simple work and escalate as needed. **Max** is available on Claude Opus 4.7, the Opus 4.6 family (Opus 4.6 and Opus 4.6 Fast), and Sonnet 4.6. **Extra High** is available on GPT-5.4, GPT-5.2, GPT-5.2-Codex, and GPT-5.3-Codex.

<Tip>Change reasoning effort from `/model` → **Reasoning effort**, or via the settings menu.</Tip>

***

## 5 · Bring Your Own Keys (BYOK)

Factory ships with managed Anthropic and OpenAI access. If you prefer to run against your own accounts, BYOK is opt-in—see [Bring Your Own Keys](/cli/byok/overview) for setup steps, supported providers, and billing notes.

### Open-source models

**Droid Core (GLM-5)**, **Droid Core (GLM-5.1)**, **Droid Core (Kimi K2.6)**, **Droid Core (Kimi K2.5)**, and **Droid Core (MiniMax M2.7)** are open-source alternatives available in the CLI. They're useful for:

* **Air-gapped environments** where external API calls aren't allowed
* **Cost-sensitive projects** needing unlimited local inference
* **Privacy requirements** where code cannot leave your infrastructure
* **Experimentation** with open-source model capabilities

**Note:** GLM-5 and GLM-5.1 do not support image attachments. Kimi K2.5, Kimi K2.6, and MiniMax M2.7 do support images. Kimi K2.6 adds an Off/High reasoning toggle, while MiniMax M2.7 (the cheapest model available, with 0.12× multiplier) supports Low/Medium/High reasoning. For image-based workflows, use Claude, GPT, Kimi, or MiniMax M2.7.

To use open-source models, you'll need to configure them via BYOK with a local inference server (like Ollama) or a hosted provider. See [BYOK documentation](/cli/byok/overview) for setup instructions.

***

## 6 · Keep notes on what works

* Track high-impact workflows (e.g., spec generation vs. quick edits) and which combinations of model + reasoning effort feel best.
* Ping the community or your Factory contact when you notice a model regression so we can benchmark and update this guidance quickly.
