Factory CLI supports custom model configurations through BYOK (Bring Your Own Key). Use your own OpenAI or Anthropic keys, connect to any open-source model provider, or run models locally on your own hardware. Once configured, switch between models with the /model command.
Your API keys remain local and are not uploaded to Factory servers. Custom models are only available in the CLI and won’t appear in Factory’s web or mobile platforms.
[Image: model selector showing custom models]

Install the CLI with the 5-minute quickstart →

Configuration Reference

Add custom models in ~/.factory/config.json under the custom_models array.

Supported Fields

| Field | Required | Description |
|---|---|---|
| model_display_name | Yes | Human-friendly name shown in the model selector |
| model | Yes | Model identifier sent via the API (e.g., claude-sonnet-4-5-20250929, gpt-5-codex, qwen3:4b) |
| base_url | Yes | API endpoint base URL |
| api_key | Yes | Your API key for the provider; can't be empty |
| provider | Yes | One of: anthropic, openai, or generic-chat-completion-api |
| max_tokens | Yes | Maximum output tokens for model responses |
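
Putting the fields together, here is a hedged sketch of a single-model ~/.factory/config.json. The display name, key, and max_tokens value are illustrative placeholders; the base URL shown is Anthropic's official API endpoint.

```json
{
  "custom_models": [
    {
      "model_display_name": "Claude Sonnet 4.5 (BYOK)",
      "model": "claude-sonnet-4-5-20250929",
      "base_url": "https://api.anthropic.com",
      "api_key": "YOUR_ANTHROPIC_KEY",
      "provider": "anthropic",
      "max_tokens": 8192
    }
  ]
}
```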

Understanding Providers

Factory supports three provider types that determine API compatibility:
| Provider | API Format | Use For | Documentation |
|---|---|---|---|
| anthropic | Anthropic Messages API (v1/messages) | Anthropic models on their official API or compatible proxies | Anthropic Messages API |
| openai | OpenAI Responses API | OpenAI models on their official API or compatible proxies; required for the newest models such as GPT-5 and GPT-5-Codex | OpenAI Responses API |
| generic-chat-completion-api | OpenAI Chat Completions API | OpenRouter, Fireworks, Together AI, Ollama, vLLM, and most open-source providers | OpenAI Chat Completions API |
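
To make the provider distinction concrete, the sketch below shows one entry for each of the other two provider types. The base URLs are the providers' commonly documented endpoints and the model identifiers are examples only; confirm both against your provider's current documentation.

```json
{
  "custom_models": [
    {
      "model_display_name": "GPT-5-Codex (BYOK)",
      "model": "gpt-5-codex",
      "base_url": "https://api.openai.com/v1",
      "api_key": "YOUR_OPENAI_KEY",
      "provider": "openai",
      "max_tokens": 16384
    },
    {
      "model_display_name": "Kimi K2 (OpenRouter)",
      "model": "moonshotai/kimi-k2",
      "base_url": "https://openrouter.ai/api/v1",
      "api_key": "YOUR_OPENROUTER_KEY",
      "provider": "generic-chat-completion-api",
      "max_tokens": 8192
    }
  ]
}
```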
Factory is actively verifying Droid’s performance on popular models, but we cannot guarantee that all custom models will work out of the box. Only Anthropic and OpenAI models accessed via their official APIs are fully tested and benchmarked.
Model Size Consideration: Models below 30 billion parameters have shown significantly lower performance on agentic coding tasks. While these smaller models can be useful for experimentation and learning, they are generally not recommended for production coding work or complex software engineering tasks.

Prompt Caching

Factory CLI automatically uses prompt caching when available to reduce API costs:
  • Official providers (anthropic, openai): Factory attempts to use prompt caching via the official APIs. Caching behavior follows each provider’s implementation and requirements.
  • Generic providers (generic-chat-completion-api): Prompt caching support varies by provider and cannot be guaranteed; check whether your provider implements it.

Verifying Prompt Caching

To check if prompt caching is working correctly with your custom model:
  1. Run a conversation with your custom model
  2. Use the /cost command in Droid CLI to view cost breakdowns
  3. Look for cache hit rates and savings in the output
If you’re not seeing expected caching savings, consult your provider’s documentation about their prompt caching support and requirements.

Quick Start

Choose a provider from the left navigation to see specific configuration examples:
  • Fireworks AI - High-performance inference for open-source models
  • Baseten - Deploy and serve custom models
  • DeepInfra - Cost-effective inference for open-source models
  • Hugging Face - Connect to models on HF Inference API
  • Ollama - Run models locally or in the cloud
  • OpenRouter - Access multiple providers through a single interface
  • OpenAI & Anthropic - Use your own API keys for official models
  • Google Gemini - Access Google’s Gemini models

Using Custom Models

Once configured, access your custom models in the CLI:
  1. Use the /model command
  2. Your custom models appear in a separate “Custom models” section below Factory-provided models
  3. Select any model to start using it
Custom models display with the name you set in model_display_name, making it easy to identify different providers and configurations.

Troubleshooting

Model not appearing in selector

  • Check JSON syntax in ~/.factory/config.json (a valid skeleton is shown after this list)
  • Restart the CLI after making configuration changes
  • Verify all required fields are present
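Note that JSON allows neither comments nor trailing commas, and custom_models must be an array even when it holds a single entry. A minimal well-formed skeleton for comparison (all values are placeholders):

```json
{
  "custom_models": [
    {
      "model_display_name": "My Model",
      "model": "MODEL_ID",
      "base_url": "https://example.com/v1",
      "api_key": "YOUR_KEY",
      "provider": "generic-chat-completion-api",
      "max_tokens": 8192
    }
  ]
}
```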

“Invalid provider” error

  • Provider must be exactly anthropic, openai, or generic-chat-completion-api
  • Check for typos and ensure proper capitalization

Authentication errors

  • Verify your API key is valid and has available credits
  • Check that the API key has proper permissions
  • Confirm the base URL matches your provider’s documentation

Local model won’t connect

  • Ensure your local server is running (e.g., ollama serve)
  • Verify the base URL is correct and includes the /v1 suffix if your provider requires it (see the example after this list)
  • Check that the model is pulled/available locally
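For example, Ollama serves an OpenAI-compatible API at http://localhost:11434/v1 by default, so a local entry might look like the sketch below. Ollama ignores the key, but because the api_key field can't be empty, any placeholder string works; the model tag assumes you have already run ollama pull qwen3:4b.

```json
{
  "custom_models": [
    {
      "model_display_name": "Qwen3 4B (local Ollama)",
      "model": "qwen3:4b",
      "base_url": "http://localhost:11434/v1",
      "api_key": "ollama",
      "provider": "generic-chat-completion-api",
      "max_tokens": 8192
    }
  ]
}
```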

Rate limiting or quota errors

  • Check your provider’s rate limits and usage quotas
  • Monitor your usage through your provider’s dashboard

Billing

  • You pay your provider directly with no Factory markup or usage fees
  • Track costs and usage in your provider’s dashboard