Connect to thousands of models hosted on Hugging Face’s Inference API and Inference Endpoints.
Model Performance: Models below 30 billion parameters have shown significantly lower performance on agentic coding tasks. While Hugging Face hosts many smaller models that are useful for experimentation, they are generally not recommended for production coding work. Prefer models with 30B+ parameters for complex software engineering tasks.

Configuration

Configuration examples for ~/.factory/config.json:
{
  "custom_models": [
    {
      "model_display_name": "GPT OSS 120B [HF Router]",
      "model": "openai/gpt-oss-120b:fireworks-ai",
      "base_url": "https://router.huggingface.co/v1",
      "api_key": "YOUR_HF_TOKEN",
      "provider": "generic-chat-completion-api",
      "max_tokens": 32768
    },
    {
      "model_display_name": "Llama 4 Scout 17B [HF Router]",
      "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct:fireworks-ai",
      "base_url": "https://router.huggingface.co/v1",
      "api_key": "YOUR_HF_TOKEN",
      "provider": "generic-chat-completion-api",
      "max_tokens": 16384
    }
  ]
}
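
To sanity-check a model entry before pointing Factory at it, you can call the router directly. A minimal sketch, assuming the router's OpenAI-compatible chat completions endpoint (consistent with the generic-chat-completion-api provider above) and the openai Python package (pip install openai); replace YOUR_HF_TOKEN with your actual token:

# Send a one-off chat completion through the HF router using the same
# base_url, model ID, and token you would put in config.json.
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key="YOUR_HF_TOKEN",  # Hugging Face access token
)

response = client.chat.completions.create(
    model="openai/gpt-oss-120b:fireworks-ai",  # model ID from the config above
    messages=[{"role": "user", "content": "Reply with one word: ready"}],
    max_tokens=16,
)
print(response.choices[0].message.content)

If this prints a reply, the token, base URL, and model ID are all usable as configured.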

Getting Started

  1. Sign up at huggingface.co
  2. Get your token from huggingface.co/settings/tokens (see the verification sketch after this list)
  3. Browse models at huggingface.co/models
  4. Add desired models to your configuration
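
A quick way to confirm the token from step 2 is valid is the whoami call in the huggingface_hub client library. A minimal sketch, assuming huggingface_hub is installed (pip install huggingface_hub):

from huggingface_hub import HfApi

# Prints the username the token resolves to; raises an error if the token is invalid.
api = HfApi(token="YOUR_HF_TOKEN")
print(api.whoami()["name"])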

Notes

  • Model names must match the exact Hugging Face repository ID (see the sketch after this list)
  • Some models require accepting a license agreement on the Hugging Face website first
  • Large models may not be available on the free tier
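
Because a typo in the repository ID only surfaces as a runtime error, it can help to check the ID against the Hub before editing your config. A sketch using huggingface_hub; note that the :fireworks-ai suffix in the config examples above selects an inference provider on the router and is not part of the repository ID:

# Verify a repo ID exists (and whether it is gated) before adding it to config.json.
from huggingface_hub import model_info
from huggingface_hub.utils import GatedRepoError, RepositoryNotFoundError

repo_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"  # ID without the :provider suffix
try:
    info = model_info(repo_id, token="YOUR_HF_TOKEN")
    print(f"Found: {info.id}")
except GatedRepoError:
    # GatedRepoError subclasses RepositoryNotFoundError, so catch it first.
    print("Repo is gated: accept the license on the model page first.")
except RepositoryNotFoundError:
    print("Repo ID not found: check spelling and capitalization.")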