LLM Safety & Agent Controls
Factory treats LLMs as powerful but untrusted components. Droid’s safety model combines deterministic controls, programmable hooks, and sandboxed runtimes so that even frontier models can operate safely in high‑security environments.Two layers of safety
We differentiate between:- Deterministic controls – hard boundaries that do not depend on model behavior.
- LLM steering – prompts and context that guide models toward better behavior without providing guarantees.
Deterministic controls
- Command risk classification
- Global and project‑level allow/deny lists
- File and repo protection
- Droid Shield secret scanning
- Hooks for enforcement and logging
- Sandboxed VMs and containers
LLM steering
- Project and org‑level rules and instructions
- Standardized commands and workflows
- Context selection and enrichment via MCP
- Model choice and reasoning effort defaults
Command risk classification
Every shell command Droid proposes is classified into a risk level based on patterns and context (for example, file deletion, network access, package installation, database access). Typical levels include:- Low risk – read‑only commands and local diagnostics (for example,
ls,cat,git status). - Medium risk – commands that modify code or install dependencies (for example,
npm install,go test ./...). - High risk – commands that delete data, change system configuration, or interact with sensitive resources (for example,
rm -rf,kubectl delete,psqlagainst production).
- Always allow low‑risk commands.
- Require user approval for medium‑risk commands.
- Block high‑risk commands entirely or only allow them in specific environments (for example, isolated devcontainers).
Command allowlists and denylists
Admins can define global and project‑specific allow/deny lists in hierarchical settings:- Deny lists – patterns for commands that are never permitted (for example,
rm -rf,sudo *,curl *://sensitive-endpoint). - Allow lists – specific commands or patterns that can run without additional approval in certain environments.
- Org‑level denies and allows cannot be removed by projects or users.
- Projects can add additional allows or denies within their repos.
- Users cannot weaken the policy; they can only choose stricter personal defaults.
Droid Shield: secret scanning and DLP
Droid Shield adds a dedicated layer of protection for secrets and sensitive content. When enabled, Droid Shield can:- Scan files before they are read or edited.
- Scan prompts before they are sent to models.
- Scan commands before they are executed.
- API tokens and access keys.
- Passwords and database connection strings.
- Private keys and certificates.
- Organization‑specific identifiers (configurable patterns).
- Block the operation entirely.
- Redact the sensitive portions before continuing.
- Emit OTEL events and logs for security review.
- Call out to a customer DLP service via hooks for deeper analysis.
Hooks for enforcement and logging
Hooks are Droid’s programmable safety and observability interface. They allow you to run your own code at key points in the agent loop. Common hook points include:- Before prompt submission – inspect prompts for secrets, PII, or policy violations.
- Before file reads or writes – block or redact access to sensitive files.
- Before command execution – enforce approval workflows or block dangerous commands.
- Before git operations – prevent unauthorized
git pushor interactions with restricted branches. - After code generation or edits – run linters, security scanners, or internal compliance checks.
Block direct git pushes
Block direct git pushes
Require all code changes to flow through your existing tooling by blocking
git push in Droid sessions and instructing developers to use internal CL/PR tooling instead.Integrate with DLP and CASB systems
Integrate with DLP and CASB systems
Forward prompts and file snippets to your DLP or CASB APIs before they reach LLMs; deny or redact operations that violate policy.
Enforce environment‑specific policies
Enforce environment‑specific policies
Allow certain tools and commands only in CI or sandboxed environments, based on environment tags emitted by Droid.
Sandboxing with containers and VMs
Running Droid in sandboxed devcontainers and VMs lets you safely grant more autonomy where it is useful, without exposing production systems. Recommended patterns:- Use Docker/Podman devcontainers with restricted filesystem mounts and network egress.
- Run Droid with higher autonomy only inside containers or VMs that do not have direct access to production databases or secrets.
- Use separate credentials and environment variables for sandboxed versus production‑adjacent environments.
environment.type=local|ci|sandbox) to build environment‑specific dashboards and alerts.
Steering models toward safe behavior
While deterministic controls are the foundation, LLM steering improves quality and reduces how often dangerous actions are even proposed. Org and project settings can define:- Rules and instructions – security guidelines, coding standards, and organization‑specific instructions that apply to every request.
- Standardized commands and workflows – shared commands (for example,
/security-review,/migrate-service) that bake in safe patterns. - Context enrichment – MCP servers that expose documentation, runbooks, and internal APIs so models have accurate information instead of guessing.
