Droid Control lets Droids operate software: launch apps, type commands, click buttons, record what happens, and produce polished video evidence of it. Built by Droids, for Droids.

What you get

Verify claims

Test whether a behavior claim is true and produce evidence either way. No staging, no advocacy — just investigation.

Run QA flows

Drive terminal CLIs, web apps, or Electron apps through end-to-end flows. Report pass/fail with annotated screenshots.

Record demos

Generate polished before/after comparison videos of PRs, complete with title cards, keystroke overlays, and window chrome.

Get started

Run /plugins in a Droid session, go to the Browse tab, find droid-control, and install it.
For video rendering, Remotion dependencies need a one-time install after adding the plugin. Use the plugin list to locate the plugin path, then run npm install in its remotion directory:
droid plugin list --scope user
cd <plugin-path>/remotion && npm install
You also need the runtime tools for your use case (tuistory, agent-browser, ffmpeg, etc.). See Prerequisites for per-use-case install commands.

Commands

Droid Control adds three slash commands. Each handles the full workflow end-to-end: planning, execution, recording, and reporting.
/verify tests a specific behavior claim and reports findings with evidence.
/verify "ESC cancels streaming in bash mode"
Droid launches the app, attempts the claim, and reports what actually happened — with screenshots and text snapshots as evidence.
The Droid is framed as an investigator, not an advocate. If the claim is false, that’s a valid finding. Anti-fabrication rules prevent staging evidence to match expected outcomes.

How /demo works

1. Understands the change

Fetches the PR description, diff, and linked ticket. Identifies what needs to be proven and what could be confused with existing behavior.

2. Plans the interaction

Scripts a sequence of actions that produces visible evidence the feature works. For comparison PRs, both branches run identical interactions so only the behavior differs.

3. Captures both branches

Launches recorded sessions on the baseline and candidate branches in parallel using worker subagents.

4. Composes the video

Renders a polished video via Remotion with title cards, window chrome, keystroke overlays, and effects. Six visual presets range from cinematic to utilitarian.

5. Verifies the output

Checks the final video against the original commitments before delivering.
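
The parallel capture in step 3 can be pictured with ordinary shell job control. This is a minimal sketch, not the plugin's implementation: capture_branch is a hypothetical stand-in that only writes a marker file where a real worker would launch and record the app, and the branch names are placeholders.

```shell
# Sketch only: capture_branch is hypothetical and does not record anything.
set -eu

capture_branch() {
  branch="$1"
  dir="$2"
  # A real capture would check out "$branch", launch the app, and record it.
  mkdir -p "$dir"
  printf 'recorded %s\n' "$branch" > "$dir/$branch.log"
}

run_captures() {
  dir="$1"
  # Start both captures as background jobs so they run side by side.
  capture_branch "$2" "$dir" &
  pid_baseline=$!
  capture_branch "$3" "$dir" &
  pid_candidate=$!
  # Wait on each PID separately so a failure in either capture is surfaced.
  wait "$pid_baseline"
  wait "$pid_candidate"
}

out_dir="$(mktemp -d)"
run_captures "$out_dir" baseline candidate
ls "$out_dir"
```

Waiting on each job by PID, rather than a bare wait, keeps the exit status of each capture visible, which matters if the compose step should only run when both recordings exist.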

Example output

Every video below was planned, recorded, and rendered entirely by a Droid.

Automation drivers

Droid Control supports three automation backends. The right one is selected automatically based on what you’re targeting.

tuistory

Virtual PTY automation. Default for terminal work. Playwright-style CLI with asciinema recording and forced truecolor output.

true-input

Real terminal emulator. Headless Wayland compositor (Linux), KVM/QEMU (Windows), or QEMU monitor (macOS). For when you need real rendering evidence.

agent-browser

Web and Electron apps. Playwright-backed CLI with Chrome DevTools Protocol support. Navigates pages, fills forms, clicks buttons, captures screenshots.
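
As a rough illustration of how a driver might be chosen from the target, the sketch below maps a target type to one of the three driver names described above. The pick_driver function, its arguments, and the target labels are assumptions made for this example; the plugin's actual selection logic is not shown in this page.

```shell
# Sketch only: pick_driver and its target labels are hypothetical.
pick_driver() {
  target="$1"
  real_rendering="${2:-no}"   # "yes" when real rendering evidence is needed
  case "$target" in
    web|electron)
      echo "agent-browser"
      ;;
    terminal)
      if [ "$real_rendering" = "yes" ]; then
        echo "true-input"
      else
        echo "tuistory"        # default for terminal work
      fi
      ;;
    *)
      echo "unknown target: $target" >&2
      return 1
      ;;
  esac
}

pick_driver terminal
pick_driver terminal yes
pick_driver electron
```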

Video rendering

Demo and showcase videos are rendered with Remotion, a React-based video engine. The plugin includes 22 visual components and 6 presets.
Preset | Look | Best for
factory | Warm black, traffic lights, amber glow | Official Factory content
factory-hero | Same + gradient background | Landing pages, social
hero | Cool gradient, generous margins | Non-Factory marketing
macos | Dark, clean frame | General-purpose demos
presentation | Black, generous margins | Slide decks, talks
minimal | No window bar, tight margins | Docs embeds, inline clips
  • Warm radial backgrounds, floating particles, film grain overlay, color grading
  • Motion blur title-to-content transition
  • Animated window chrome with traffic lights and glassmorphic borders
  • Auto-scaled title/subtitle text
  • Spotlight overlays to highlight specific regions
  • Directed zoom for small text or details
  • Keystroke pills showing user actions
  • Section headers and transition sweeps

Architecture

The plugin uses a composition architecture with three layers:
  • Orchestrator — Routes each request through three independent lookups (target, stage, artifact) to determine which skills to load.
  • 10 atom skills — Self-contained background knowledge loaded on demand, split into drivers, targets, stages, and polish.
  • 3 commands — Parse arguments into commitments, then delegate to atoms via hybrid handoffs.
Every workflow runs through capture → compose → verify. Commands declare what to produce; atoms own how. Skills chain through explicit handoffs rather than hardcoded pipelines, so the Droid follows the flow naturally.
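
The idea of three independent lookups can be sketched as three small functions whose results are simply combined. Everything named here (the lookup functions, the route function, and the skill names they print) is hypothetical and chosen for illustration; the real orchestrator and its atom names may look quite different.

```shell
# Sketch only: function names and skill names are hypothetical.
lookup_target() {
  case "$1" in
    terminal|web|electron) echo "target-$1" ;;
    *) return 1 ;;
  esac
}

lookup_stage() {
  case "$1" in
    capture|compose|verify) echo "stage-$1" ;;
    *) return 1 ;;
  esac
}

lookup_artifact() {
  case "$1" in
    video|screenshots|report) echo "artifact-$1" ;;
    *) return 1 ;;
  esac
}

# Each lookup depends only on its own argument; route just joins the results.
route() {
  t="$(lookup_target "$1")" || return 1
  s="$(lookup_stage "$2")" || return 1
  a="$(lookup_artifact "$3")" || return 1
  echo "$t $s $a"
}

route terminal capture video
```

Because no lookup reads another's result, adding a new target or artifact in this sketch means touching one function only, which is one plausible motivation for keeping the lookups independent.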

Architecture deep dive

Design rationale: UX for droids, waterfall routing, task delegation, and hybrid handoffs.

Prerequisites

Only install what you need for your use case.
npm install -g tuistory                                # virtual PTY driver
pip install asciinema                                   # terminal recording
cargo install --git https://github.com/asciinema/agg   # .cast → .gif converter
sudo apt-get install -y ffmpeg                          # video processing
agent-browser install   # downloads Chromium
Platform | Required tools
Linux/Wayland | cage, wtype, a Wayland-compatible terminal
Windows (KVM) | libvirt, qemu, KVM VM with SSH
macOS (QEMU) | qemu, socat, macOS VM with SSH
Requires Node.js >= 18, Chrome/Chromium, ffmpeg, ffprobe, and agg.
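
To see which of these tools are already on your PATH before installing anything, a small check like the following may help. It is a sketch rather than part of the plugin: missing_tools is a hypothetical helper, and it only tests for presence, not for versions such as Node.js >= 18.

```shell
# Sketch only: missing_tools is a hypothetical helper, not a plugin command.
missing_tools() {
  for tool in "$@"; do
    # command -v succeeds when the tool resolves on PATH.
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing="$(missing_tools node ffmpeg ffprobe agg)"
if [ -n "$missing" ]; then
  echo "Missing tools:"
  echo "$missing"
else
  echo "All listed tools found."
fi
```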

See also

Source code

Full plugin source: skills, commands, scripts, and Remotion components.

Plugins

Learn how plugins work, how to install them, and how to build your own.

Automated QA skill

Deeper QA automation with CI integration, failure learning, and structured reports.

README

Quick start, command reference, and prerequisites.