Twill AI – Self-Improving Software Engineer

TL;DR

Twill is a cloud-based AI coding agent that maintains a persistent per-repo dev environment, learns project-specific memories and skills over time, and delivers PRs verified by tests and screenshots.

Source and Accuracy Notes

What Is Twill?

Twill is an AI software engineer that lives in the cloud and gets better at your codebase the more you use it. Unlike one-shot coding agents that start fresh every session, Twill maintains a persistent dev environment per repository — forked for every task so parallel work doesn’t conflict.

The core loop: you open a task, Twill forks your repo environment, makes changes, and writes back learnings (memories, setup fixes, skills) that future agents can reuse. PRs are verified with your actual test suite and screenshots before being marked done.

Setup Workflow

Step 1: Connect Your Repository

Visit twill.ai and sign in with GitHub. Authorize Twill to access your repos.

Step 2: Create a New Task

Describe the task in plain English. Twill forks the repo environment and starts working.

Step 3: Review the PR

Twill opens a PR with its changes. The agent run log shows what it did, what it learned, and which tests passed.

Deeper Analysis

How the Persistent Environment Works

Each repo gets its own sandboxed dev environment in Twill’s cloud. When you file a task, Twill forks that environment — so if a previous task left half-configured tooling, the new fork starts clean. But the knowledge layer persists across forks: memories, skill configs, and setup fixes written by prior agents are shared.
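The fork-plus-shared-knowledge model described above can be sketched in a few lines of Python. This is an illustration only: `RepoEnvironment`, `KnowledgeLayer`, `fork`, and the example repo name are hypothetical, not Twill's API, and the real system's internals are not documented here.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class KnowledgeLayer:
    """Hypothetical shared store that persists across forks of one repo."""
    entries: list[str] = field(default_factory=list)


@dataclass
class RepoEnvironment:
    """Hypothetical per-repo environment: mutable tooling state plus shared knowledge."""
    repo: str
    installed_tools: dict[str, str] = field(default_factory=dict)
    knowledge: KnowledgeLayer = field(default_factory=KnowledgeLayer)

    def fork(self) -> "RepoEnvironment":
        # Tooling state is deep-copied so each task works on an isolated copy;
        # the knowledge layer is shared by reference so learnings carry over.
        return RepoEnvironment(
            repo=self.repo,
            installed_tools=copy.deepcopy(self.installed_tools),
            knowledge=self.knowledge,
        )


base = RepoEnvironment(repo="acme/web", installed_tools={"node": "20"})
task_a = base.fork()
task_b = base.fork()

task_a.installed_tools["pnpm"] = "9"                   # local to task A's fork
task_a.knowledge.entries.append("use pnpm, not npm")   # visible to every fork
```

In this sketch, a tool installed in one fork never appears in another fork or in the base environment, while a note appended to the knowledge layer is readable from all of them.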

Memory and Skill System

Agents write three types of knowledge back:

Memories — project-specific context (naming conventions, architecture patterns)
Setup fixes — corrections to broken CI configs or missing deps
Skills — reusable prompt snippets for recurring task types

Future agents read this knowledge before touching code, so the second (and third, and nth) PR is faster and more accurate than the first.
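One way to picture the three knowledge types is as tagged entries in a store keyed by repository. The sketch below is an assumption made for illustration: `Kind`, `Entry`, `KnowledgeStore`, and the sample entries are hypothetical names and data, not Twill's actual data model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    MEMORY = "memory"        # project-specific context
    SETUP_FIX = "setup_fix"  # corrections to CI configs or missing deps
    SKILL = "skill"          # reusable prompt snippets


@dataclass(frozen=True)
class Entry:
    kind: Kind
    text: str


class KnowledgeStore:
    """Hypothetical store of agent-written knowledge, keyed by repo."""

    def __init__(self) -> None:
        self._by_repo: dict[str, list[Entry]] = {}

    def write(self, repo: str, entry: Entry) -> None:
        self._by_repo.setdefault(repo, []).append(entry)

    def read(self, repo: str, kind: Optional[Kind] = None) -> list[Entry]:
        # Return only the requested repo's entries, optionally filtered by kind.
        entries = self._by_repo.get(repo, [])
        return [e for e in entries if kind is None or e.kind == kind]


store = KnowledgeStore()
store.write("acme/web", Entry(Kind.MEMORY, "components use PascalCase"))
store.write("acme/web", Entry(Kind.SETUP_FIX, "install libvips before npm ci"))
store.write("acme/api", Entry(Kind.MEMORY, "handlers live in src/routes"))

# What a later agent on acme/web would load before touching code.
context = store.read("acme/web")
```

Here `context` holds the two `acme/web` entries and nothing written for `acme/api`.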

Verification: Tests + Screenshots

Twill runs your existing test suite against every PR. For UI projects, it also captures screenshots and diffs them against the base branch — catching visual regressions that unit tests miss.
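To make the idea of a screenshot diff concrete, here is a minimal pixel-comparison sketch. How Twill actually compares screenshots is not described in this post, so treat the function name `changed_fraction`, the `tolerance` parameter, and the in-memory image representation as assumptions chosen for illustration.

```python
Pixel = tuple[int, int, int]
Image = list[list[Pixel]]


def changed_fraction(base: Image, head: Image, tolerance: int = 0) -> float:
    """Return the fraction of pixels where any channel differs by more than tolerance."""
    if len(base) != len(head) or any(len(a) != len(b) for a, b in zip(base, head)):
        raise ValueError("images must have identical dimensions")
    total = sum(len(row) for row in base)
    if total == 0:
        return 0.0
    changed = 0
    for row_a, row_b in zip(base, head):
        for px_a, px_b in zip(row_a, row_b):
            if any(abs(ca - cb) > tolerance for ca, cb in zip(px_a, px_b)):
                changed += 1
    return changed / total


white: Pixel = (255, 255, 255)
red: Pixel = (255, 0, 0)
base_shot = [[white, white], [white, white]]   # screenshot from the base branch
head_shot = [[white, red], [white, white]]     # screenshot from the PR branch

ratio = changed_fraction(base_shot, head_shot)  # 1 of 4 pixels differs: 0.25
```

A real pipeline would load rendered screenshots from image files and would likely set a nonzero tolerance to absorb anti-aliasing noise; the comparison logic stays the same.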

Practical Evaluation Checklist

[ ] Connects to GitHub and lists repos correctly
[ ] Forks environment on new task without conflicts
[ ] Writes memories that subsequent agents read
[ ] Test suite runs and reports pass/fail per PR
[ ] Screenshot diff captures visual changes
[ ] Knowledge layer improves across multiple tasks
[ ] No credential leakage — agents operate in sandboxed envs

Security Notes

Twill operates in isolated cloud environments per repository. Credentials are managed through OAuth, not stored as secrets. Agents run in sandboxed containers and cannot access other repos without explicit authorization. The memory layer is per-repo — a memory written in Repo A is not accessible to Repo B.

FAQ

Q: How does Twill differ from a one-shot coding agent like Cursor or Copilot?
A: Cursor and Copilot typically work session by session, with little context carried over between sessions. Twill maintains a knowledge layer across tasks, so the more you use it, the more it knows about your codebase. It also verifies PRs against your actual test suite before surfacing them.

Q: Does Twill work with private repos?
A: Yes. You authorize Twill via GitHub OAuth and control access per repo. It only sees repos you explicitly grant.

Q: What happens if an agent writes bad memories?
A: Memories are stored but not automatically applied. You can review and edit them before they influence future agent runs. The system is designed to be additive, not destructive.

Conclusion

Twill solves the “fresh agent, cold context” problem by keeping a persistent, learning dev environment per repo. The memory and skill layers make each successive PR faster and more accurate. If you’re managing a codebase with recurring tasks — refactors, test coverage, documentation — Twill’s cloud agents can offload that work while learning your project’s conventions over time.

For small teams or solo devs tired of re-explaining context to every new agent session, Twill is worth evaluating. The YC S25 backing gives it runway to mature.
