Boxes.dev - Cloud Dev Environments for AI Coding Agents
Boxes.dev runs every Claude Code and Codex thread on its own cloud computer, with desktop and mobile clients. No worktrees, no localhost limits, no cracked-open laptops.
TL;DR
Boxes.dev gives every Claude Code and Codex thread its own cloud computer with an isolated filesystem, compute, and dev environment. It ships desktop and mobile apps, with no worktrees and no localhost limits.
Source and Accuracy Notes
- Official site: boxes.dev
- Launch thread: Show HN: Boxes.dev (40 pts, 4 Jun 2026)
- Founders: Nick and Drew (previously built Gem — co-founder/CTO and first hire)
- Status at time of writing: actively in beta, public waitlist
What Is Boxes.dev?
Boxes.dev is a cloud-only agentic dev environment (ADE) built for the post-localhost era of AI-assisted coding. The pitch is direct: every Codex and Claude Code chat runs on its own computer in the cloud, with a full snapshot of your dev environment, isolated from every other agent thread. The desktop and mobile apps are thin clients; the work happens on remote compute that you can leave running while you close your laptop.
It is positioned against the current state of the art in two ways:
- vs. Conductor / Codex desktop app — those run on your machine; Boxes.dev runs everything in the cloud.
- vs. coding on localhost with worktrees — the founders argue (convincingly, given how much of 2025 was spent on “swap to a worktree, run a build, swap back”) that the localhost workflow is now actively holding agentic coding back.
The founders built this because they spent the last year coding almost exclusively with Claude Code and Codex, and kept hitting the same wall: parallel agents competing for the same local resources, the same ports, the same filesystem. Git worktrees solved the file collision problem but not the resource collision problem, and certainly not the “I closed my laptop and my agent died” problem.
How It Works
Step 1: Onboard the repo
You point boxes.dev at a local repo. The onboarding agent scans your dev setup — language, dependencies, build commands, environment variables, services — and ports the entire environment to a cloud snapshot. You don’t have to write a Dockerfile, define a devcontainer, or hand-hold the agent through setting up Postgres. It reverse-engineers your current setup and packages it.
# From the desktop app — the boxes.dev CLI mirrors the UX
$ boxes init
# Scans local repo, builds cloud snapshot, links to your account
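To make the idea of "scanning a dev setup" concrete, here is a minimal sketch of marker-file detection. It is not the Boxes.dev onboarding agent, whose internals are not described in the launch post. The `MARKERS` table and the `scan_repo` function are hypothetical names chosen for illustration, and a real onboarding pass would need to go much further (lockfiles, running services, secrets).

```python
from pathlib import Path

# Hypothetical marker files. Boxes.dev has not published how its
# onboarding agent detects a stack; this table is an assumption.
MARKERS = {
    "package.json": "Node.js dependencies",
    "pyproject.toml": "Python project",
    "requirements.txt": "Python dependencies",
    "Cargo.toml": "Rust project",
    "go.mod": "Go module",
    "docker-compose.yml": "local services (for example Postgres or Redis)",
    ".env.example": "environment variables",
    "Makefile": "build commands",
}


def scan_repo(root):
    """Return {filename: meaning} for each marker file at the top level of root."""
    root = Path(root)
    return {
        name: meaning
        for name, meaning in MARKERS.items()
        if (root / name).is_file()
    }


if __name__ == "__main__":
    # Report what this simple heuristic can see in the current directory.
    for name, meaning in scan_repo(".").items():
        print(f"{name}: {meaning}")
```

Running it inside a repo prints one line per recognized file, which is roughly the inventory an onboarding step would have to build before it can package anything.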
Step 2: Start a thread
Every Codex or Claude Code thread starts from a snapshot of the full setup, with its own filesystem and compute. The thread can run the full app end-to-end — start a server, hit endpoints, run migrations, click through a browser — without colliding with any other thread.
Pinned threads
├─ Off box conversation backups Fork 1 +607 13d
├─ Support multi repo PR Fork 1 +3k 11d
├─ Thread list pagination Fork +8 8d
├─ Multiple threads per fork Fork +5 6d
├─ Backend owned thread ki... Fork 2d
└─ Audit largest code files Fork +2.1k -16 2d
Today
├─ Claude subagent rende... Fork +1 15m
├─ Desktop chat white fl... Fork +389 17m
└─ Render Write tool file p... Fork +1 29m
Each Fork N is an isolated cloud environment. The first number in the badge is parallel agent count, the second is the line diff vs. the parent thread.
Step 3: Run on mobile
The mobile app is fully featured, not a remote control. You can review diffs, comment on threads, fork new agent runs, and watch CI from your phone. The “mobile is an afterthought” complaint that haunts most dev tools is explicitly addressed in the launch post: coding really is “just texting” now, and boxes.dev treats it that way.
Why This Matters
The localhost bottleneck is real
Most teams running parallel Claude Code sessions today hit some combination of these failures:
- Port collisions between two threads both trying to bind :3000
- A long-running build in thread A starving thread B
- Closing the laptop and discovering the macOS sleep killed the agent
- Running git worktree add for the 15th time this week
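The first failure in the list above, two threads binding the same port, can be reproduced on any machine with the standard library. The sketch below is illustrative and uses a hypothetical helper, `try_bind`. It asks the OS for a free port rather than hard-coding 3000, so it does not interfere with anything already running.

```python
import socket


def try_bind(port, host="127.0.0.1"):
    """Bind a listening TCP socket to host:port, or return None if the port is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen()
        return sock
    except OSError:
        # Typically "address already in use": another process owns the port.
        sock.close()
        return None


if __name__ == "__main__":
    first = try_bind(0)  # port 0 asks the OS for any free port
    port = first.getsockname()[1]
    second = try_bind(port)  # a second "thread" that wants the same port
    print("second bind succeeded:", second is not None)
    first.close()
```

A git worktree gives each agent its own checkout, but both checkouts still share the host's port space, which is why the second bind is expected to fail. Separate machines each have their own port space, so the collision cannot occur.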
Cloud isolation isn’t a productivity nicety — it removes the coordination tax that currently caps how many parallel agents you can usefully run. The launch post explicitly says they were “hitting resource constraints when multiple parallel agents test their own work by running the full app locally.” That’s the bottleneck being removed.
Mirrored UX, not a new one
The team deliberately mirrors the Claude Code and Codex UX patterns that power users already know. You’re not learning a new product; you’re learning that “local” now means “a cloud VM you connect to.” Threaded conversations, fork-from-message, slash commands, file mentions — all the muscle memory transfers. This is a much smaller adoption cliff than something like Conductor (which has its own chat UX) or the Codex desktop app (which has a different sidebar pattern).
Mobile is a first-class client, not a viewer
Most dev tools ship a mobile app that is, in practice, a notification mirror plus a glorified remote desktop. boxes.dev’s mobile app is built around the actual workflow: read the agent’s last message, fork the thread, kick off a new review, reply in chat, share a session link in Slack. The launch post says “no handoffs or remote control” — and from the screenshots, it delivers.
Practical Evaluation Checklist
When deciding whether boxes.dev is worth the swap from your current setup:
- [ ] How many parallel Claude Code / Codex threads do you actually run per day? If the answer is one, the value prop is weaker. If it’s three or more, the value is immediate.
- [ ] Are you paying for your own Claude / Codex subscription? boxes.dev uses your existing subscription rather than charging per-token on top.
- [ ] Do you have repo-local services (Postgres, Redis, a sidecar API) that take meaningful setup time? Onboarding reverse-engineers these; the bigger the setup, the more time saved.
- [ ] Do you need to read or steer an agent from your phone? If yes, boxes.dev is one of the few tools that takes this seriously.
- [ ] Do you have CI / scheduled automations to wire up? Slack integration and scheduled automations are in beta.
- [ ] Are you okay with the cloud-only model? There is no local fallback. If your network drops, the threads keep running but you can’t steer them until you’re back online.
Security Notes
Because the work happens on remote compute, a few things shift:
- Source code is in the cloud. boxes.dev’s threat model and SOC 2 posture (or lack thereof) need to match whatever your company requires for code repositories. Read their security page before connecting a private repo.
- Secrets live in the snapshot. Environment variables, API keys, database credentials — all of it lives in the cloud environment. Use a secrets manager; don’t paste production creds into a thread.
- Mobile access is a new surface. If you connect from a personal phone, anyone who unlocks that phone can read and fork your threads. Set a strong device passcode.
- Slack integration posts to channels. Make sure the agent’s output is sanitized before it lands in a shared channel — by default, agents can be very chatty.
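One way to act on the secrets and Slack points above is to pass agent output through a redaction step before it leaves the environment. The sketch below is an assumption-heavy illustration, not a Boxes.dev feature. The `redact` function and both regular expressions are hypothetical, and pattern matching like this misses many credential formats, so treat it as a backstop rather than a replacement for a secrets manager.

```python
import re

# Illustrative patterns only. A real deployment needs a vetted secret scanner.
# Matches assignments such as DB_PASSWORD=... or api_key: ...
ENV_ASSIGNMENT = re.compile(
    r"(?i)([A-Z0-9_]*(?:api[_-]?key|token|secret|password)[A-Z0-9_]*)(\s*[=:]\s*)(\S+)"
)
# Matches the password part of URLs such as postgres://user:password@host
URL_PASSWORD = re.compile(r"(\b[a-z][a-z0-9+.-]*://[^:\s/@]+:)([^@\s]+)(@)")


def redact(text):
    """Replace values that look like credentials with a placeholder."""
    text = ENV_ASSIGNMENT.sub(r"\1\2[REDACTED]", text)
    text = URL_PASSWORD.sub(r"\1[REDACTED]\3", text)
    return text


if __name__ == "__main__":
    sample = "DB_PASSWORD=hunter2 connecting to postgres://app:hunter2@db:5432/main"
    print(redact(sample))
```

The variable name and the URL's user and host are kept so the message stays readable in a shared channel, and only the credential value is replaced.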
FAQ
Q: Does boxes.dev replace my local dev environment entirely?
A: For coding, yes. You’ll still need a local browser, terminal for ad-hoc work, and whatever you use to read the boxes.dev output. But the actual Claude Code / Codex runs are no longer on your machine.
Q: Can I use my existing Claude / Codex subscription?
A: Yes. The launch post explicitly says “using your subscription.” You’re not paying boxes.dev for tokens on top.
Q: How is this different from Conductor or the Codex desktop app?
A: Conductor and the Codex desktop app run agents on your machine. boxes.dev runs them on remote compute, with full isolation per thread. The mobile app is a first-class client, not a viewer.
Q: What happens if I close my laptop mid-thread?
A: The thread keeps running on the cloud VM. You can reopen the app on any device (phone, another laptop, a borrowed machine) and pick up where you left off.
Q: Does it work with self-hosted LLMs?
A: Not at launch. boxes.dev is built around Anthropic and OpenAI’s hosted coding agents. If you need self-hosted inference, this isn’t the right tool today.
Q: How much does it cost?
A: Pricing wasn’t disclosed in the launch post. There’s a waitlist for the beta: sign up and they’ll share details.
Q: Can I run multiple repos in the same workspace?
A: The thread list shown above includes a “Support multi repo PR” thread with a +3k line diff, which suggests multi-repo is in active development but not necessarily the default.
Conclusion
Localhost made sense when one developer was running one build at a time. With three, five, or ten parallel Claude Code / Codex threads, it’s a coordination problem disguised as a productivity tool. Boxes.dev’s bet is that the next generation of dev environments won’t run on your laptop at all — they’ll run on remote compute, mirrored to a desktop and mobile client that just shows you the conversation. The launch post is matter-of-fact about this: “based on early feedback from beta testers, we’re increasingly sure that cloud is the future of agentic coding.”
Whether you agree or not, the product is a useful thought experiment: it forces you to ask what, exactly, your local machine is doing for you in 2026 that a cloud VM couldn’t do better. For most agentic-coding workflows, the honest answer is “not much.”