Propolis – Browser Agents That QA Your Web App Autonomously

Propolis runs swarms of AI browser agents that explore your web app, find bugs, and generate Playwright e2e tests for your CI pipeline. Here's how it works.

TL;DR

Propolis launches dozens of AI-powered browser agents that collaboratively explore your web app, surface bugs and friction points, and automatically generate Playwright e2e tests you can drop straight into CI.

What Is Propolis?

Propolis is a browser-agent-based QA platform built by a two-founder team (YC X25) with a decade of combined experience in software quality. The core idea: instead of writing and maintaining brittle Playwright or Selenium selectors by hand, you point Propolis at your web app and it deploys “swarms” of browser agents that simulate real user behavior.

These agents don’t just click through predefined paths — they collaborate, exploring user journeys you didn’t think to test, flagging friction points, and proposing e2e tests that capture what they found. The tests are output as Playwright-compatible scripts you can add directly to your CI pipeline.

The pitch is straightforward: deterministic tests are great for known behavior, but it’s impossible to manually script coverage for every user flow. Propolis gives you a “canary group” of AI users you can run before every deploy without risking impact on real users.

The Core Problem Propolis Solves

Traditional e2e testing has a fundamental tension: you either write narrow, brittle tests that break on every UI change, or you write too few tests and bugs slip through in untested flows. Playwright and Selenium are powerful, but maintaining selectors across a rapidly changing UI is a full-time job no one wants to do.

Propolis frames it differently: treat AI agents like a QA team you can scale up or down on demand. Run 10 agents for a quick sanity check, or 100 for a deep exploratory session before a major release.

Setup Workflow

Propolis is a hosted product — no self-hosting required. Getting started takes about two minutes.

Step 1: Create an Account and Add Your First App

Sign up at app.propolis.tech and connect your web application via URL. Propolis supports apps behind authentication too — you can provide test credentials or use session cookies.

Step 2: Configure Your First “Swarm”

A swarm is a coordinated group of browser agents that explore your app together. You define:

  • Target URL(s): which pages to start from
  • Agent count: 10 for quick checks, 100+ for thorough exploration
  • Exploration depth: how many steps/iterations each agent runs
  • Check types: visual regressions, console errors, broken links, checkout flow completion, etc.
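
To make those four settings concrete, here is a minimal sketch of how a swarm configuration could be modeled and sanity-checked. Propolis configures swarms in its hosted UI, so every name below (`SwarmConfig`, `CheckType`, `validateSwarmConfig`, the field names, and the staging URL) is a hypothetical illustration, not Propolis's actual schema.

```typescript
// Hypothetical model of a swarm configuration. All names are illustrative
// assumptions; they do not come from the Propolis product.
type CheckType =
  | "visual-regression"
  | "console-errors"
  | "broken-links"
  | "flow-completion";

interface SwarmConfig {
  targetUrls: string[];      // pages the agents start from
  agentCount: number;        // e.g. 10 for quick checks, 100+ for deep runs
  maxStepsPerAgent: number;  // exploration depth
  checks: CheckType[];       // what the agents should look for
}

// Returns a list of human-readable problems; an empty list means the
// configuration passed these basic sanity checks.
function validateSwarmConfig(config: SwarmConfig): string[] {
  const problems: string[] = [];
  if (config.targetUrls.length === 0) {
    problems.push("at least one target URL is required");
  }
  for (const url of config.targetUrls) {
    if (!/^https?:\/\//.test(url)) {
      problems.push(`not an http(s) URL: ${url}`);
    }
  }
  if (!Number.isInteger(config.agentCount) || config.agentCount < 1) {
    problems.push("agentCount must be a positive integer");
  }
  if (!Number.isInteger(config.maxStepsPerAgent) || config.maxStepsPerAgent < 1) {
    problems.push("maxStepsPerAgent must be a positive integer");
  }
  if (config.checks.length === 0) {
    problems.push("select at least one check type");
  }
  return problems;
}

// A quick pre-deploy sanity run against a placeholder staging URL.
const quickCheck: SwarmConfig = {
  targetUrls: ["https://staging.example.com/login"],
  agentCount: 10,
  maxStepsPerAgent: 25,
  checks: ["console-errors", "broken-links"],
};
```

Keeping the configuration as plain data like this would make it easy to version alongside your app, if you track swarm settings outside the dashboard.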

Step 3: Review Results and Accept Suggested Tests

After the swarm completes, Propolis presents:

  • Bug reports with screenshots and console logs
  • Friction points (confusing flows, missing error states)
  • Proposed Playwright test snippets for each discovered flow

You review and accept the tests you want to keep. Accepted tests are exported as .spec.ts files.

Step 4: Integrate with CI

Drop the exported Playwright tests into your existing Playwright setup:

# Install Propolis test runner
npm install @propolis/test-runner

# Run Propolis-generated tests alongside your existing suite
npx playwright test --project=propolis-generated

# Or run in CI with the Propolis CI helper (quote the URL so special characters survive the shell)
npx propolis-ci --deploy-preview-url "$DEPLOY_URL"

Propolis provides a GitHub Action and a generic CI script that runs after your preview deployment is live.
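
Note that `--project=propolis-generated` only works if your Playwright config defines a project with that name. A minimal sketch of such a config follows. The directory names (`./tests/e2e`, `./tests/propolis`), the `hand-written` project name, and the use of the `DEPLOY_URL` environment variable as the base URL are assumptions for illustration; adjust them to wherever you keep the exported specs.

```typescript
// playwright.config.ts
// Sketch only: directory layout and environment variable are assumptions.
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  projects: [
    {
      // Your existing, manually maintained suite.
      name: "hand-written",
      testDir: "./tests/e2e",
      use: { ...devices["Desktop Chrome"] },
    },
    {
      // Specs exported from Propolis, kept in their own directory so they
      // can be run (or skipped) independently via --project.
      name: "propolis-generated",
      testDir: "./tests/propolis",
      use: {
        ...devices["Desktop Chrome"],
        baseURL: process.env.DEPLOY_URL,
      },
    },
  ],
});
```

Separating the two projects also makes it easier to compare flakiness between generated and hand-written tests later.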

Deeper Analysis

How the Agent Swarms Work

Each swarm run deploys multiple browser agents that communicate with each other. Unlike a single Playwright script that follows a fixed path, agents in a Propolis swarm:

  1. Start from different entry points (homepage, marketing pages, deep links)
  2. Make probabilistic decisions about what to click next, weighted by page structure
  3. Share findings — if one agent finds a broken checkout flow, others avoid wasting time on it and explore elsewhere
  4. Propose tests for each distinct user journey they successfully complete
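
Steps 2 and 3 can be sketched as a weighted random choice over candidate actions, filtered by a shared record of flows already known to be broken. This is an illustrative model of the idea only: the article does not describe how Propolis implements agent coordination, and `Candidate`, `SharedFindings`, and `pickNextAction` are hypothetical names.

```typescript
// Illustrative model of probabilistic exploration with shared findings.
// Not Propolis's implementation; all names are hypothetical.
interface Candidate {
  id: string;      // e.g. a link or button the agent could click next
  weight: number;  // higher weight means more likely to be chosen
}

// Findings shared by every agent in the swarm.
class SharedFindings {
  private broken = new Set<string>();

  reportBroken(id: string): void {
    this.broken.add(id);
  }

  isBroken(id: string): boolean {
    return this.broken.has(id);
  }
}

// Picks the next action by weighted random choice, skipping anything another
// agent has already reported as broken. The random source is injectable so
// the behavior can be made deterministic in tests.
function pickNextAction(
  candidates: Candidate[],
  findings: SharedFindings,
  random: () => number = Math.random,
): Candidate | undefined {
  const viable = candidates.filter(
    (c) => c.weight > 0 && !findings.isBroken(c.id),
  );
  const total = viable.reduce((sum, c) => sum + c.weight, 0);
  if (total === 0) return undefined; // nothing left worth exploring

  let threshold = random() * total;
  for (const c of viable) {
    threshold -= c.weight;
    if (threshold < 0) return c;
  }
  return viable[viable.length - 1]; // guard against floating-point drift
}
```

Once one agent calls `reportBroken("checkout")`, every other agent sharing the same `SharedFindings` instance stops selecting that action and spends its steps elsewhere.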

The LLM-evaluation layer is what makes this powerful. Agents can catch bugs that aren’t strictly “broken” — a shopping assistant that recommends an out-of-stock product, a search that returns zero results without a clear message, a form that silently fails on submit. These are the UX bugs that deterministic tests almost never catch.

Pricing

Propolis is production-ready at $1,000/month for unlimited use with active support. The team offers lower prices for smaller projects and capped-use plans for hobby projects. There’s no free tier yet, but you can request early-access pricing if you’re willing to give feedback.

What Sets It Apart From Playwright/Selenium?

| | Traditional Playwright | Propolis |
|---|---|---|
| Test maintenance | High: selectors break on UI changes | Low: agents re-explore and regenerate selectors |
| Coverage | Limited to what you script | Explores paths you didn’t think to test |
| Bug types caught | Known failures only | Known failures plus UX bugs, silent failures, edge cases |
| CI integration | Native | Native, tests export as Playwright specs |
| Setup time | Hours to days | Minutes |

LLM-Evaluated “Checks”

One standout feature is the flexible check system. Because agents are powered by LLMs, you can define checks in plain language that would be tedious or impractical to script for every page of an app:

  • “Flag any product page where the ‘Add to Cart’ button doesn’t respond within 2 seconds”
  • “Report if a search returns zero results with no helpful message”
  • “Catch if the checkout flow accepts an invalid promo code without an error”

This is a meaningfully different testing paradigm — not just automating existing manual QA, but using AI to evaluate quality in ways humans can’t scale.
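
One way to picture this: each check is just a natural-language instruction, and a judge function decides whether a page observation violates it. The sketch below uses a keyword-based stub in place of a real LLM call so it runs on its own. `PageObservation`, `Verdict`, `Judge`, `runChecks`, and `stubJudge` are hypothetical names, and nothing here reflects Propolis's actual API.

```typescript
// Sketch of natural-language checks evaluated by a pluggable judge.
// All names are hypothetical; a real judge would call an LLM.
interface PageObservation {
  url: string;
  visibleText: string;
}

interface Verdict {
  violated: boolean;
  reason: string;
}

type Judge = (
  instruction: string,
  observation: PageObservation,
) => Promise<Verdict>;

// Runs every check against one observation and collects the verdicts.
async function runChecks(
  checks: string[],
  observation: PageObservation,
  judge: Judge,
): Promise<{ check: string; verdict: Verdict }[]> {
  const results: { check: string; verdict: Verdict }[] = [];
  for (const check of checks) {
    results.push({ check, verdict: await judge(check, observation) });
  }
  return results;
}

// Stand-in for an LLM: only understands the "zero results" check, using
// simple pattern matching on the visible page text.
const stubJudge: Judge = async (instruction, observation) => {
  const emptySearch = /\b0 results\b/i.test(observation.visibleText);
  const hasHint = /try|suggest/i.test(observation.visibleText);
  const violated =
    instruction.includes("zero results") && emptySearch && !hasHint;
  return {
    violated,
    reason: violated
      ? "empty result page with no guidance"
      : "no issue detected by stub",
  };
};

const exampleChecks = [
  "Report if a search returns zero results with no helpful message",
  "Catch if the checkout flow accepts an invalid promo code without an error",
];
```

The design point is that checks stay as data: adding a new one means adding a sentence, not writing new selectors and assertions.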

Practical Evaluation Checklist

  • [ ] Sign up and connect a staging URL (2 minutes)
  • [ ] Run a 10-agent swarm on a login flow
  • [ ] Review bug reports and test proposals
  • [ ] Accept one test and export to Playwright
  • [ ] Run exported test in local Playwright setup
  • [ ] Integrate with GitHub Actions using the Propolis CI helper
  • [ ] Compare test flakiness vs. manually written Playwright tests over 10 runs
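
For the last item, it helps to fix a definition of "flaky" up front. The sketch below assumes a test is flaky if it both passed and failed across repeated runs of the same build; a test that fails every time is broken, not flaky. `RunResult` and `flakeRate` are hypothetical helper names, and you would fill the history from your own CI results.

```typescript
// Minimal flakiness comparison, assuming you have recorded pass/fail
// outcomes per test across repeated runs of the same build.
type RunResult = "pass" | "fail";

// Fraction of tests that both passed and failed at least once.
function flakeRate(history: Record<string, RunResult[]>): number {
  const allRuns = Object.values(history);
  if (allRuns.length === 0) return 0;
  const flakyCount = allRuns.filter(
    (runs) =>
      runs.some((r) => r === "pass") && runs.some((r) => r === "fail"),
  ).length;
  return flakyCount / allRuns.length;
}
```

Compute `flakeRate` once for the exported tests and once for your hand-written suite over the same 10 runs, then compare the two numbers.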

Security Notes

  • Authentication credentials are stored encrypted and scoped to your project
  • Test runs execute in isolated browser contexts — no access to other customer data
  • All network requests made by agents are logged for your review
  • Propolis does not modify your application — agents are read-only observers by default

FAQ

Q: Does Propolis work with single-page applications (SPAs)?

A: Yes. Propolis runs actual browser instances (Chromium, headless) and handles JavaScript-heavy SPAs natively. It waits for network idle before evaluating page state, so client-rendered content is fully explored.

Q: How are the generated Playwright tests maintained?

A: When a test fails in CI, Propolis logs which agent discovered that flow. You can re-run that agent to re-explore the path and regenerate the test. For minor UI changes (button text, class names), the regeneration often produces working selectors without manual intervention.

Q: Does this replace my existing Playwright test suite?

A: No — it’s additive. Use Propolis for exploratory coverage and to discover bugs you didn’t know existed. Keep your deterministic, business-critical tests in Playwright. Propolis-generated tests are great for regression coverage of features you haven’t manually scripted.

Q: What’s the difference between a “swarm” and just running many Playwright tests in parallel?

A: Playwright tests follow fixed paths you define. A swarm explores probabilistically and shares findings across agents, so the agents coordinate to maximize coverage rather than repeating the same paths.

Q: Is there a self-hosted option?

A: Not at the moment. Propolis is a hosted service. The team is focused on making the hosted product reliable before investing in self-hosted infrastructure.

Conclusion

Propolis represents a meaningful shift in how teams can approach e2e testing. Rather than spending engineering hours maintaining selector-heavy Playwright suites, you get AI agents that explore your app like a curious user and hand you working tests. The swarm-based approach means you’re not just automating manual QA — you’re discovering bugs in flows you’d never think to test.

It’s priced for teams serious about QA ($1,000/month), and the Playwright export means it slots into existing pipelines without forcing a platform switch. If you’ve been burned by flaky e2e tests that break on every UI change, Propolis is worth a two-minute evaluation run.

Try it: app.propolis.tech — first swarm is free.