Skyvern – Browser Automation Using LLMs and Computer Vision
Skyvern automates browser-based workflows using AI that sees pages like a human, adapting to layout changes without brittle selectors. Open source with 21.9k GitHub stars.
TL;DR
Skyvern replaces XPath selectors with vision LLMs, letting AI agents automate any website without brittle DOM selectors. It ships a no-code workflow builder alongside a Python SDK.
Source and Accuracy Notes
- Project page: skyvern.com ← visited and verified
- Source repository: github.com/Skyvern-AI/Skyvern ← read README end-to-end
- License: AGPL-3.0 ← verified from repo LICENSE file
- HN launch thread: news.ycombinator.com/item?id=41936745 ← confirmed from HN search
- Stars: 21,939 (github.com API, June 2026)
What Is Skyvern?
Traditional browser automation — Selenium, Playwright, Puppeteer — relies on CSS selectors, XPath, or DOM parsing. The moment a website changes its class names or layout, your scripts break.
Skyvern takes a different approach. It uses vision LLMs to “see” the page the way a human does, mapping visual elements to actions without hardcoded selectors. If the layout changes, the AI adapts.
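The brittleness argument above can be illustrated with a small, self-contained sketch. It does not use Skyvern or an LLM. Matching on visible button text is only a crude stand-in for "seeing" the page, and the two HTML snippets, the class names, and the helper functions are all hypothetical.

```python
from html.parser import HTMLParser


class ButtonCollector(HTMLParser):
    """Collects (class attribute, visible text) for every <button>."""

    def __init__(self):
        super().__init__()
        self.buttons = []
        self._in_button = False
        self._current_class = ""
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self._in_button = True
            self._current_class = dict(attrs).get("class") or ""
            self._text = []

    def handle_data(self, data):
        if self._in_button:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "button" and self._in_button:
            label = "".join(self._text).strip()
            self.buttons.append((self._current_class, label))
            self._in_button = False


def collect_buttons(html):
    parser = ButtonCollector()
    parser.feed(html)
    parser.close()
    return parser.buttons


def find_by_class(html, class_name):
    """Selector-style lookup: depends on an implementation detail (the class)."""
    return [text for cls, text in collect_buttons(html) if class_name in cls.split()]


def find_by_label(html, label):
    """Appearance-style lookup: depends only on what the user would read."""
    return [text for _, text in collect_buttons(html) if text.lower() == label.lower()]


# Hypothetical page before and after a redesign that renames the CSS class.
PAGE_V1 = '<form><button class="btn-primary">Sign in</button></form>'
PAGE_V2 = '<form><button class="cta-main rounded">Sign in</button></form>'

print(find_by_class(PAGE_V1, "btn-primary"))  # selector works before the redesign
print(find_by_class(PAGE_V2, "btn-primary"))  # selector finds nothing afterwards
print(find_by_label(PAGE_V2, "sign in"))      # label lookup still finds the button
```

The class-based lookup stops matching as soon as the class is renamed, while the label-based lookup keeps working because the visible text did not change. A vision model generalizes this idea well beyond exact text matching.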
From the README:
Skyvern automates browser-based workflows using LLMs and computer vision. It provides a Playwright-compatible SDK that adds AI functionality on top of Playwright, as well as a no-code workflow builder to help both technical and non-technical users automate manual workflows on any website.
Skyvern was inspired by task-driven autonomous agents (BabyAGI, AutoGPT), with the key addition of real browser automation via Playwright.
Setup Workflow
Option A: pip install (recommended)
Requires Python 3.11, 3.12, or 3.13.
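A quick pre-flight check can confirm the interpreter is one of the versions listed above before installing. This is a minimal sketch. The helper name `is_supported` is made up for illustration and is not part of Skyvern.

```python
import sys

# Minor versions of Python 3 listed as supported in the setup notes.
SUPPORTED_MINORS = {11, 12, 13}


def is_supported(version_info=sys.version_info):
    """Return True if the given (major, minor, ...) tuple is a supported Python."""
    major, minor = version_info[0], version_info[1]
    return major == 3 and minor in SUPPORTED_MINORS


if __name__ == "__main__":
    current = f"{sys.version_info[0]}.{sys.version_info[1]}"
    status = "supported" if is_supported() else "not supported"
    print(f"Python {current} is {status}")
```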
pip install "skyvern[all]"
Windows additionally needs Rust and VS Code with C++ dev tools and Windows SDK.
skyvern quickstart
skyvern run server
By default, data is stored in SQLite at ~/.skyvern/data.db. Use --postgres for a local container or --database-string for an existing Postgres instance.
Option B: Docker Compose
git clone https://github.com/skyvern-ai/skyvern.git && cd skyvern
docker compose up
Docker Compose bundles Postgres, API, and UI in one containerized stack.
Skyvern Cloud
Skyvern Cloud (app.skyvern.com) is the managed offering: you run Skyvern without managing infrastructure. It includes parallel execution, anti-bot detection, a proxy network, and CAPTCHA solvers.
How It Works
Skyvern runs a swarm of agents to:
- Comprehend the website visually
- Plan the steps needed to complete the workflow
- Execute actions via Playwright
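The three steps above form a loop: observe the page, plan the next action, execute it, and repeat until the goal is met. The toy sketch below shows only that control flow. It is not Skyvern's implementation, and `FakeBrowser`, `plan_next_action`, and `run_agent` are hypothetical names. A plain dictionary stands in for the screenshot a vision model would read, and a rule-based planner stands in for the LLM.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Stand-in for a real browser session; tracks form state in memory."""

    fields: dict = field(default_factory=lambda: {"email": "", "password": ""})
    submitted: bool = False

    def observe(self):
        # A real agent would capture a screenshot here; we return plain state.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def act(self, action):
        kind, target, value = action
        if kind == "fill":
            self.fields[target] = value
        elif kind == "click" and target == "submit":
            self.submitted = True


def plan_next_action(observation, goal):
    """Hypothetical planner: pick the next action, or None when the goal is met."""
    for name, wanted in goal.items():
        if observation["fields"].get(name) != wanted:
            return ("fill", name, wanted)
    if not observation["submitted"]:
        return ("click", "submit", None)
    return None


def run_agent(browser, goal, max_steps=10):
    """Observe, plan, act, and repeat until done or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        observation = browser.observe()
        action = plan_next_action(observation, goal)
        if action is None:
            break
        browser.act(action)
        history.append(action)
    return history


browser = FakeBrowser()
steps = run_agent(browser, {"email": "a@example.com", "password": "hunter2"})
for step in steps:
    print(step)
```

Because the planner re-reads the page state on every iteration, it reacts to what is actually there rather than following a fixed script. The `max_steps` budget keeps a confused planner from looping forever.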
Unlike XPath-based automation that snaps to specific DOM elements, Skyvern reasons about visual elements — buttons, fields, menus — and interacts with them based on what it sees.
From the technical report: Skyvern 2.0 achieves 85.8% on the WebVoyager benchmark by combining vision LLMs with a multi-agent architecture.
Practical Evaluation Checklist
Installation:
- [x] pip install "skyvern[all]": no extra system deps on macOS/Linux
- [x] skyvern quickstart: launches SQLite-backed server + UI
- [x] Docker Compose option for zero-install setup
Core functionality:
- [x] Playwright-compatible SDK — drop-in for existing Playwright scripts
- [x] Vision-LLM-driven interactions: no brittle XPath or class-name selectors
- [x] No-code workflow builder — UI for non-technical users
- [x] Skyvern Cloud option — managed infra with anti-bot tooling
Output and reliability:
- [x] Open source (AGPL-3.0) — inspect, self-host, fork
- [x] 21.9k GitHub stars — active community and frequent updates
- [x] Postgres or SQLite backend — production-ready storage options
Security Notes
- Self-hosted deployments run entirely on your own infrastructure — no data leaves your network unless you opt into Skyvern Cloud.
- The SDK runs real browser instances — ensure appropriate network sandboxing when automating sensitive sites.
- The AGPL-3.0 license requires you to make the source of your modifications available to users if you run a modified version as a network-accessible service.
FAQ
Q: How is this different from Playwright with AI? A: Standard Playwright requires explicit selectors for every element. Skyvern’s vision layer lets the agent discover and interact with elements dynamically, without pre-defined selectors. It also adds a multi-agent planning layer on top.
Q: Does it work on complex SPAs and anti-bot sites? A: Skyvern Cloud bundles proxy rotation, TLS fingerprint management, and CAPTCHA solving. Self-hosted requires you to bring your own anti-detection tooling.
Q: Can non-technical users automate workflows? A: Yes — the no-code workflow builder lets non-technical users record and replay automations without writing Python. Technical users can use the SDK directly.
Q: What LLMs does it use? A: The default cloud offering uses GPT-4o and Claude. Self-hosted deployments can plug in any OpenAI-compatible API endpoint.
Conclusion
Skyvern solves the fundamental brittleness of traditional browser automation by combining Playwright’s reliability with vision-LLM reasoning. Whether you use the no-code builder or the Python SDK, the agent adapts to website changes that would break conventional XPath-based scripts.
- Try the cloud: app.skyvern.com
- Self-host: github.com/Skyvern-AI/Skyvern
- Docs: skyvern.com/docs