Mosaic – Agentic Video Editing With a Node-Based Canvas

TL;DR

Mosaic is a node-based AI video editing platform where you compose multimodal editing agents from visual building blocks, then let them autonomously cut, enhance, and reframe raw footage, all within a canvas you control.

Source and Accuracy Notes

Product: https://edit.mosaic.so (free tier available)
Docs: https://docs.mosaic.so
HN Discussion: Launch HN: Mosaic (YC W25) – Agentic Video Editing

What Is Mosaic?

Mosaic is a multimodal AI video editing platform built by two former Tesla engineers. The core idea is simple: instead of wrestling with a timeline and manually applying edits, you drop nodes onto a canvas — each node is a configurable video operation — wire them together, and let a multimodal AI agent execute the entire workflow end-to-end.

The node-based approach comes from visual programming tools like Blender’s compositor or Unreal’s Blueprints. Each node represents a discrete step: trim clips, add text overlays, detect shot types, apply reframes, generate captions. You can branch and parallelize — run the same raw footage through two different prompt variants simultaneously to A/B test the output.

Under the hood, the platform uses visual intelligence to understand what’s actually in your video: saliency maps, audio transcriptions, emotion detection, spoken-word timestamps, and shot-type classification all feed into the editing agent’s decisions.

Setting Up Your First Editing Agent

Step 1: Create a Free Account

Sign up at edit.mosaic.so. The free tier lets you upload videos, build workflows on the canvas, and use the inline timeline editor. Executing nodes requires a paid plan, which covers model inference costs.

Step 2: Upload Raw Footage

Upload one or more video files. Mosaic processes them through its analysis pipeline, extracting:

Saliency maps (where the viewer’s eye naturally goes)
Audio transcripts and word-level timestamps
Shot type classification (wide, medium, close-up)
Object and action detection
Light level analysis

This analysis is what makes the agent “smart” — it knows not just what frames contain, but which moments are worth keeping.
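As a rough illustration of how word-level timestamps can drive cut decisions, the sketch below groups words into candidate segments by splitting wherever the pause between two words exceeds a threshold. The `Word` type, the `split_on_pauses` function, and the 0.7-second threshold are assumptions made for this example; the source does not describe how Mosaic's agent actually chooses cuts.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the beginning of the clip
    end: float    # seconds from the beginning of the clip


def _close(words):
    """Collapse a run of words into a (start, end, text) segment."""
    return (words[0].start, words[-1].end, " ".join(w.text for w in words))


def split_on_pauses(words, max_gap=0.7):
    """Group words into segments, starting a new one after each long pause."""
    segments = []
    current = []
    for word in words:
        # A gap longer than max_gap since the previous word marks a cut point.
        if current and word.start - current[-1].end > max_gap:
            segments.append(_close(current))
            current = []
        current.append(word)
    if current:
        segments.append(_close(current))
    return segments


# Hypothetical transcript with a 1.3-second pause after "back".
transcript = [
    Word("Welcome", 0.0, 0.4),
    Word("back", 0.45, 0.8),
    Word("So", 2.1, 2.3),
    Word("today", 2.35, 2.8),
]
segments = split_on_pauses(transcript)
```

Here `segments` holds two entries, one per spoken phrase, and the silent stretch between them is what a trim step could drop.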

Step 3: Design Your Workflow on the Canvas

Drag nodes from the palette onto the canvas and wire them together. Key node types:

Clip Trimmer — define in/out points on a source clip
Text Overlay — add dynamic captions, titles, or subtitles
Reframe — auto-crop to different aspect ratios (16:9 → 9:16 for Reels)
AI Enhance — apply upscaling, color grading, or stabilization
Audio Sync — align voiceover or music tracks
Clip Generator — create B-roll from text prompts
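To make the Reframe step concrete, here is a minimal sketch of the arithmetic behind a 16:9 to 9:16 crop: take a full-height window, center it on a horizontal focus point (for example, a saliency peak), and clamp it so it stays inside the frame. The function name `vertical_crop` and its parameters are hypothetical; this is not Mosaic's implementation.

```python
def vertical_crop(src_w, src_h, focus_x, target_ratio=9 / 16):
    """Return (x, y, w, h) of a full-height crop centered on focus_x."""
    crop_w = int(src_h * target_ratio)
    crop_w -= crop_w % 2  # keep the width even, as 4:2:0 video encoding expects
    if crop_w > src_w:
        raise ValueError("source is narrower than the target aspect ratio")
    # Center the window on the focus point, then clamp it inside the frame.
    x = int(focus_x - crop_w / 2)
    x = max(0, min(x, src_w - crop_w))
    return (x, 0, crop_w, src_h)


centered = vertical_crop(1920, 1080, focus_x=960)   # subject in the middle
left_edge = vertical_crop(1920, 1080, focus_x=100)  # subject near the left edge
```

For a 1920x1080 source the window is 606 pixels wide. When the focus point sits near an edge, the clamp keeps the crop inside the frame rather than centering exactly on the subject.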

You can branch a single workflow to produce multiple outputs in parallel.
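A branching workflow like this is a directed acyclic graph. The sketch below models one with Python's standard-library `graphlib` (Python 3.9+) and groups nodes into batches that could run in parallel. The node names and the `execution_levels` helper are hypothetical labels for illustration, not Mosaic identifiers or its scheduler.

```python
from graphlib import TopologicalSorter

# Each key is a node; its value is the set of nodes it depends on.
# Node names are made up for this example.
workflow = {
    "upload": set(),
    "trim": {"upload"},
    "captions": {"trim"},
    "reframe_16x9": {"captions"},
    "reframe_9x16": {"captions"},
}


def execution_levels(graph):
    """Return batches of nodes whose dependencies are all satisfied.

    Nodes in the same batch do not depend on each other,
    so a runner could execute them in parallel.
    """
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    levels = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        levels.append(ready)
        sorter.done(*ready)
    return levels


levels = execution_levels(workflow)
```

The last batch contains both reframe nodes, which is the branch point: the two outputs share every upstream step and only diverge at the end.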

Step 4: Run and Review

Hit “Run” and watch the canvas animate through each node step. The agent processes the video according to your node configuration. Once complete, open the timeline editor to review and make fine adjustments before exporting.

Step 5: Export

Mosaic can export timeline state to DaVinci Resolve, Adobe Premiere Pro, and Final Cut Pro — so you’re not locked in if you want to do manual finishing work in a traditional NLE.

Deeper Analysis

Why Node-Based Rather Than Chat?

The founders tried a pure chat interface first. For video, it fell apart in two ways: long videos produce very large model contexts, which slow down responses, and professional editors have repeatable workflows they want to save and reuse. A node graph addresses both: you can save a “podcast-to-Reels” workflow as a template, branch it for variants, and re-run it on new footage without conversation overhead.
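The reuse argument is easier to see with a small example. The sketch below treats a workflow as plain data that can be copied, pointed at new footage, and tweaked per variant. The template schema, node type names, and the `instantiate` helper are all invented for illustration; Mosaic's real template format is not described in the source.

```python
import copy
import json

# Hypothetical template format, not Mosaic's actual schema.
TEMPLATE = {
    "name": "podcast-to-reels",
    "nodes": [
        {"id": "trim", "type": "clip_trimmer", "params": {"max_seconds": 60}},
        {"id": "captions", "type": "text_overlay", "params": {"style": "bold"}},
        {"id": "reframe", "type": "reframe", "params": {"aspect": "9:16"}},
    ],
    "edges": [["trim", "captions"], ["captions", "reframe"]],
}


def instantiate(template, source_path, overrides=None):
    """Return a new job built from a template without mutating the template."""
    job = copy.deepcopy(template)
    job["source"] = source_path
    for node in job["nodes"]:
        # Apply per-node parameter overrides, if any were given for this node.
        node["params"].update((overrides or {}).get(node["id"], {}))
    return job


# Same footage, two caption styles: a simple A/B variant.
job_a = instantiate(TEMPLATE, "episode_12.mp4")
job_b = instantiate(TEMPLATE, "episode_12.mp4", {"captions": {"style": "minimal"}})
serialized = json.dumps(job_b, sort_keys=True)  # plain JSON, easy to store or diff
```

Because the template is data rather than a conversation, a variant is a small override instead of a new prompt thread.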

The Multimodal Intelligence Layer

What sets Mosaic apart from traditional automation is the depth of visual understanding built into the agent. It’s not just running ASR on the audio track — it’s doing saliency analysis to know what draws attention, mean movement analysis to identify camera motion, and shot-type classification to maintain visual consistency. The agent makes editing decisions based on what a human editor would internalize unconsciously.
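The source does not say how Mosaic computes its movement signal. One simple proxy, shown here purely as an assumption, is the mean absolute brightness difference between consecutive frames: near zero for a static shot, higher when the camera or subject moves. The function names are hypothetical, and real footage would come from a decoder rather than hand-written lists.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute per-pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def motion_scores(frames):
    """One score per consecutive frame pair; higher means more change."""
    return [mean_abs_diff(a, b) for a, b in zip(frames, frames[1:])]


# Three tiny 2x2 "frames" with 0-255 brightness values.
frames = [
    [[10, 10], [10, 10]],
    [[10, 10], [10, 10]],  # identical to the first frame: no motion
    [[50, 10], [10, 90]],  # two pixels changed
]
scores = motion_scores(frames)
```

A score of zero between the first two frames and a clearly higher score for the last pair is the kind of signal that could flag a pan, a cut, or a gesture worth keeping.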

Use Cases

Script-based cuts — feed the agent a transcript and have it create a talking-heads cut automatically
Clip repurposing — break long-form podcasts or webinars into short social clips
Dynamic captions — auto-generate styled captions synchronized to speech
A/B content testing — branch a workflow to compare two different editing styles from the same source
Content localization — dub with voice cloning and lip sync (on roadmap)
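For the dynamic captions case, the core transformation is turning word-level timestamps into timed cues. The sketch below emits SubRip (SRT) text, a widely supported subtitle format. The `words_to_srt` helper and its `max_words` grouping rule are assumptions for illustration; the source does not say which caption formats Mosaic produces or how it groups words.

```python
def srt_timestamp(seconds):
    """Format seconds as HH:MM:SS,mmm, the SubRip timestamp layout."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words, max_words=4):
    """Build SRT text from (word, start, end) tuples, max_words per cue."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start, end = chunk[0][1], chunk[-1][2]
        text = " ".join(w[0] for w in chunk)
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        cues.append(f"{len(cues) + 1}\n{timing}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(cues)


# Hypothetical word timings in seconds.
words = [("Hello", 0.0, 0.4), ("world", 0.5, 0.9), ("again", 1.0, 1.5)]
srt_text = words_to_srt(words, max_words=2)
```

Smaller `max_words` values give the fast, few-words-at-a-time look common in short social clips; larger values read more like conventional subtitles.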

Practical Evaluation Checklist

[ ] Sign up and upload a test video (2–5 minutes works well)
[ ] Run the default “Auto-Edit” agent to see the platform’s baseline output
[ ] Build a custom 3-node workflow (trim + caption + reframe) and run it
[ ] Try branching the canvas to produce both 16:9 and 9:16 outputs in parallel
[ ] Export to DaVinci Resolve and inspect the timeline
[ ] Check the analysis panel to see what the model detected (saliency, shot types, transcript)

Security Notes

Mosaic processes user-uploaded video through cloud infrastructure. Video content is stored and analyzed on their servers. Review their privacy policy before uploading proprietary or sensitive footage. The platform does not claim end-to-end encryption for stored media.

API access via docs.mosaic.so allows programmatic workflow triggering — ensure any API keys are stored securely and scoped to minimal permissions.
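A minimal sketch of the key-handling advice, assuming the key is supplied through an environment variable. The variable name `MOSAIC_API_KEY` and the Bearer header scheme are assumptions, not documented Mosaic behavior, so check the API docs for the actual authentication method. No request is made here.

```python
import os


def load_api_key(env_var="MOSAIC_API_KEY"):
    """Read the key from the environment so it never lives in source control."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key


def auth_headers(key):
    """Build request headers. Bearer auth is an assumption; confirm it in the docs."""
    return {"Authorization": f"Bearer {key}"}


def redact(key):
    """Safe form for logs: mask everything except the last four characters."""
    if len(key) <= 4:
        return "*" * len(key)
    return "*" * (len(key) - 4) + key[-4:]
```

Failing fast when the variable is missing avoids sending unauthenticated requests by accident, and logging only `redact(key)` keeps the full secret out of log files.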

FAQ

Q: Does Mosaic replace a traditional NLE like Premiere Pro or DaVinci Resolve?

A: Not entirely. Mosaic is best thought of as an autopilot that gets you 80–90% of the way to a finished edit. The canvas handles the creative decision-making, but you still export to a traditional NLE for manual fine-tuning, color grading, and final delivery.

Q: What video formats does Mosaic support?

A: The platform accepts common formats including MP4, MOV, and WebM. Output is typically H.264 MP4. Specific codec support beyond this should be confirmed via their docs as the platform evolves.

Q: How does pricing work?

A: The free tier covers account creation, video upload, canvas access, and the timeline editor. Node execution runs (when the AI actually processes your video) consume credits on paid plans. The founders mention this is to cover multimodal model inference costs.

Q: Can I use my own AI models with Mosaic?

A: Currently the platform uses Mosaic’s own multimodal models for analysis and editing decisions. Custom model integration is not publicly documented — check their API docs for any upcoming capability here.

Q: Does the platform work for screen recordings and tutorials?

A: Yes. The multimodal analysis handles screen-recorded content well, particularly for detecting text overlays, speaker changes, and action sequences. It’s a strong fit for teams that produce recurring video content like walkthroughs or demo libraries.

Conclusion

Mosaic is a genuinely new take on video editing: not an AI chatbot layered on top of a traditional timeline, but a canvas where the editing agent is built from composable, visual building blocks. The node-based approach makes workflows reusable, inspectable, and parallelizable in ways that chat-based editing struggles to match.

The free tier is enough to get a real feel for the platform. If you produce video content regularly — podcasts, tutorials, social clips — it’s worth 30 minutes of experimentation. The gap between “raw footage” and “publishable clip” shrinks considerably when the editing agent understands what’s actually in your video.
