HeyGen HyperFrames: HTML to MP4 for AI Agents

On April 17, 2026, HeyGen open-sourced HyperFrames — a framework that lets AI coding agents write video the way they already write web pages. You describe the scene, Claude Code writes HTML with CSS and GSAP animations, and the CLI renders it to a pixel-perfect MP4. No timeline editor. No cloud API keys. No black-box models in the critical path.
If that sounds like a small tooling update, it is not. It is a bet that the future of programmatic video is not React components or After Effects plug-ins. It is the same HTML every agent has already been trained on.
Why HTML Is the Right Video Primitive for Agents
Most video frameworks assume a human is sitting at a timeline. Premiere, After Effects, Final Cut, even web-native tools like Descript — they are all built around a workspace a human clicks through. Agents do not click. They write text.
The two serious programmatic-video stacks before HyperFrames were Remotion and Motion Canvas. Both are excellent, both are React-first, and both require the agent to master a domain-specific composition model: useCurrentFrame, composition IDs, interpolation helpers. That is learnable, but it is not native to how large language models were trained.
HyperFrames flips the ergonomics. A scene is a plain HTML file with data- attributes describing timing and transitions. CSS handles layout and type. GSAP handles motion. Three.js handles 3D. The agent is writing the same web it has been writing for years — except the output is an MP4 instead of a browser tab.
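To make that model concrete, here is a sketch of what such a composition could look like. The specific data- attribute names and scene structure are illustrative assumptions, not HyperFrames' documented API; the GSAP timeline call is real GSAP:

```html
<!-- Hypothetical HyperFrames-style scene: timing in data- attributes,
     layout and type in CSS, motion in a GSAP timeline. Attribute names
     here are assumed for illustration. -->
<section class="scene" data-duration="4s" data-transition="fade">
  <style>
    .scene { display: grid; place-items: center; background: #0b0f1a; }
    .headline { color: #fff; font: 700 64px/1.1 system-ui; opacity: 0; }
  </style>
  <h1 class="headline">Ship video like code</h1>
  <script>
    // Standard GSAP: fade and lift the headline in over 1.2 seconds.
    gsap.timeline()
      .to(".headline", { opacity: 1, y: -12, duration: 1.2, ease: "power2.out" });
  </script>
</section>
```

Nothing in that file is video-specific. An agent that can write a landing page can write this.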
HeyGen put its money where its thesis is. Their own launch video for HyperFrames was generated entirely in Claude Code using HyperFrames. The authoring loop was: prompt the agent, preview in the browser-based studio, render.
How the Rendering Pipeline Works
Under the hood, HyperFrames is a browser frame capture pipeline plus FFmpeg. The renderer launches a headless Chromium instance, loads the composition HTML, advances the clock deterministically, captures each frame at the target resolution, and muxes the frames into an MP4 with FFmpeg. Animations are not sampled live — they are stepped frame-by-frame, which means the output is reproducible across runs and across machines.
That determinism is the second thing that matters. When an agent is iterating on a 45-second launch video, you want the same input to produce the same pixels every time. Otherwise debugging becomes guesswork. HyperFrames treats the MP4 the way developers treat compiled binaries: deterministic output from versioned source.
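The frame-stepping idea can be sketched in a few lines of JavaScript. This is a toy model of the approach described above, not HyperFrames' actual renderer: each frame's timestamp is derived from its index, and the animation is sampled as a pure function of time, so every run produces identical values.

```javascript
// Deterministic frame stepping: timestamps come from the frame index,
// never from a live clock, so the same source yields the same pixels.
function frameTimes(durationSec, fps) {
  const frames = Math.round(durationSec * fps);
  return Array.from({ length: frames }, (_, i) => i / fps);
}

// A cubic ease-out sampled at fixed timestamps, standing in for
// whatever GSAP would compute per frame.
function easeOutCubic(t) {
  return 1 - Math.pow(1 - t, 3);
}

const times = frameTimes(2, 30); // a 2-second clip at 30 fps
const opacities = times.map((t) => easeOutCubic(t / 2));
console.log(times.length, opacities[0]);
```

In the real pipeline, each sampled state would be screenshotted by headless Chromium and the image sequence handed to FFmpeg (conceptually, `ffmpeg -framerate 30 -i frame_%05d.png -c:v libx264 -pix_fmt yuv420p out.mp4`, a standard image-sequence mux).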
The whole thing ships under Apache 2.0. There are no API keys, no rate limits, no per-minute rendering fees, and no vendor cloud in the loop. You run it locally, in Docker, or in a CI job.
Getting Started in Three Commands
The install is dead simple if you are using any skill-aware AI coding tool — Claude Code, Cursor, Codex, or Gemini CLI all work out of the box.
npx skills add heygen-com/hyperframes

That pulls in the HyperFrames skill bundle, which teaches your agent the framework-specific patterns: how to structure a composition, how to time animations, how to call the renderer. Your agent now knows HyperFrames even if it has never seen it before.
To generate a video, describe it in plain text. For example:
Create a 20-second product launch video for a SaaS tool called
"Noqta Ops". Open with the logo fading in over a dark gradient,
transition to three feature cards (analytics, automation, alerts)
with staggered GSAP entrances, and end with a blue CTA button.

The agent writes the HTML composition, you review it in the preview studio, and the CLI renders the final MP4 locally. Three commands. Zero cloud calls. No after-the-fact editing.
HyperFrames vs. Remotion: Different Goals
This is not a Remotion killer, and it is worth being precise about where each tool wins. Remotion is the right choice if you are a React team building a video pipeline with heavy programmatic logic: data-driven videos, dynamic charts, per-user personalization at scale. The React component model gives you deep control and clean composition of complex state.
HyperFrames is the right choice if an agent is the primary author. The mental model is smaller. The rendering is deterministic without frame-counting gymnastics. The output works anywhere a browser can render. For launch videos, ads, tutorials, and social clips built by a solo operator plus an AI agent, HyperFrames removes friction that Remotion still has.
There is also a licensing dimension. Remotion's enterprise license kicks in at team size and revenue thresholds. HyperFrames is Apache 2.0 without tiers. For small teams and solo developers — the exact audience most likely to let an agent ship video for them — that matters.
What This Unlocks for Small Teams
The interesting thing is not that HyperFrames exists. It is what it implies about the shape of software going forward. Video has been the last holdout in agent-led content creation. Text, code, images, and design are all agent-native now. Video required either a human in a timeline or a black-box generative model that sometimes hallucinated a third hand.
HyperFrames makes video a build step. The same agent that writes your landing page copy, generates your Open Graph image, and ships your React components can now produce the launch video for that feature. For a two-person startup or a solo operator in the MENA region, that is a real shift — the video budget line item becomes "a few dollars of agent tokens and a CI render minute."
The open-source license matters here too. You can run this on your own infrastructure, fine-tune the skill bundle to your brand, pipe the output into your CMS, and never hand a customer's script to a third-party video API. That is the kind of control enterprises and regulated industries have been asking for since generative video became mainstream.
The Bigger Pattern
HyperFrames is part of a broader move in 2026: tools designed for agents instead of for humans assisted by agents. gh skill from GitHub ships agent skills as installable packages. Agent Skills with SKILL.md became the universal standard across 30+ tools. Now video joins the list. The interfaces are shifting from GUIs with AI bolted on to text-native surfaces that agents drive end-to-end.
If you ship marketing content, dev tutorials, or product launches — and especially if you are a small team competing with much bigger ones — HyperFrames is worth twenty minutes of your afternoon. Install it, let Claude Code write a launch video, and see whether the result changes your mental model of what an agent can ship.
The repo is public. The license is permissive. The install is one command. There has not been a better moment to let an agent ship your next video.