feat(cli) + fix(capture/lint/producer): on-demand video fetch, music + sfx commands, pipeline robustness#1438

Open
ukimsanov wants to merge 2 commits into main from pr/cli-capture-w2h-fixes

ukimsanov (Collaborator) commented Jun 14, 2026

Summary

  • hyperframes capture-video — on-demand video downloader for entries in a capture's video-manifest.json. The capture pipeline writes the manifest + preview PNGs but deliberately skips the mp4s; this command pulls one entry at a time. SSRF-safe via safeFetch, 250 MB cap, content-type whitelist.
  • hyperframes music + hyperframes sfx — semantic search + add against the HeyGen catalog. add downloads into assets/{music,sfx}/, runs ffmpeg loudness analysis, prints a ready-to-paste <audio> snippet.
  • Capture pipeline fixes from real-AI-test runs: broaden isLogo to structural signals (class-substring alone caught 0/32 SVGs on heygen.com), content-hash SVG slugs, SVG→PNG rasterization before Gemini Vision (was hallucinating wordmarks), double-escape \/ inside the page.evaluate template literal.
  • Lint fixes: findRootTag masks <!-- --> / <style> / <script> ranges so a <video> token inside a CSS comment no longer gets picked as the composition root. New lintMissingLocalAsset rule (the most common sub-agent mistake across multi-URL runs).
  • Producer fix: ESM banner shims __dirname / __filename from import.meta.url — the ffmpeg/Emscripten wasm glue does scriptDirectory = __dirname + "/" and was crashing renders.

Net: 30 files, +2409 / -31. Zero package.json or lockfile changes.

Test plan

  • bun run --filter @hyperframes/cli typecheck passes
  • bun run --filter @hyperframes/cli test — 73/73 (sfx + music + lintProject suites)
  • node packages/producer/build.mjs esbuild bundles complete cleanly
  • hyperframes capture <url>, then capture-video <dir> --list, then --index 0 downloads cleanly
  • hyperframes sfx search "<q>" + add <id> writes assets/sfx/<id>.flac with loudness analysis printed
  • hyperframes music search "<q>" + add <id> writes assets/music/
  • Snippet output from each add passes hyperframes lint
  • Producer render of an existing project completes without __dirname errors

console.log(` from: ${entry.url}`);
try {
const buf = await fetchToBuffer(entry.url);
writeFileSync(outPath, buf);
ukimsanov force-pushed the pr/cli-capture-w2h-fixes branch 2 times, most recently from 7f9cd02 to 9795334, June 14, 2026 13:50
…st runs

Surgical bugfixes accumulated across a series of real-AI-test runs
(heygen.com, huly.io, heygen-showcase). Each fix targets a specific
observed defect; happy paths are untouched.

packages/cli/src/capture/

  • assetCataloger.ts: surface three structural logo signals on every
    cataloged asset (inBanner / inHomeLink / matchesTitleBrand). The
    prior class-substring-only isLogo detector caught 0/32 SVGs on
    heygen.com and 0/19 on huly.io — modern React/Tailwind builds
    don't put "logo" or "brand" in any className. The new signals
    catch the universal "site header logo" pattern. Boolean merge
    semantics: any positive sample wins through context-merge +
    srcset dedup.

  • tokenExtractor.ts: broaden inline-SVG isLogo via the same three
    structural signals (header/nav/role=banner ancestor, root-href
    anchor parent, document.title brand-segment match in aria-label).
    No change to the existing class-substring detector — runs first,
    new heuristics only fire when it misses.

  • assetDownloader.ts: content-hash SVG slugs. SVG filenames are now
    `svg-<8char-sha1>.svg` (or `logo-<hash>.svg` when isLogo flags
    fire), replacing the previous label-derived slugging that
    mis-attributed brand carousels. Verified by rasterizing real
    captured SVGs: heygen-logo.svg actually contained the Google
    wordmark, hubspot-logo.svg contained Trivago, huly-logo.svg
    contained "Kube", heygen-logo.svg → "oogo". Catalog → URL label
    inference (aria-label / nearest-heading / sectionClasses) is too
    drift-prone across partner-logo carousels; content-hash names are
    invariant by construction.

  • contentExtractor.ts: SVG→PNG rasterization via sharp before
    sending to Gemini Vision. Previous path sent raw SVG markup as
    text and hit pure-hallucination output on wordmarks (VIVIENNE
    for HubSpot, "wrestling" for Workday). Vision models can read
    PNG pixels reliably; they cannot mentally render path commands.
    Adds polarity detection (white-glyph vs dark-glyph) so an SVG
    that flattens to a blank PNG against the wrong background gets
    inverted automatically before captioning.

  • contentExtractor.ts: LOGO tag in asset-descriptions.md lines
    when the structural signals fire (independent of Gemini). The
    no-Gemini-key fallback still emits an ⚠ banner + the LOGO-tagged
    lines so agents can grep for logos via filename pattern even
    without Vision.

  • index.ts: asset-descriptions.md header branches on Gemini-key
    presence with an explicit "Vision was OFF, descriptions are
    catalog-derived" warning + a fallback recipe ("open LOGO-tagged
    SVGs in a previewer before referencing"). Progress message also
    reports catalog-fallback mode.

  • capture/assetCataloger.ts + capture/tokenExtractor.ts regex
    escape: `/^https?:\\/\\/[^/]+\\/?$/` inside the page.evaluate
    template literal. The original `/^https?:\/\/[^/]+\/?$/` was
    collapsing `\/` to `/` inside the template (because backslash
    before a non-escape char is consumed), producing a parse error
    on every capture. Captures against heygen.com and huly.io were
    both completely blocked on this until the escape was fixed.
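
The escaping behaviour in the last item can be reproduced outside the browser. The sketch below assumes the page.evaluate source is assembled as a template-literal string; `parsesAsExpression` is a hypothetical helper that stands in for the browser parsing that source.

```typescript
// Template literals "cook" escapes: "\/" is an identity escape, so the
// backslash is dropped before the string is ever handed to the page.
const single = `/^https?:\/\/[^/]+\/?$/`;
// Doubling the backslash keeps one literal backslash in the resulting string.
const doubled = `/^https?:\\/\\/[^/]+\\/?$/`;

// Hypothetical helper: does the string parse as a JavaScript expression?
function parsesAsExpression(source: string): boolean {
  try {
    new Function(`return (${source});`);
    return true;
  } catch {
    return false;
  }
}

console.log(single); // the backslashes are gone: /^https?://[^/]+/?$/
console.log(parsesAsExpression(single));
console.log(parsesAsExpression(doubled));
```

`String.raw` would be another way to keep the backslashes, at the cost of changing how the surrounding template is written.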

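A minimal sketch of the content-hash naming described for assetDownloader.ts. The helper name `svgSlug` is hypothetical, and hashing the raw markup and taking the first 8 hex characters of the SHA-1 digest are assumptions about how the 8-character hash is derived.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: the filename depends only on the SVG bytes and the
// isLogo flag, never on aria-labels, headings, or section classes.
function svgSlug(svgMarkup: string, isLogo: boolean): string {
  const hash = createHash("sha1").update(svgMarkup).digest("hex").slice(0, 8);
  return `${isLogo ? "logo" : "svg"}-${hash}.svg`;
}

const markup = '<svg xmlns="http://www.w3.org/2000/svg"><path d="M0 0h8v8z"/></svg>';
console.log(svgSlug(markup, true));
console.log(svgSlug(markup, false));
```

Because the name is a pure function of the content, two captures of the same SVG get the same filename regardless of where in the page it was found.
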
packages/core/src/lint/utils.ts

  • findRootTag masks <!-- ... -->, <style>...</style>, and
    <script>...</script> ranges before tag extraction. A literal
    <video> token inside a CSS comment (`/* The card uses <video>
    as the surface */`) inside a <style> block was being picked as
    the composition root, producing two cascading false errors
    (root_missing_composition_id + root_missing_dimensions).
    Verified against a synthetic repro plus the real beat that hit
    this. The existing stripJsComments / extractScriptTextsAndSrcs
    exports are preserved. Earlier work-in-progress commits had
    accidentally removed them, but they are consumed by
    lint/rules/{adapters,composition,core,gsap}.ts.
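
The masking idea can be sketched as follows. `maskRange` is assumed here to blank each match with spaces so that string offsets stay stable, and `firstContentTag` is a hypothetical stand-in for the real findRootTag, which is not shown in this PR.

```typescript
// Assumed behaviour: replace each match with spaces of the same length.
function maskRange(html: string, pattern: RegExp): string {
  return html.replace(pattern, (match) => " ".repeat(match.length));
}

function maskNonScannableRanges(html: string): string {
  let out = maskRange(html, /<!--[\s\S]*?-->/g);
  out = maskRange(out, /<style\b[^>]*>[\s\S]*?<\/style>/gi);
  out = maskRange(out, /<script\b[^>]*>[\s\S]*?<\/script>/gi);
  return out;
}

// Hypothetical stand-in for findRootTag: first tag that is not style/script.
function firstContentTag(html: string): string | null {
  const tagPattern = /<([a-zA-Z][\w-]*)/g;
  let m: RegExpExecArray | null;
  while ((m = tagPattern.exec(html)) !== null) {
    const name = m[1].toLowerCase();
    if (name !== "style" && name !== "script") return name;
  }
  return null;
}

const sample =
  "<style>/* The card uses <video> as the surface */ .card{}</style>" +
  '<div data-composition-id="demo"></div>';

console.log(firstContentTag(sample)); // picks the token inside the CSS comment
console.log(firstContentTag(maskNonScannableRanges(sample))); // picks the real root
```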

packages/cli/src/utils/lintProject.ts

  • New lintMissingLocalAsset rule: scans <video>/<img>/<source>
    src attributes for local files that don't exist in the project.
    Uses resolveExistingLocalAsset (same helper stylesheet lint
    uses) so the existence check matches the bundler's notion of
    "resolves" — handles root-absolute "/assets/foo.png" relative
    to projectDir, rejects "../outside.png" that escapes the
    project. The renderer otherwise 404s these silently and ships
    a video with missing visuals. Empirically the most common
    sub-agent mistake across multi-URL runs (~5+ per run).
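
A rough sketch of the existence check this rule relies on. `checkLocalAsset` and its return values are hypothetical; the real rule delegates to resolveExistingLocalAsset, whose signature is not shown here. The `exists` parameter is injectable only so the logic can be exercised without touching the disk.

```typescript
import { existsSync } from "node:fs";
import { isAbsolute, relative, resolve, sep } from "node:path";

type AssetStatus = "external" | "ok" | "missing" | "escapes-project";

// Hypothetical helper, not the project's resolveExistingLocalAsset.
function checkLocalAsset(
  projectDir: string,
  src: string,
  exists: (path: string) => boolean = existsSync,
): AssetStatus {
  // URLs with a scheme (https:, data:, ...) or protocol-relative URLs are not local files.
  if (/^(?:[a-z][a-z0-9+.-]*:|\/\/)/i.test(src)) return "external";
  const root = resolve(projectDir);
  // Root-absolute paths such as "/assets/foo.png" are taken relative to the project.
  const target = src.startsWith("/") ? resolve(root, "." + src) : resolve(root, src);
  const rel = relative(root, target);
  if (rel === ".." || rel.startsWith(".." + sep) || isAbsolute(rel)) {
    return "escapes-project";
  }
  return exists(target) ? "ok" : "missing";
}

// Usage with a fake filesystem containing a single file.
const onDisk = (p: string) => p === resolve("/proj/assets/foo.png");
console.log(checkLocalAsset("/proj", "/assets/foo.png", onDisk));
console.log(checkLocalAsset("/proj", "assets/bar.png", onDisk));
console.log(checkLocalAsset("/proj", "../outside.png", onDisk));
```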

packages/producer/build.mjs

  • ESM banner shims __dirname / __filename from import.meta.url
    alongside the existing createRequire shim. Bundled CJS deps —
    notably the ffmpeg/Emscripten wasm glue, which does
    `scriptDirectory = __dirname + "/"` — were crashing at render
    time with "__dirname is not defined in ES module scope".
    Verified by re-running a producer render after rebuild;
    ffmpeg pipeline completes cleanly.
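
For illustration, a banner along these lines would recreate the CommonJS globals in an ESM bundle. The exact text and the aliased import names are assumptions, not the contents of build.mjs.

```typescript
// Assumed banner text: shims require, __filename and __dirname for bundled
// CJS code that expects them to exist in module scope.
const esmBanner = [
  'import { createRequire as __createRequire } from "node:module";',
  'import { fileURLToPath as __fileURLToPath } from "node:url";',
  'import { dirname as __pathDirname } from "node:path";',
  "const require = __createRequire(import.meta.url);",
  "const __filename = __fileURLToPath(import.meta.url);",
  "const __dirname = __pathDirname(__filename);",
].join("\n");

// With esbuild, text like this is typically supplied through the
// `banner: { js: esmBanner }` build option.
console.log(esmBanner);
```

Aliasing the imports with a double-underscore prefix is one way to lower the chance of colliding with identifiers in the bundled code.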

All targeted tests pass (255/255 across capture / lint / lintProject
/ sfx unit tests). Typecheck clean for @hyperframes/core and
@hyperframes/cli. No happy-path behavior change; each fix targets
a specific observed failure.

Three new CLI surfaces that the website-capture pipeline produces
output for but the CLI couldn't previously consume.

packages/cli/src/commands/capture-video.ts (NEW)

  • On-demand video downloader for entries in
    capture/extracted/video-manifest.json. The capture pipeline
    writes the manifest + preview PNGs but deliberately does NOT
    download the mp4s — a site with 12+ feature videos would
    balloon the capture from ~5 MB to hundreds of MB.

  • SSRF-safe: uses safeFetch() from capture/assetDownloader so
    a malicious manifest URL pointing at 169.254.169.254 (cloud
    metadata) or rfc1918 ranges is rejected, AND every redirect
    hop is re-validated (bare redirect:"follow" would only check
    the initial URL — a public host can 30x to a private one).

  • Content-type and size caps: 250 MB hard cap on body size
    (Content-Length AND actual bytes), strict whitelist of
    video/* + application/{mp4,octet-stream,x-mpegurl} content
    types. Anything else (HTML error page, JSON, tracking pixel)
    aborts cleanly with the actual content-type in the message.

  • Filename sanitization: decode percent-encoding from the
    manifest filename, then strip anything outside [A-Za-z0-9._-]
    before writing.

  • Lists with --list, picks by --index or --url. Idempotent
    (skips if already downloaded).
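
The per-hop validation described above might look roughly like this. It is a sketch under stated assumptions and not the project's safeFetch: `isBlockedAddress`, `assertPublicHost` and `fetchWithHopChecks` are hypothetical names, the blocked ranges are a minimal set, and Node 18+ is assumed for the global `fetch`.

```typescript
import { lookup } from "node:dns/promises";
import { BlockList, isIP } from "node:net";

const blocked = new BlockList();
blocked.addSubnet("10.0.0.0", 8, "ipv4"); // RFC 1918
blocked.addSubnet("172.16.0.0", 12, "ipv4"); // RFC 1918
blocked.addSubnet("192.168.0.0", 16, "ipv4"); // RFC 1918
blocked.addSubnet("169.254.0.0", 16, "ipv4"); // link-local, incl. cloud metadata
blocked.addSubnet("127.0.0.0", 8, "ipv4"); // loopback
blocked.addAddress("::1", "ipv6"); // loopback
blocked.addSubnet("fc00::", 7, "ipv6"); // unique local
blocked.addSubnet("fe80::", 10, "ipv6"); // link-local

function isBlockedAddress(address: string): boolean {
  const family = isIP(address);
  if (family === 0) return true; // not an IP literal: fail closed
  return blocked.check(address, family === 6 ? "ipv6" : "ipv4");
}

async function assertPublicHost(url: URL): Promise<void> {
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`unsupported protocol: ${url.protocol}`);
  }
  const host = url.hostname.replace(/^\[|\]$/g, ""); // strip IPv6 brackets
  const address = isIP(host) ? host : (await lookup(host)).address;
  if (isBlockedAddress(address)) {
    throw new Error(`blocked address ${address} for ${url.href}`);
  }
}

// Follow redirects manually so that every hop is validated, not just the first.
async function fetchWithHopChecks(start: string, maxHops = 5): Promise<Response> {
  let current = new URL(start);
  for (let hop = 0; hop <= maxHops; hop++) {
    await assertPublicHost(current);
    const res = await fetch(current, { redirect: "manual" });
    const location = res.headers.get("location");
    if (res.status < 300 || res.status >= 400 || !location) return res;
    current = new URL(location, current); // resolve relative redirects
  }
  throw new Error("too many redirects");
}
```

This sketch does not cover IPv4-mapped IPv6 addresses or DNS rebinding between the lookup and the request; a production guard would need to handle both.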

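The content-type whitelist, size cap, and filename sanitization can be sketched as three small pure functions. All names are hypothetical. Treating 250 MB as 250 × 1024 × 1024 bytes, dropping leading dots, and falling back to "video" for an empty name are assumptions beyond what the description states.

```typescript
const MAX_BYTES = 250 * 1024 * 1024; // assumption: MB interpreted as MiB

const ALLOWED_APPLICATION_TYPES = [
  "application/mp4",
  "application/octet-stream",
  "application/x-mpegurl",
];

function isAllowedContentType(header: string | null): boolean {
  if (!header) return false;
  const type = header.split(";")[0].trim().toLowerCase(); // drop parameters
  return type.startsWith("video/") || ALLOWED_APPLICATION_TYPES.includes(type);
}

// Check both the declared Content-Length and the bytes actually received.
function overCap(declaredLength: string | null, receivedBytes: number): boolean {
  const declared = declaredLength === null ? 0 : Number(declaredLength);
  return (Number.isFinite(declared) && declared > MAX_BYTES) || receivedBytes > MAX_BYTES;
}

function sanitizeFilename(raw: string): string {
  let decoded = raw;
  try {
    decoded = decodeURIComponent(raw);
  } catch {
    // Malformed percent-escape: fall back to the raw string.
  }
  const cleaned = decoded.replace(/[^A-Za-z0-9._-]/g, "");
  // Extra assumption: drop leading dots so the result is never "." or "..".
  return cleaned.replace(/^\.+/, "") || "video";
}

console.log(isAllowedContentType("video/mp4; codecs=avc1"));
console.log(sanitizeFilename("hero%20clip.mp4"));
console.log(sanitizeFilename("..%2F..%2Fetc%2Fpasswd"));
```
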
packages/cli/src/commands/sfx/* (NEW)

  • `hyperframes sfx search "<description>"` — Gemini-summarized
    semantic search across the HeyGen SFX catalog. Returns ranked
    matches with ids and licensable presigned URLs.

  • `hyperframes sfx add <id>` — downloads the match into the
    project's assets/sfx/ directory and runs ffmpeg loudness
    analysis (LUFS / peak / true-peak / clip detection) so the
    agent can place + balance it without ever hearing it.

  • Catalog manifest + auth flow + search-result caching match
    the existing cloud-client patterns.

  • Path-traversal guard: id is validated as [A-Za-z0-9_-]+ and
    the resolved destination must live under assets/sfx/. A
    crafted "../../etc/passwd" id would otherwise write outside
    the project.

  • Presigned URL refresh: cached URLs older than 10min are
    silently re-fetched before download.
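
The path-traversal guard amounts to two independent checks, sketched below. `resolveSfxDestination` is a hypothetical name, and the `.flac` default extension is taken from the test plan rather than from the command's code.

```typescript
import { resolve, sep } from "node:path";

// Hypothetical helper: validate the id, then confirm the resolved path is
// still inside assets/sfx/ as a second line of defence.
function resolveSfxDestination(projectDir: string, id: string, ext = "flac"): string {
  if (!/^[A-Za-z0-9_-]+$/.test(id)) {
    throw new Error(`invalid sfx id: ${JSON.stringify(id)}`);
  }
  const baseDir = resolve(projectDir, "assets", "sfx");
  const dest = resolve(baseDir, `${id}.${ext}`);
  if (!dest.startsWith(baseDir + sep)) {
    throw new Error(`destination escapes assets/sfx/: ${dest}`);
  }
  return dest;
}

console.log(resolveSfxDestination("/proj", "whoosh_01"));
try {
  resolveSfxDestination("/proj", "../../etc/passwd");
} catch (err) {
  console.log((err as Error).message);
}
```

The id pattern alone already rejects dots and slashes; the containment check is kept so the guard still holds if the pattern is later relaxed.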

packages/cli/src/commands/music/* (NEW)

  • Same shape as sfx but for the music catalog (`type: "music"`).
    Different default volume hint (0.45 for bed vs 0.3 for clip)
    in the success-message embed snippet. Same path-traversal
    guard.

packages/cli/src/cli.ts + help.ts

  • Register the three new subcommands (lazy-loaded) and add them
    to the right `hyperframes --help` groups so they actually
    show up in the help listing.

packages/cli/src/cloud/_gen/{client,types}.ts

  • Regenerated from the cloud OpenAPI. Adds `searchSounds`
    endpoint shape + types for SFX/music catalog. No breaking
    changes to existing endpoints.

package.json + bun.lock

  • Add `puppeteer` to cli deps (capture-video reuses
    capture/assetDownloader which transitively imports it for
    cookie-jar synchronization in safeFetch). bun.lock re-synced
    via `bun install`.

Tests pass (sfx catalog-manifest unit tests + the existing CLI
test suites). Each command has a working --help with examples.
ukimsanov force-pushed the pr/cli-capture-w2h-fixes branch from 9795334 to 5db1909, June 14, 2026 13:59
function maskNonScannableRanges(html: string): string {
let out = maskRange(html, /<!--[\s\S]*?-->/g);
out = maskRange(out, /<style\b[^>]*>[\s\S]*?<\/style>/gi);
out = maskRange(out, /<script\b[^>]*>[\s\S]*?<\/script>/gi);