feat(cli): capture-video on-demand fetcher + capture pipeline robustness#1447
feat(cli): capture-video on-demand fetcher + capture pipeline robustness#1447ukimsanov wants to merge 1 commit into
Conversation
e4e8b44 to
b56b7c5
Compare
| capture: () => import("./commands/capture.js").then((m) => m.default), | ||
| "capture-video": () => import("./commands/capture-video.js").then((m) => m.default), |
There was a problem hiding this comment.
I wonder if capture-video could be done via a subcommand of capture?
capture video
or capture --video
There was a problem hiding this comment.
oh yeah actually capture video should work
There was a problem hiding this comment.
let me do that real quick
b56b7c5 to
8404517
Compare
| const buf = await fetchToBuffer(entry.url); | ||
| // `flag: "wx"` is exclusive-create — atomic check-and-create, no TOCTOU window. | ||
| try { | ||
| writeFileSync(outPath, buf, { flag: "wx" }); |
8404517 to
51c9c6c
Compare
For the hyperframes.dev website-to-video flow. Real-AI-test runs against
heygen.com, huly.io, and heygen-showcase surfaced two gaps: (1) capture's
logo / asset-captioning signals missed modern React/Tailwind builds; and
(2) there was no CLI surface to pull the videos the manifest references.
New command:
• `hyperframes capture-video <project>` — on-demand downloader for
entries in capture/extracted/video-manifest.json. Capture writes the
manifest + preview PNGs but skips the mp4s; this pulls one entry by
`--index N` (matched against the entry's `index` field, NOT array
offset — gaps are possible when a preview screenshot fails). SSRF-safe
via safeFetch, 250 MB cap, content-type whitelist, race-free
exclusive-create write. Layout-aware (handles both standalone capture
and W2H project layouts).
Capture pipeline fixes:
• Structural logo signals (assetCataloger + tokenExtractor): inBanner /
inHomeLink / matchesTitleBrand. Class-substring alone caught 0/32 SVGs
on heygen.com — modern builds don't put 'logo' / 'brand' in any
className.
• Content-hash SVG slugs (assetDownloader): `svg-<8char-sha1>.svg` —
label-derived slugs mis-attributed partner-logo carousels
(heygen-logo.svg actually contained Google, hubspot-logo.svg contained
Trivago, etc.). Content-hash names are invariant by construction.
• SVG → PNG rasterization before Gemini Vision (contentExtractor): the
raw-SVG-as-text path was hallucinating wordmarks (VIVIENNE for HubSpot,
'wrestling' for Workday). Adds polarity detection so a white-glyph SVG
flattened to a blank PNG gets inverted before captioning. LOGO tag in
asset-descriptions.md when structural signals fire (independent of
Gemini key presence).
• Double-escape \/ inside the page.evaluate template literal in
assetCataloger + tokenExtractor: the original `/^https?:\/\/.../`
collapsed to `/` mid-template and threw `Unexpected token ^`. Capture
was 100% blocked on this until the escape was fixed.
• `asset-descriptions.md` header branches on Gemini-key presence with
an explicit 'Vision OFF — catalog-derived descriptions' warning.
New lint rule:
• `lintMissingLocalAsset` (cli/utils/lintProject): scans <video> / <img>
/ <source> src for local files that don't exist in the project.
Empirically the most common sub-agent mistake across multi-URL runs
(~5+ per run). Uses `resolveExistingLocalAsset` so the existence check
matches the bundler's notion of 'resolves'. Masks comment / style /
script ranges before scanning so a literal `<img src=missing.png>`
inside a tutorial comment isn't reported.
Tests: 17 new for capture-video (safeFilename decoding/sanitization,
VIDEO_CONTENT_TYPE_RE accept/reject, pickManifestEntry index-field lookup
with gaps, URL-mismatch + bad-index rejection, --index over --url
priority); 70 cases under lintProject.test.ts covering the new rule and
existing rules.
Sibling PRs in this stack:
• #PR_A1 — fix(producer): __dirname ESM banner shim
• #PR_A2 — fix(core/lint): findRootTag masks comment/style/script
51c9c6c to
343d5ac
Compare
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
Verified at HEAD 343d5ac — layering on @via's review at b56b7c5 and James's inline ask.
Verdict: approve-with-concerns. James's "make it a capture subcommand" landed cleanly; 2 of @via's 6 items resolved, 2 partial, 2 open. None blocking, but the buffering-vs-streaming gap is worth a follow-up since it shifts from "cosmetic duplication" to "actual OOM surface" on a hostile CDN.
Status table
James's ask — capture-video → capture video subcommand
| ✅ | subCommands: { video: ... } wired in capture.ts; "capture-video" entry removed from cli.ts registry; file moved to commands/capture/video.ts; example updated to hyperframes capture video ./linear-video --index 0. Clean follow-through. |
@via's 6 findings at HEAD
| # | Finding | Status | Notes |
|---|---|---|---|
| 1 | Duplicate download primitive (buffers full 250 MB via arrayBuffer() vs cloud/download.ts:downloadToFile's streaming) |
❌ open | Still Buffer.from(await r.arrayBuffer()) at video.ts:55. See concern below — this is now a real safety gap, not just a duplication. |
| 2 | run() untested (manifest discovery, --list, EEXIST short-circuit, exit codes) |
❌ open | video.test.ts still covers 4 helpers; integration surface uncovered. Acceptable for an author-time CLI but worth a follow-up. |
| 3 | maskNonScannableRanges masking has no regression test |
✅ resolved | New case at lintProject.test.ts: "does not flag <img>/<video> tokens inside <!-- -->, <style>, or <script>". Good. |
| 4 | SVG polarity light/dark fill regex asymmetric | Light glyph regex broadened to #fff(fff)?|white|#[ef][ef][ef]|#[ef]{6} — now catches #eee, #efe, etc. Still misses CSS named colors (gainsboro, silver), rgb()/hsl(), and class-driven fills. Light/dark sides still asymmetric in shape, but the heygen-on-#eee failure mode @via flagged is now caught. |
|
| 5 | Sharp failure swallows to empty caption silently | Now increments svgsSkipped and emits progress("design", "skipped rasterizing N SVG(s) — fell back to label-derived") after the loop. User learns something failed. Open: bare catch {} still drops the actual error, so when one user reports "Vision says X" you can't diagnose which SVG broke or why. Suggest logging err.message per skipped file at progress("design", ...) verbosity. |
|
| 6 | CodeQL network data written to file needs suppression w/ rationale |
✅ resolved | video.ts:223 now reads // lgtm[js/http-to-file-access] — manifest-vetted URL, content-type whitelist, 250MB cap, SSRF-safe fetch. Rationale explicit. |
My additional findings (independent read)
Concerns
• video.ts:55 — Buffering 250 MB before checking byteLength is a real OOM surface, not just cosmetic. MAX_VIDEO_BYTES content-length pre-check (line ~38) is advisory: a hostile or misconfigured CDN can send content-length: 100 and then stream 2 GB. The post-arrayBuffer() byteLength check fires AFTER node has buffered the whole response — process memory has already spiked. Streaming with a running byte cap (the shape cloud/download.ts:downloadToFile uses) would refuse mid-stream. This strengthens @via's #1 from "duplicate primitive" to "shared streaming primitive would actually be safer." Not blocking for v1 against trusted CDNs, but flag for a follow-up before this is used against arbitrary user-controlled manifests.
• video.ts:217 — safeFilename is non-injective; collisions silently EEXIST. decodeURIComponent + [^A-Za-z0-9._-]+ → _ collapse means "hero 1.mp4" and "hero_1.mp4" both produce "hero_1.mp4". If a manifest has two such entries, the second writeFileSync({ flag: "wx" }) throws EEXIST and the code logs "already downloaded: ... (skipping)" — misleading because the bytes on disk are entry A's, not entry B's. The PR's own test covers "a b___c" → "a_b___c", which proves the collapse is lossy. Suggest prefixing the filename with entry.index (e.g. 0-hero_1.mp4) so the manifest's own gap-tolerant key disambiguates on disk too.
• commands/capture.ts:56 — Manifest schema is duck-typed. JSON.parse then directly indexed via manifest.map((e) => e.index) etc. If capture's manifest shape ever changes (extra fields fine, but renamed index → entryIndex, or wrapped in { entries: [...] }), capture-video silently breaks or misreports. No schemaVersion on the manifest, no shape check on read. Author-time tool, so a runtime crash is acceptable — but a one-line "expected manifest version N" check on read would future-proof this cheaply.
Nits
• commands/capture.ts:56 — if (url === "video") return; early-return relies on citty's "parent run fires after subcommand routing" behavior. Comment is helpful but consider: if a user ever captures https://video.example.com and the URL is normalized somewhere upstream, this could short-circuit incorrectly. Worth a if (args._?.[0] === "video") return;-style guard against subCommand name only (not URL value). (nit)
• contentExtractor.ts:285 — bare catch {} (no (err)) means sharp's real failure cause is unrecoverable for diagnosis. Combined with the new svgsSkipped count, you get "I skipped 3 SVGs" but no breadcrumb to which SVGs or why. Pair with @via's #5 fix: at minimum catch (err) and emit a progress("design", ...) line per file. (nit, builds on Via #5)
• video.ts:223 — synchronous writeFileSync of a 250 MB Buffer blocks the event loop for the duration of the disk write. Fine for a CLI, but if this primitive ever gets reused server-side, prefer the fs/promises writeFile async version. (nit, forward-looking)
Questions
• video.ts:38 — application/octet-stream in the accept-list is broad. A lot of static-asset CDNs do send application/octet-stream for .mp4, so this is pragmatic — but it also means any 500-page HTML error served with that mimetype passes the check. Do you have a fallback "first 4 bytes are an mp4/webm/mov magic header" sniff, or is the URL trust model strong enough that magic-byte verification isn't worth the complexity?
• Cross-package coupling — @via flagged the download primitive duplication. Are there other ones emerging? sharp is now used in contentExtractor.ts for SVG rasterization; is it used anywhere else in the cli/core packages? If so, a packages/core/src/imageOps.ts-shaped landing zone might be worth carving before the third caller appears.
What I didn't verify
• I didn't run the new test suite locally — assumed the green CI claim in the PR body is accurate.
• I didn't manually exercise capture video against a real captured project; trusted the manual test plan in the PR body and the helper-level unit coverage.
• I didn't trace the new beats CLI surface that landed alongside this PR (visible in the b56b7c5..343d5ac compare diff) — it's adjacent feature work, not the capture-video review scope. Worth a separate look if it's in this PR's surface.
Review by Rames D Jusso
What
hyperframes capture video <project>subcommand — on-demand mp4 fetch from the capture manifest.lintMissingLocalAssetrule — flags<video>/<img>/<source>srcpointing at missing local files.Why
heygen-logo.svgon main today contains Google's wordmark — label-derived slugs picked the wrong DOM context. The newwebsite-to-videoskill inherits this and would ship off-brand videos. Same root for: class-substringisLogocatching 0/32 SVGs on heygen.com, Vision hallucinating wordmarks from raw SVG path text, and capture being 100% blocked by a\/regex collapse inpage.evaluate.How
capture video: nested undercaptureper the team'sauth/cloud/lambdasubcommand pattern.safeFetch(SSRF + redirect-hop revalidation), 250 MB cap, content-type whitelist,flag: "wx"exclusive-create write, lookup byentry.index(not array offset — manifest can have gaps).svg-<8-char-sha1>.svg) so filename can never drift from content.sharprasterizes SVGs to PNG before Vision; polarity detection flips white-glyph SVGs to a dark background.lintMissingLocalAssetmasks<!-- -->/<style>/<script>ranges before scanning so commented-out examples aren't false positives.Test plan
bun run --filter @hyperframes/cli typecheck+testpass (17capture/videocases + 51lintProjectcases)hyperframes capture video <dir> --list+--index 0downloads mp4 from heygen + notion + webm from airbnb — all valid perfile(1)