From dcc9c9a54feee7bff9e4a2a13f0452cf8f0c9a66 Mon Sep 17 00:00:00 2001
From: James
Date: Thu, 14 May 2026 01:16:30 +0000
Subject: [PATCH 1/3] fix(heygen): document v3 asset-upload requirement for local files
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The HeyGen Codex plugin's 'Try in Chat' default flow steered users into
uploading a local photo, but the HeyGen app connector rejects local
file:// paths; it only accepts hosted HTTPS URLs or existing asset_id
values. The skill never performed the intermediate upload-to-asset-id
step required by HeyGen's docs, so the default flow failed consistently
after auth.

Changes:
- Drop the broken 'from this photo' defaultPrompt; lead with
  description-based creation, which works end-to-end.
- Document the POST https://api.heygen.com/v3/assets upload flow (curl +
  multipart/form-data) in both skill SKILL.md files and the asset-routing
  references. Skills now instruct the agent to upload local files first
  and pass the returned asset_id into the app/CLI creation call.
- Add a troubleshooting entry in each skill for the local-file-rejection
  symptom, with the upload-then-asset_id fix recipe.
- Add privacyPolicyURL and termsOfServiceURL to plugin.json (resolves
  plugin-eval failures).
- Bump plugin version 2.2.0 -> 2.2.1.

Product follow-up (outside this repo): add a file-upload tool to the
HeyGen MCP connector, or auto-promote local file inputs to asset_id
before calling the photo/avatar tools.
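
Illustrative sketch of the upload-then-asset_id routing described in this
message. It is not part of the patch. The helper names classify_media_ref,
upload_asset, and asset_ref_json are hypothetical, and how the connector
recognizes an existing asset_id is not specified here, so anything that is
neither an HTTPS URL nor an obvious local path is reported as "other". The
curl call mirrors the one documented in the skill files and assumes
HEYGEN_API_KEY is set and jq is installed.

```bash
# Hypothetical helper: decide how a media input should be routed before it
# reaches the HeyGen app connector.
#   url    -> hosted HTTPS URL, can be passed through as-is
#   upload -> local path or file:// URL, must be uploaded to an asset_id first
#   other  -> anything else (for example an existing asset_id); caller decides
classify_media_ref() {
  case "$1" in
    https://*)            echo "url" ;;
    file://*|/*|./*|../*) echo "upload" ;;
    *)                    echo "other" ;;
  esac
}

# Upload a local file and print the returned asset_id. Same request as the
# curl example in the skill docs; needs network access and HEYGEN_API_KEY.
upload_asset() {
  curl -s -X POST "https://api.heygen.com/v3/assets" \
    -H "X-Api-Key: $HEYGEN_API_KEY" \
    -F "file=@$1" | jq -r '.data.asset_id'
}

# Build the asset_id media reference that the creation call expects.
asset_ref_json() {
  printf '{"type":"asset_id","asset_id":"%s"}\n' "$1"
}

# Usage (illustrative only, not executed here):
#   if [ "$(classify_media_ref "$photo")" = "upload" ]; then
#     asset_ref_json "$(upload_asset "$photo")"
#   fi
```

Only the classification and the JSON builder run offline; the upload step
depends on the live API and on the response shape (data.asset_id) documented
in the skill files.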
--- plugins/heygen/.codex-plugin/plugin.json | 10 ++++--- plugins/heygen/README.md | 4 ++- plugins/heygen/skills/heygen-avatar/SKILL.md | 30 ++++++++++++++----- .../skills/heygen-avatar/agents/openai.yaml | 2 +- .../heygen-avatar/references/asset-routing.md | 24 +++++++++++---- .../references/avatar-creation.md | 15 +++++----- .../references/troubleshooting.md | 10 +++++++ plugins/heygen/skills/heygen-video/SKILL.md | 13 ++++---- .../heygen-video/references/asset-routing.md | 24 +++++++++++---- .../references/troubleshooting.md | 10 +++++++ 10 files changed, 103 insertions(+), 39 deletions(-) diff --git a/plugins/heygen/.codex-plugin/plugin.json b/plugins/heygen/.codex-plugin/plugin.json index 5400e5e9..4645a19e 100644 --- a/plugins/heygen/.codex-plugin/plugin.json +++ b/plugins/heygen/.codex-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "heygen", - "version": "2.2.0", - "description": "Create HeyGen avatar videos and personalized video messages. Build a persistent digital identity from a photo, then generate presenter-led videos with your digital twin.", + "version": "2.2.1", + "description": "Create HeyGen avatar videos and personalized video messages. Build a persistent digital identity from a description or hosted photo URL, then generate presenter-led videos with your digital twin.", "author": { "name": "HeyGen", "email": "developers@heygen.com", @@ -27,13 +27,15 @@ "interface": { "displayName": "HeyGen", "shortDescription": "Avatar videos and personalized video messages", - "longDescription": "HeyGen Skills give your agent a face, a voice, and the ability to send video like a message. 
Use heygen-avatar to build a persistent digital identity from a photo and pick a voice, then heygen-video to generate identity-first presenter videos via the HeyGen v3 Video Agent pipeline (avatar resolution, aspect ratio correction, prompt engineering, and voice selection are handled automatically).", + "longDescription": "HeyGen Skills give your agent a face, a voice, and the ability to send video like a message. Use heygen-avatar to build a persistent digital identity from a written description or hosted photo URL and pick a voice, then heygen-video to generate identity-first presenter videos via the HeyGen v3 Video Agent pipeline (avatar resolution, aspect ratio correction, prompt engineering, and voice selection are handled automatically).", "developerName": "HeyGen", "category": "Design", "capabilities": ["Read", "Write"], "websiteURL": "https://heygen.com", + "privacyPolicyURL": "https://www.heygen.com/privacy", + "termsOfServiceURL": "https://www.heygen.com/terms", "defaultPrompt": [ - "Create my HeyGen avatar from this photo", + "Create my HeyGen avatar from a written description", "Make a 30-second intro video of myself", "Send a video update to my team about this week's progress" ], diff --git a/plugins/heygen/README.md b/plugins/heygen/README.md index 3d6d58d9..f7dcbc17 100644 --- a/plugins/heygen/README.md +++ b/plugins/heygen/README.md @@ -6,7 +6,7 @@ OpenAI Codex plugin for [HeyGen](https://heygen.com) — create AI avatar videos Two skills that chain together: -- **heygen-avatar** — turn a photo into a persistent digital twin. Handles avatar lookup, instant-avatar creation, voice selection (or voice cloning), and writes an `AVATAR` file the video skill reads back. +- **heygen-avatar** — create a persistent digital twin from a written description or a hosted photo URL. Handles avatar lookup, avatar creation, voice selection (or voice cloning), and writes an `AVATAR` file the video skill reads back. 
- **heygen-video** — generate identity-first presenter videos via the HeyGen v3 Video Agent pipeline. Encodes the prompting, asset routing, aspect-ratio correction, and avatar/voice resolution that good HeyGen videos need. - **HeyGen app reference** — `.app.json` points at the curated [HeyGen ChatGPT app](https://chatgpt.com/apps/heygen/asdk_app_69418aad55e08191aa5e437b649ca2e4). @@ -16,6 +16,8 @@ Installing the plugin connects the HeyGen ChatGPT app automatically (OAuth on fi If you'd rather not use the app, the skills also support the HeyGen CLI: install it from and export `HEYGEN_API_KEY` (get one at ). +Local file upload note: the current HeyGen app connector accepts hosted HTTPS media URLs or existing HeyGen `asset_id` values for avatar/photo creation. It does not upload local `file://` paths directly. For local photos or videos, upload first with `heygen asset create --file ` or `POST https://api.heygen.com/v3/assets` using `multipart/form-data`, then pass the returned `asset_id` into the app or CLI creation flow. + ## Source of truth The skills are authored in [`heygen-com/skills`](https://github.com/heygen-com/skills) (under `heygen-avatar/` and `heygen-video/` at the repo root) and mirrored here. The main structural delta in this mirror is the wrapping `skills/` parent directory required by the Codex plugin convention. File issues about skill content on that repo. diff --git a/plugins/heygen/skills/heygen-avatar/SKILL.md b/plugins/heygen/skills/heygen-avatar/SKILL.md index bc4438e8..67eb2a14 100644 --- a/plugins/heygen/skills/heygen-avatar/SKILL.md +++ b/plugins/heygen/skills/heygen-avatar/SKILL.md @@ -37,7 +37,7 @@ This skill reads and writes the following. 
No other files are accessed without e | Write | `AVATAR-.md` | Save new avatar identity after creation | | Write | `AVATAR-AGENT.md`, `AVATAR-USER.md` (symlinks) | Role aliases, see Phase 5 | | Temp write | `/tmp/heygen/uploads/` | Voice preview audio (downloaded for user playback, deleted after session) | -| Remote upload | HeyGen (via the app or `heygen asset create`) | User-provided photos uploaded to HeyGen for digital-twin creation | +| Remote upload | HeyGen (via CLI/API asset upload) | Local photos/videos uploaded to HeyGen before avatar creation | Assets are only uploaded to HeyGen when the user explicitly provides them. @@ -86,9 +86,11 @@ Try to read `SOUL.md` from the workspace root. **HeyGen app (preferred):** If the HeyGen app is available through the installed app integration, use it. The app authenticates via OAuth and runs against the user's existing HeyGen plan credits. +**Local media limitation:** The current HeyGen app connector does NOT upload local files. Its photo/video/avatar tools accept only hosted HTTPS URLs or existing HeyGen `asset_id` values. Never pass `file://`, absolute local paths, or Codex attachment paths directly to app tools. For local photos/videos, first upload with `heygen asset create --file ` or `POST https://api.heygen.com/v3/assets` using `multipart/form-data`; then pass `{ "type": "asset_id", "asset_id": "..." }` to the app or CLI creation call. If no CLI/API upload path is available, ask the user for an HTTPS image URL or permission to continue with prompt-only creation. + **CLI fallback:** If the app is not available AND the `heygen` binary runs cleanly (`heygen --version` exits 0), use it. Auth: `HEYGEN_API_KEY` env OR `heygen auth login` (persists to `~/.heygen/credentials`). If the CLI is missing, errors on `--version`, or auth is not set, silently skip it. 
-**Neither available:** Only if the app is unavailable AND the CLI doesn't work, tell the user once: "To use this skill, connect the HeyGen app or install the HeyGen CLI: `curl -fsSL https://static.heygen.ai/cli/install.sh | bash` then `heygen auth login`." +**Neither available:** Only if the app is unavailable AND the CLI doesn't work, tell the user once: "To use this skill, connect the HeyGen app or install the HeyGen CLI: `curl -fsSL https://static.heygen.ai/cli/install.sh | bash` then `heygen auth login`." If the only missing capability is local media upload, say that local photos need an HTTPS URL or a CLI/API asset upload first. **API:** v3 only. Never call v1 or v2 endpoints. @@ -229,7 +231,9 @@ Only run this step when Phase 0 target = **user** (real-person digital twin) OR - Otherwise, ask one sentence: *"Got a headshot? It gives better face consistency for videos of you. I can also generate from your description — just say 'skip.'"* Branch: -- **Photo provided** → upload via the HeyGen app or `heygen asset create --file `, then Type B (photo) creation in Phase 2. +- **Photo provided as local file/path** → upload via `heygen asset create --file ` or `POST https://api.heygen.com/v3/assets`, then Type B (photo) creation with the returned `asset_id`. +- **Photo provided as HTTPS URL or asset_id** → Type B (photo) creation in Phase 2. +- **Local photo but no upload path available** → ask for an HTTPS image URL or offer prompt-only creation. Do not pass the local path into the app connector. - **Skip** → Type A (prompt) creation in Phase 2. For agents and named characters, skip this entire step — go straight to Type A (prompt) creation. @@ -258,15 +262,25 @@ Prompt limit is 1000 characters. Be descriptive — include style, features, exp **Type B — From reference image:** -**App:** use the HeyGen app flow for photo avatar creation. 
-**CLI:** `heygen avatar create -d '{"type":"photo","name":"...","file":{"type":"url","url":"..."},"avatar_group_id":"..."}'` +**App:** use the HeyGen app flow for photo avatar creation only with an HTTPS URL or pre-uploaded `asset_id`. +**CLI:** `heygen avatar create -d '{"type":"photo","name":"...","file":{"type":"asset_id","asset_id":"..."},"avatar_group_id":"..."}'` File options for Type B: - `{ "type": "url", "url": "https://..." }` — public image URL - `{ "type": "asset_id", "asset_id": "" }` — from `heygen asset create --file ` -- `{ "type": "base64", "media_type": "image/png", "data": "" }` — inline -📖 **When to use each (URL vs asset_id vs base64), upload routing, and edge cases → [references/asset-routing.md](references/asset-routing.md)** +Do not pass local paths or `file://` URLs to the app connector. Upload local files to an `asset_id` first. + +Raw API upload example: +```bash +ASSET_ID=$(curl -s -X POST "https://api.heygen.com/v3/assets" \ + -H "X-Api-Key: $HEYGEN_API_KEY" \ + -F "file=@/path/to/headshot.jpg" | jq -r '.data.asset_id') +``` + +The v3 upload endpoint accepts `multipart/form-data`, auto-detects MIME type from file bytes, and returns `data.asset_id`. + +📖 **When to use each (URL vs asset_id), upload routing, and edge cases → [references/asset-routing.md](references/asset-routing.md)** **Response:** Returns `avatar_item.id` (look ID) and `avatar_item.group_id` (character identity). @@ -446,7 +460,7 @@ simply `cat AVATAR-AGENT.md` and get whatever the current agent's avatar is. 
- Missing SOUL.md/IDENTITY.md → conversational onboarding, write AVATAR file from answers - API fails → retry once, then ask user to check API key - Voice match poor → show all available voices, let user browse -- Asset upload fails → skip reference image, try prompt-only creation +- Asset upload unavailable or fails → ask for an HTTPS URL or skip reference image and try prompt-only creation - Existing avatar file with stale HeyGen IDs → offer to regenerate or keep 📖 **Known issues, retry patterns, broken voice previews, error → action mapping → [references/troubleshooting.md](references/troubleshooting.md)** diff --git a/plugins/heygen/skills/heygen-avatar/agents/openai.yaml b/plugins/heygen/skills/heygen-avatar/agents/openai.yaml index 2cbb19ba..d621110b 100644 --- a/plugins/heygen/skills/heygen-avatar/agents/openai.yaml +++ b/plugins/heygen/skills/heygen-avatar/agents/openai.yaml @@ -1,4 +1,4 @@ interface: display_name: "HeyGen Avatar" short_description: "Create reusable HeyGen avatar identities" - default_prompt: "Create a reusable HeyGen avatar for me from a photo or written description, then help me choose a matching voice." + default_prompt: "Create a reusable HeyGen avatar for me from a written description, then help me choose a matching voice." diff --git a/plugins/heygen/skills/heygen-avatar/references/asset-routing.md b/plugins/heygen/skills/heygen-avatar/references/asset-routing.md index 04921e91..9eeda95b 100644 --- a/plugins/heygen/skills/heygen-avatar/references/asset-routing.md +++ b/plugins/heygen/skills/heygen-avatar/references/asset-routing.md @@ -7,7 +7,7 @@ When the user provides files, URLs, or references, route each asset to the right | Path | What happens | When to use | |------|-------------|-------------| | **A: Contextualize → Prompt** | Read/analyze the asset, extract key info, bake into script. Video Agent never sees the original. 
| Reference material, auth-walled content, documents where the *information* matters more than the *visual*. | -| **B: Attach to API** | Upload the raw file via `files[]`. Video Agent analyzes, extracts graphics, uses as frames/B-roll. | Screenshots, branded assets, PDFs with important visual layouts, images the viewer should literally see. | +| **B: Attach to API** | Attach a file reference via `files[]` (`asset_id` or HTTPS URL). Video Agent analyzes, extracts graphics, uses as frames/B-roll. | Screenshots, branded assets, PDFs with important visual layouts, images the viewer should literally see. | | **A+B: Both** | Contextualize for script quality AND attach for visual use. | Long docs where you need to summarize but Video Agent should also have the full source. | ## Classification Flow @@ -50,20 +50,31 @@ When the user provides files, URLs, or references, route each asset to the right - Weave naturally into the script. Don't dump. Integrate. ### Path B (Attach) -Upload to HeyGen: +Upload local files to HeyGen before passing them to avatar or video tools: -**App:** upload through the HeyGen app's asset flow when available. -**CLI:** `heygen asset create --file /path/to/file.png` +**Important:** the current HeyGen app connector does not upload local files. It accepts hosted HTTPS URLs or existing HeyGen `asset_id` values. Never pass `file://`, absolute local paths, or Codex attachment paths directly to app tools. + +**CLI/API:** `heygen asset create --file /path/to/file.png` or `POST https://api.heygen.com/v3/assets` Max 32MB per file. Returns JSON with the new `asset_id`. -Or pass inline in `files[]`: +Raw API upload: +```bash +ASSET_ID=$(curl -s -X POST "https://api.heygen.com/v3/assets" \ + -H "X-Api-Key: $HEYGEN_API_KEY" \ + -F "file=@/path/to/file.png" | jq -r '.data.asset_id') +``` + +`POST /v3/assets` uses `multipart/form-data`, auto-detects MIME type from file bytes, and returns `data.asset_id`. 
+ +Then pass one of these media references: ```json {"type": "url", "url": "https://example.com/image.png"} {"type": "asset_id", "asset_id": ""} -{"type": "base64", "data": "", "content_type": "image/png"} ``` +If a local file is provided and no CLI/API upload path is available, ask the user for an HTTPS URL or continue without the reference image. Do not retry with the raw local path. + ### Describe Asset Usage in Prompt Be SPECIFIC: - "Use the uploaded dashboard screenshot as B-roll when discussing analytics" @@ -84,3 +95,4 @@ In the learning log entry, record: - **URLs that fail:** Try the environment's standard web/content fetch capability. If login/paywall/404 → tell the user, ask for content directly. Never silently fabricate. - **HTML URLs cannot go in `files[]`.** Video Agent rejects `text/html`. Web pages are ALWAYS Path A only. - **Prefer download→upload→asset_id** over `files[]{url}`. HeyGen's servers often blocked by CDN/WAF. +- **Local paths must become asset IDs first.** App tools reject local file references. diff --git a/plugins/heygen/skills/heygen-avatar/references/avatar-creation.md b/plugins/heygen/skills/heygen-avatar/references/avatar-creation.md index 2899539e..ebc196bc 100644 --- a/plugins/heygen/skills/heygen-avatar/references/avatar-creation.md +++ b/plugins/heygen/skills/heygen-avatar/references/avatar-creation.md @@ -32,14 +32,16 @@ change). Only use Mode 1 (new character) for genuinely new identities. ### Photo avatar (from user's photo) -**App:** use the HeyGen app flow for photo avatar creation. +**App:** use the HeyGen app flow for photo avatar creation only when the photo is a hosted HTTPS URL or an existing HeyGen `asset_id`. The app connector does not upload local paths. + +**Local file:** first run `heygen asset create --file ` or `POST https://api.heygen.com/v3/assets`, then use the returned `asset_id`. 
**CLI:** ```bash heygen avatar create -d '{ "type": "photo", "name": "My Avatar", - "file": {"type": "url", "url": "https://example.com/headshot.jpg"}, + "file": {"type": "asset_id", "asset_id": ""}, "avatar_group_id": "" }' ``` @@ -72,7 +74,7 @@ Optional: up to 3 `reference_images` to anchor the generated appearance. ### Video avatar / digital twin (from a short recording) -**App:** use the HeyGen app flow for digital-twin creation from video. +**App:** use the HeyGen app flow for digital-twin creation from video only when the video is a hosted HTTPS URL or an existing HeyGen `asset_id`. Upload local recordings to `asset_id` first. **CLI:** ```bash @@ -88,7 +90,7 @@ heygen avatar create -d '{ ## File Input Formats -`file` accepts three forms: +`file` accepts these app-safe forms: ```jsonc // Public URL (no auth, no paywall) @@ -96,11 +98,10 @@ heygen avatar create -d '{ // Pre-uploaded asset (from `heygen asset create --file `) { "type": "asset_id", "asset_id": "" } - -// Inline base64 -{ "type": "base64", "data": "", "content_type": "image/png" } ``` +Do not pass local paths or `file://` URLs to the app connector. The broader API/CLI may support additional encodings, but local files should be converted to `asset_id` first for this plugin flow. + For when each is appropriate, see [`references/asset-routing.md`](asset-routing.md). diff --git a/plugins/heygen/skills/heygen-avatar/references/troubleshooting.md b/plugins/heygen/skills/heygen-avatar/references/troubleshooting.md index 224e2bff..ada183e0 100644 --- a/plugins/heygen/skills/heygen-avatar/references/troubleshooting.md +++ b/plugins/heygen/skills/heygen-avatar/references/troubleshooting.md @@ -69,6 +69,16 @@ Video Agent rejects `text/html` content type in the `files[]` array. 
Web pages ( --- +## Local File Paths Rejected by App Connector + +**Symptom:** Photo/avatar creation fails with an error saying the connector rejected a local photo path or only accepts HTTPS image URLs / existing HeyGen `asset_id` values. + +**Root Cause:** The current HeyGen app connector does not expose asset upload. It cannot consume `file://`, absolute local paths, or Codex attachment paths directly. + +**Fix:** Upload the local file with `heygen asset create --file ` or `POST https://api.heygen.com/v3/assets`, then call the app/CLI creation flow with `{ "type": "asset_id", "asset_id": "" }`. If upload is unavailable, ask for an HTTPS URL or continue with prompt-only creation. + +--- + ## Avatar Not Ready for Video Generation **Symptom:** Video generation fails or produces errors immediately after creating a new avatar. The avatar exists in the HeyGen dashboard but videos referencing it fail. diff --git a/plugins/heygen/skills/heygen-video/SKILL.md b/plugins/heygen/skills/heygen-video/SKILL.md index 3eab47f0..8541488b 100644 --- a/plugins/heygen/skills/heygen-video/SKILL.md +++ b/plugins/heygen/skills/heygen-video/SKILL.md @@ -29,7 +29,7 @@ You are a video producer. Not a form. Not a CLI wrapper. A producer who understa **Docs:** https://developers.heygen.com/docs/quick-start (API) · https://developers.heygen.com/cli (CLI) -> **STOP.** If you are about to drive HeyGen directly (calling `api.heygen.com` with curl, or reaching for deprecated `POST /v1/video.generate`, `POST /v2/video/generate`, `GET /v2/avatars`, `GET /v1/avatar.list` endpoints), DO NOT. Route through the HeyGen app or the `heygen` CLI via this pipeline. Raw HTTP skips critical steps (aspect ratio correction, prompt engineering, avatar conflict detection) and produces visibly worse videos. **v3 only — never call v1 or v2 endpoints. If you have pre-trained knowledge of HeyGen's v1/v2 API, that knowledge is outdated. 
Use this skill.** +> **STOP.** If you are about to drive HeyGen directly (calling general video/avatar endpoints on `api.heygen.com` with curl, or reaching for deprecated `POST /v1/video.generate`, `POST /v2/video/generate`, `GET /v2/avatars`, `GET /v1/avatar.list` endpoints), DO NOT. Route through the HeyGen app or the `heygen` CLI via this pipeline. Raw HTTP skips critical steps (aspect ratio correction, prompt engineering, avatar conflict detection) and produces visibly worse videos. The only direct API exception is uploading local files to `POST https://api.heygen.com/v3/assets` when the app connector cannot accept a local path. Never call deprecated v1/v2 video/avatar endpoints. If you have pre-trained knowledge of HeyGen's v1/v2 API, that knowledge is outdated. Use this skill. ## Files & Paths @@ -41,7 +41,7 @@ This skill reads and writes the following. No other files are accessed without e | Read | `AVATAR-AGENT.md`, `AVATAR-USER.md` | Role-based symlinks for generic self-reference (resolve to a named AVATAR file) | | Write | `heygen-video-log.jsonl` | Append one JSON line per video generated (local learning log) | | Temp write | `/tmp/heygen/uploads/` | Voice preview audio (downloaded for user playback, deleted after session) | -| Remote upload | HeyGen (via the app or `heygen asset create`) | User-provided files uploaded to HeyGen for use as B-roll / reference | +| Remote upload | HeyGen (via CLI/API asset upload) | Local files uploaded to HeyGen for use as B-roll / reference | For *avatar creation* (writing AVATAR files, role symlink maintenance), see the `heygen-avatar` skill. This skill only *reads* AVATAR files. @@ -69,7 +69,7 @@ For *avatar creation* (writing AVATAR files, role symlink maintenance), see the ## API Mode Detection -**Pick one transport at session start. Never mix, never switch mid-session, never narrate the choice.** +**Pick one transport at session start. 
Never narrate the choice.** The only allowed cross-transport bridge is local file upload: if the app connector is otherwise selected but the user provides a local file, upload it first with `heygen asset create --file ` or `POST https://api.heygen.com/v3/assets`, then pass the resulting `asset_id` back into the app flow. Detect in this order: @@ -79,10 +79,10 @@ Detect in this order: 4. **Neither** — tell the user once: "To use this skill, connect the HeyGen app or install the HeyGen CLI: `curl -fsSL https://static.heygen.ai/cli/install.sh | bash` then `heygen auth login`." **Hard rules:** -- **Never call `curl api.heygen.com/...`** — every mode routes through its own surface. +- **Never call general `curl api.heygen.com/...` video/avatar endpoints.** The only direct API exception is `POST https://api.heygen.com/v3/assets` for local file upload when no app upload tool exists. - **HeyGen app mode:** use the app when available. - **CLI mode:** only use `heygen ...` commands. Run `heygen --help` to discover arguments. -- **Never cross over.** Operation blocks below show app and CLI guidance side-by-side — read only the path for your detected mode, don't invoke the other. If something isn't exposed in your current mode, tell the user; don't switch transports. +- **Do not cross over except for local asset upload.** Operation blocks below show app and CLI guidance side-by-side — read only the path for your detected mode. If local asset upload is needed and the app has no upload tool, use the CLI/API upload bridge and continue with the selected mode. ### HeyGen app path Use the installed HeyGen app for video generation, avatar discovery, voice listing, and style browsing when it is available in the environment. @@ -143,7 +143,7 @@ Interview the user. Be conversational, skip anything already answered. Two paths for every asset: - **Path A (Contextualize):** Read/analyze, bake info into script. For reference material, auth-walled content. 
-- **Path B (Attach):** Upload to HeyGen via `heygen asset create --file ` (or include as `files[]` entries on video-agent create). For visuals the viewer should see. +- **Path B (Attach):** Upload local files to HeyGen via `heygen asset create --file ` or `POST https://api.heygen.com/v3/assets`, then pass the returned `asset_id`. For visuals the viewer should see. - **A+B (Both):** Summarize for script AND attach original. 📖 **Full routing matrix and upload examples → [references/asset-routing.md](references/asset-routing.md)** @@ -151,6 +151,7 @@ Two paths for every asset: **Key rules:** - HTML URLs cannot go in `files[]` (Video Agent rejects `text/html`). Web pages are always Path A. - Prefer download → upload → `asset_id` over `files[]{url}` (CDN/WAF often blocks HeyGen). +- The current HeyGen app connector rejects local `file://` paths. Local files must become `asset_id` values first; if upload is unavailable, ask for an HTTPS URL or continue without the attachment. - If a URL is inaccessible, tell the user. Never fabricate content from an inaccessible source. - **Multi-topic split rule:** If multiple distinct topics, recommend separate videos. diff --git a/plugins/heygen/skills/heygen-video/references/asset-routing.md b/plugins/heygen/skills/heygen-video/references/asset-routing.md index 04921e91..0ff959f6 100644 --- a/plugins/heygen/skills/heygen-video/references/asset-routing.md +++ b/plugins/heygen/skills/heygen-video/references/asset-routing.md @@ -7,7 +7,7 @@ When the user provides files, URLs, or references, route each asset to the right | Path | What happens | When to use | |------|-------------|-------------| | **A: Contextualize → Prompt** | Read/analyze the asset, extract key info, bake into script. Video Agent never sees the original. | Reference material, auth-walled content, documents where the *information* matters more than the *visual*. | -| **B: Attach to API** | Upload the raw file via `files[]`. 
Video Agent analyzes, extracts graphics, uses as frames/B-roll. | Screenshots, branded assets, PDFs with important visual layouts, images the viewer should literally see. | +| **B: Attach to API** | Attach a file reference via `files[]` (`asset_id` or HTTPS URL). Video Agent analyzes, extracts graphics, uses as frames/B-roll. | Screenshots, branded assets, PDFs with important visual layouts, images the viewer should literally see. | | **A+B: Both** | Contextualize for script quality AND attach for visual use. | Long docs where you need to summarize but Video Agent should also have the full source. | ## Classification Flow @@ -50,20 +50,31 @@ When the user provides files, URLs, or references, route each asset to the right - Weave naturally into the script. Don't dump. Integrate. ### Path B (Attach) -Upload to HeyGen: +Upload local files to HeyGen before passing them to avatar or video tools: -**App:** upload through the HeyGen app's asset flow when available. -**CLI:** `heygen asset create --file /path/to/file.png` +**Important:** the current HeyGen app connector does not upload local files. It accepts hosted HTTPS URLs or existing HeyGen `asset_id` values. Never pass `file://`, absolute local paths, or Codex attachment paths directly to app tools. + +**CLI/API:** `heygen asset create --file /path/to/file.png` or `POST https://api.heygen.com/v3/assets` Max 32MB per file. Returns JSON with the new `asset_id`. -Or pass inline in `files[]`: +Raw API upload: +```bash +ASSET_ID=$(curl -s -X POST "https://api.heygen.com/v3/assets" \ + -H "X-Api-Key: $HEYGEN_API_KEY" \ + -F "file=@/path/to/file.png" | jq -r '.data.asset_id') +``` + +`POST /v3/assets` uses `multipart/form-data`, auto-detects MIME type from file bytes, and returns `data.asset_id`. 
+ +Then pass one of these media references: ```json {"type": "url", "url": "https://example.com/image.png"} {"type": "asset_id", "asset_id": ""} -{"type": "base64", "data": "", "content_type": "image/png"} ``` +If a local file is provided and no CLI/API upload path is available, ask the user for an HTTPS URL or continue without the visual attachment. Do not retry with the raw local path. + ### Describe Asset Usage in Prompt Be SPECIFIC: - "Use the uploaded dashboard screenshot as B-roll when discussing analytics" @@ -84,3 +95,4 @@ In the learning log entry, record: - **URLs that fail:** Try the environment's standard web/content fetch capability. If login/paywall/404 → tell the user, ask for content directly. Never silently fabricate. - **HTML URLs cannot go in `files[]`.** Video Agent rejects `text/html`. Web pages are ALWAYS Path A only. - **Prefer download→upload→asset_id** over `files[]{url}`. HeyGen's servers often blocked by CDN/WAF. +- **Local paths must become asset IDs first.** App tools reject local file references. diff --git a/plugins/heygen/skills/heygen-video/references/troubleshooting.md b/plugins/heygen/skills/heygen-video/references/troubleshooting.md index 224e2bff..08739a0e 100644 --- a/plugins/heygen/skills/heygen-video/references/troubleshooting.md +++ b/plugins/heygen/skills/heygen-video/references/troubleshooting.md @@ -69,6 +69,16 @@ Video Agent rejects `text/html` content type in the `files[]` array. Web pages ( --- +## Local File Paths Rejected by App Connector + +**Symptom:** Video creation fails or a `files[]` attachment is rejected because the connector won't accept a local file path or `file://` URL for B-roll, reference images, or screen captures. + +**Root Cause:** The current HeyGen app connector does not expose asset upload. It cannot consume `file://`, absolute local paths, or Codex attachment paths directly. `files[]` only accepts hosted HTTPS URLs or existing HeyGen `asset_id` values. 
+ +**Fix:** Upload the local file with `heygen asset create --file ` or `POST https://api.heygen.com/v3/assets`, then pass `{ "type": "asset_id", "asset_id": "" }` (or the bare `asset_id` string where required) into the video-agent `files[]` array. If upload is unavailable, ask for an HTTPS URL or continue without the visual attachment. + +--- + ## Avatar Not Ready for Video Generation **Symptom:** Video generation fails or produces errors immediately after creating a new avatar. The avatar exists in the HeyGen dashboard but videos referencing it fail. From e2c5a6a1948610920dfde750bb17fc961ec21daa Mon Sep 17 00:00:00 2001 From: James Date: Thu, 14 May 2026 21:25:22 +0000 Subject: [PATCH 2/3] heygen: harden Codex skill reliability and fallback guidance --- plugins/heygen/skills/heygen-avatar/SKILL.md | 4 + .../references/troubleshooting.md | 31 ++++++++ plugins/heygen/skills/heygen-video/SKILL.md | 23 +++++- .../references/avatar-discovery.md | 18 +++++ .../references/troubleshooting.md | 79 +++++++++++++++++++ 5 files changed, 152 insertions(+), 3 deletions(-) diff --git a/plugins/heygen/skills/heygen-avatar/SKILL.md b/plugins/heygen/skills/heygen-avatar/SKILL.md index 67eb2a14..046b869c 100644 --- a/plugins/heygen/skills/heygen-avatar/SKILL.md +++ b/plugins/heygen/skills/heygen-avatar/SKILL.md @@ -86,10 +86,14 @@ Try to read `SOUL.md` from the workspace root. **HeyGen app (preferred):** If the HeyGen app is available through the installed app integration, use it. The app authenticates via OAuth and runs against the user's existing HeyGen plan credits. +**Auth triage (run immediately):** run `command -v heygen` and `heygen auth status`. If app/MCP auth fails but CLI auth is valid, continue in CLI mode for this run. + **Local media limitation:** The current HeyGen app connector does NOT upload local files. Its photo/video/avatar tools accept only hosted HTTPS URLs or existing HeyGen `asset_id` values. 
Never pass `file://`, absolute local paths, or Codex attachment paths directly to app tools. For local photos/videos, first upload with `heygen asset create --file ` or `POST https://api.heygen.com/v3/assets` using `multipart/form-data`; then pass `{ "type": "asset_id", "asset_id": "..." }` to the app or CLI creation call. If no CLI/API upload path is available, ask the user for an HTTPS image URL or permission to continue with prompt-only creation. **CLI fallback:** If the app is not available AND the `heygen` binary runs cleanly (`heygen --version` exits 0), use it. Auth: `HEYGEN_API_KEY` env OR `heygen auth login` (persists to `~/.heygen/credentials`). If the CLI is missing, errors on `--version`, or auth is not set, silently skip it. +In Codex desktop/sandboxed runs, CLI network calls may fail with DNS/network errors until network approval/escalation is granted. Retry with escalation when needed. + **Neither available:** Only if the app is unavailable AND the CLI doesn't work, tell the user once: "To use this skill, connect the HeyGen app or install the HeyGen CLI: `curl -fsSL https://static.heygen.ai/cli/install.sh | bash` then `heygen auth login`." If the only missing capability is local media upload, say that local photos need an HTTPS URL or a CLI/API asset upload first. **API:** v3 only. Never call v1 or v2 endpoints. diff --git a/plugins/heygen/skills/heygen-avatar/references/troubleshooting.md b/plugins/heygen/skills/heygen-avatar/references/troubleshooting.md index ada183e0..a3278b82 100644 --- a/plugins/heygen/skills/heygen-avatar/references/troubleshooting.md +++ b/plugins/heygen/skills/heygen-avatar/references/troubleshooting.md @@ -79,6 +79,37 @@ Video Agent rejects `text/html` content type in the `files[]` array. Web pages ( --- +## App Auth Broken, CLI Auth Works + +**Symptom:** App/MCP calls fail with token invalid/expired errors, while CLI commands work on the same machine. 
+ +**Fix:** Run: +```bash +command -v heygen +heygen auth status +``` +If CLI auth is valid, continue in CLI mode for the current run. + +--- + +## Sandbox DNS/Network Failures in Codex + +**Symptom:** CLI commands fail with DNS/network errors despite valid auth. + +**Root Cause:** Network-restricted sandbox execution. + +**Fix:** Rerun the same command with network approval/escalation. + +--- + +## CLI Telemetry Noise in Sandboxed Runs + +**Symptom:** Analytics/telemetry DNS warnings (for example PostHog) clutter command output. + +**Fix:** If supported by the installed CLI version, disable analytics for agent runs to reduce noise. If not supported, ignore telemetry warnings unless command exit status indicates failure. + +--- + ## Avatar Not Ready for Video Generation **Symptom:** Video generation fails or produces errors immediately after creating a new avatar. The avatar exists in the HeyGen dashboard but videos referencing it fail. diff --git a/plugins/heygen/skills/heygen-video/SKILL.md b/plugins/heygen/skills/heygen-video/SKILL.md index 8541488b..50132e13 100644 --- a/plugins/heygen/skills/heygen-video/SKILL.md +++ b/plugins/heygen/skills/heygen-video/SKILL.md @@ -71,6 +71,10 @@ For *avatar creation* (writing AVATAR files, role symlink maintenance), see the **Pick one transport at session start. Never narrate the choice.** The only allowed cross-transport bridge is local file upload: if the app connector is otherwise selected but the user provides a local file, upload it first with `heygen asset create --file ` or `POST https://api.heygen.com/v3/assets`, then pass the resulting `asset_id` back into the app flow. +### Auth Triage (run immediately) + +Run before assuming app-only execution: `command -v heygen` and `heygen auth status`. If app auth fails but CLI auth is valid, continue in CLI mode for this run. + Detect in this order: 1. **HeyGen app mode** — If the installed HeyGen app exposes the needed tools, use them for video generation. 
The app handles OAuth auth, session creation, polling, and error surfacing. Frame Check still runs before submission. @@ -78,6 +82,10 @@ Detect in this order: 3. **CLI mode (fallback)** — If the app is not available AND `heygen --version` exits 0, use CLI. Auth via `heygen auth login` (persists to `~/.heygen/credentials`). 4. **Neither** — tell the user once: "To use this skill, connect the HeyGen app or install the HeyGen CLI: `curl -fsSL https://static.heygen.ai/cli/install.sh | bash` then `heygen auth login`." +### Sandbox/Network Note (Codex) + +In Codex desktop/sandboxed runs, CLI calls may fail with DNS/network errors even when auth is valid. Rerun the same command with network approval/escalation. + **Hard rules:** - **Never call general `curl api.heygen.com/...` video/avatar endpoints.** The only direct API exception is `POST https://api.heygen.com/v3/assets` for local file upload when no app upload tool exists. - **HeyGen app mode:** use the app when available. @@ -91,7 +99,9 @@ Use the installed HeyGen app for video generation, avatar discovery, voice listi `heygen video-agent {create,get,send,stop,styles,resources,videos}`, `heygen video {get,list,download,delete}`, `heygen avatar {list,get,consent,create,looks}` (with `heygen avatar looks {list,get,update}`), `heygen voice {list,create,speech}`, `heygen video-translate {create,get,languages}`, `heygen lipsync {create,get}`, `heygen asset create`, `heygen user`, `heygen auth {login,logout,status}`. Every subcommand supports `--help` — that's your reference. Run `heygen --help` to see the full noun list. -**Do not look up API endpoints.** There is no `api-reference.md` lookup step. App mode uses installed tools. CLI mode uses `heygen ... --help`. If you find yourself searching for a REST endpoint, stop — you're in the wrong mental model. +Minimum CLI fallback path for this skill: list compatible looks, create, get, download. Exact commands are in `references/troubleshooting.md`. 
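The detection order and auth triage described in this skill can be sketched in Python as follows. This is an illustration under stated assumptions, not the skill's implementation: `cli_runs_cleanly` and `pick_transport` are hypothetical names, and how the two auth flags are obtained (for example by interpreting `heygen auth status`) is left to the caller.

```python
import shutil
import subprocess


def cli_runs_cleanly() -> bool:
    """Return True when `heygen` is on PATH and `heygen --version` exits 0."""
    if shutil.which("heygen") is None:
        return False
    try:
        result = subprocess.run(
            ["heygen", "--version"], capture_output=True, timeout=15
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def pick_transport(
    app_available: bool, app_auth_ok: bool, cli_ok: bool, cli_auth_ok: bool
) -> str:
    """Choose one transport for the session: 'app', 'cli', or 'neither'."""
    # App mode is preferred whenever its tools are exposed and auth works.
    if app_available and app_auth_ok:
        return "app"
    # Fall back to the CLI when the app is missing or its auth is broken,
    # but only if the binary runs and CLI auth is valid.
    if cli_ok and cli_auth_ok:
        return "cli"
    # Otherwise tell the user once how to connect the app or install the CLI.
    return "neither"
```

Keeping the decision in a pure function separates it from the environment probes, so the same logic applies whether the flags come from real checks or from a dry run.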
+ +**Do not look up direct video/avatar API endpoints.** App mode uses installed tools. CLI mode uses `heygen ... --help`. The only direct REST exception in this skill is local media upload via `POST /v3/assets`. CLI output: JSON on stdout, `{error:{code,message,hint}}` envelope on stderr, exit codes `0` ok · `1` API · `2` usage · `3` auth · `4` timeout. See [references/troubleshooting.md](references/troubleshooting.md) for error → action mapping and polling cadence. Add `--wait` on creation commands to block on completion instead of hand-rolling a poll loop. @@ -590,6 +600,8 @@ The CLI returns JSON on stdout: `{"data": {"video_id": "...", "session_id": "... Total wall time per video: **20–45 minutes**. If you passed `--wait`, the CLI handles polling with exponential backoff. If polling manually: first check at **5 min**, then every **60s** up to 45 min. +`--wait` can be silent for several minutes; this is normal. + Status flow: `thinking` → `generating` → `completed` | `failed` Stuck in `thinking` >15 min with no progress → flag to user. @@ -598,11 +610,16 @@ Stuck in `thinking` >15 min with no progress → flag to user. 1. Get the `video_url` (S3 mp4) from the completed status response, or use `heygen video get | jq -r '.data.video_page_url'` for the shareable link. 2. Download the MP4 locally: `heygen video download ` (writes the file and emits `{"asset", "message", "path"}` on stdout — chain on `.path`). -3. Send inline via message tool: `message(action:send, media:"", caption:"Your video is ready! 🎬\n📊 Duration: [actual]s vs [target]s ([percentage]%)")`. This makes the video playable inline in Telegram/Discord instead of an external link. -4. Also share the HeyGen dashboard link for editing: `https://app.heygen.com/videos/` +3. Measure real duration before downstream wiring via `ffprobe` (see troubleshooting reference for exact command). +4. Send inline via message tool: `message(action:send, media:"", caption:"Your video is ready! 
🎬\n📊 Duration: [actual]s vs [target]s ([percentage]%)")`. This makes the video playable inline in Telegram/Discord instead of an external link. +5. Also share the HeyGen dashboard link for editing: `https://app.heygen.com/videos/` Always report duration accuracy. Clean up downloaded files after sending. +### HyperFrames Handoff + +Use muted `