10 changes: 6 additions & 4 deletions plugins/heygen/.codex-plugin/plugin.json
@@ -1,7 +1,7 @@
{
"name": "heygen",
"version": "2.2.0",
"description": "Create HeyGen avatar videos and personalized video messages. Build a persistent digital identity from a photo, then generate presenter-led videos with your digital twin.",
"version": "2.2.1",
"description": "Create HeyGen avatar videos and personalized video messages. Build a persistent digital identity from a description or hosted photo URL, then generate presenter-led videos with your digital twin.",
"author": {
"name": "HeyGen",
"email": "developers@heygen.com",
@@ -27,13 +27,15 @@
"interface": {
"displayName": "HeyGen",
"shortDescription": "Avatar videos and personalized video messages",
"longDescription": "HeyGen Skills give your agent a face, a voice, and the ability to send video like a message. Use heygen-avatar to build a persistent digital identity from a photo and pick a voice, then heygen-video to generate identity-first presenter videos via the HeyGen v3 Video Agent pipeline (avatar resolution, aspect ratio correction, prompt engineering, and voice selection are handled automatically).",
"longDescription": "HeyGen Skills give your agent a face, a voice, and the ability to send video like a message. Use heygen-avatar to build a persistent digital identity from a written description or hosted photo URL and pick a voice, then heygen-video to generate identity-first presenter videos via the HeyGen v3 Video Agent pipeline (avatar resolution, aspect ratio correction, prompt engineering, and voice selection are handled automatically).",
"developerName": "HeyGen",
"category": "Design",
"capabilities": ["Read", "Write"],
"websiteURL": "https://heygen.com",
"privacyPolicyURL": "https://www.heygen.com/privacy",
"termsOfServiceURL": "https://www.heygen.com/terms",
"defaultPrompt": [
"Create my HeyGen avatar from this photo",
"Create my HeyGen avatar from a written description",
"Make a 30-second intro video of myself",
"Send a video update to my team about this week's progress"
],
6 changes: 5 additions & 1 deletion plugins/heygen/README.md
@@ -6,16 +6,20 @@ OpenAI Codex plugin for [HeyGen](https://heygen.com) — create AI avatar videos

Two skills that chain together:

- **heygen-avatar** — turn a photo into a persistent digital twin. Handles avatar lookup, instant-avatar creation, voice selection (or voice cloning), and writes an `AVATAR` file the video skill reads back.
- **heygen-avatar** — create a persistent digital twin from a written description or a hosted photo URL. Handles avatar lookup, avatar creation, voice selection (or voice cloning), and writes an `AVATAR` file the video skill reads back.
- **heygen-video** — generate identity-first presenter videos via the HeyGen v3 Video Agent pipeline. Encodes the prompting, asset routing, aspect-ratio correction, and avatar/voice resolution that good HeyGen videos need.
- **HeyGen app reference** — `.app.json` points at the curated [HeyGen ChatGPT app](https://chatgpt.com/apps/heygen/asdk_app_69418aad55e08191aa5e437b649ca2e4).

## Requirements

Installing the plugin connects the HeyGen ChatGPT app automatically (OAuth on first use). That is enough for the skills to work end-to-end on the user's existing HeyGen plan credits.

If browser auth succeeds but chat still shows `Authenticate` and does not advance, this is usually a connector/session state issue. Start a new chat session and reconnect the app.

If you'd rather not use the app, the skills also support the HeyGen CLI: install it from <https://static.heygen.ai/cli/install.sh> and export `HEYGEN_API_KEY` (get one at <https://app.heygen.com/api>).

Local file upload note: the current HeyGen app connector accepts hosted HTTPS media URLs or existing HeyGen `asset_id` values for avatar/photo creation. It does not upload local `file://` paths directly. For local photos or videos, upload first with `heygen asset create --file <path>` or `POST https://api.heygen.com/v3/assets` using `multipart/form-data`, then pass the returned `asset_id` into the app or CLI creation flow.

## Source of truth

The skills are authored in [`heygen-com/skills`](https://github.com/heygen-com/skills) (under `heygen-avatar/` and `heygen-video/` at the repo root) and mirrored here. The main structural delta in this mirror is the wrapping `skills/` parent directory required by the Codex plugin convention. File issues about skill content on that repo.
34 changes: 26 additions & 8 deletions plugins/heygen/skills/heygen-avatar/SKILL.md
@@ -37,7 +37,7 @@ This skill reads and writes the following. No other files are accessed without e
| Write | `AVATAR-<NAME>.md` | Save new avatar identity after creation |
| Write | `AVATAR-AGENT.md`, `AVATAR-USER.md` (symlinks) | Role aliases, see Phase 5 |
| Temp write | `/tmp/heygen/uploads/` | Voice preview audio (downloaded for user playback, deleted after session) |
| Remote upload | HeyGen (via the app or `heygen asset create`) | User-provided photos uploaded to HeyGen for digital-twin creation |
| Remote upload | HeyGen (via CLI/API asset upload) | Local photos/videos uploaded to HeyGen before avatar creation |

Assets are only uploaded to HeyGen when the user explicitly provides them.

@@ -86,9 +86,15 @@ Try to read `SOUL.md` from the workspace root.

**HeyGen app (preferred):** If the HeyGen app is available through the installed app integration, use it. The app authenticates via OAuth and runs against the user's existing HeyGen plan credits.

**Auth triage (run immediately):** run `command -v heygen` and `heygen auth status`. If app/MCP auth fails but CLI auth is valid, continue in CLI mode for this run.
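
The triage above can be sketched as a small shell function. This is a minimal sketch, not part of the skill: `pick_heygen_mode` and its argument are hypothetical, and it assumes `heygen auth status` exits non-zero when CLI auth is missing or invalid.

```bash
# Hypothetical helper: decide whether this run should use the app or the CLI.
# $1 = "yes" when app/MCP auth already succeeded (assumed to be known by the caller).
pick_heygen_mode() {
  if [ "$1" = "yes" ]; then
    echo "app"
  elif command -v heygen >/dev/null 2>&1 && heygen auth status >/dev/null 2>&1; then
    # Assumption: a zero exit status from `heygen auth status` means valid CLI auth.
    echo "cli"
  else
    echo "none"
  fi
}
```

A caller would branch on the printed value, for example `mode=$(pick_heygen_mode no)`, and show the "Neither available" message only when the result is `none`.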

**Local media limitation:** The current HeyGen app connector does NOT upload local files. Its photo/video/avatar tools accept only hosted HTTPS URLs or existing HeyGen `asset_id` values. Never pass `file://`, absolute local paths, or Codex attachment paths directly to app tools. For local photos/videos, first upload with `heygen asset create --file <path>` or `POST https://api.heygen.com/v3/assets` using `multipart/form-data`; then pass `{ "type": "asset_id", "asset_id": "..." }` to the app or CLI creation call. If no CLI/API upload path is available, ask the user for an HTTPS image URL or permission to continue with prompt-only creation.

**CLI fallback:** If the app is not available AND the `heygen` binary runs cleanly (`heygen --version` exits 0), use it. Auth: `HEYGEN_API_KEY` env OR `heygen auth login` (persists to `~/.heygen/credentials`). If the CLI is missing, errors on `--version`, or auth is not set, silently skip it.

**Neither available:** Only if the app is unavailable AND the CLI doesn't work, tell the user once: "To use this skill, connect the HeyGen app or install the HeyGen CLI: `curl -fsSL https://static.heygen.ai/cli/install.sh | bash` then `heygen auth login`."
In Codex desktop/sandboxed runs, CLI network calls may fail with DNS/network errors until network approval/escalation is granted. Retry with escalation when needed.

**Neither available:** Only if the app is unavailable AND the CLI doesn't work, tell the user once: "To use this skill, connect the HeyGen app or install the HeyGen CLI: `curl -fsSL https://static.heygen.ai/cli/install.sh | bash` then `heygen auth login`." If the only missing capability is local media upload, say that local photos need an HTTPS URL or a CLI/API asset upload first.

**API:** v3 only. Never call v1 or v2 endpoints.

@@ -229,7 +235,9 @@ Only run this step when Phase 0 target = **user** (real-person digital twin) OR
- Otherwise, ask one sentence: *"Got a headshot? It gives better face consistency for videos of you. I can also generate from your description — just say 'skip.'"*

Branch:
- **Photo provided** → upload via the HeyGen app or `heygen asset create --file <path>`, then Type B (photo) creation in Phase 2.
- **Photo provided as local file/path** → upload via `heygen asset create --file <path>` or `POST https://api.heygen.com/v3/assets`, then Type B (photo) creation with the returned `asset_id`.
- **Photo provided as HTTPS URL or asset_id** → Type B (photo) creation in Phase 2.
- **Local photo but no upload path available** → ask for an HTTPS image URL or offer prompt-only creation. Do not pass the local path into the app connector.
- **Skip** → Type A (prompt) creation in Phase 2.
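
The branches above can be sketched as a routing function. This is an illustrative sketch only: `photo_branch` and the branch labels it prints are hypothetical names, and treating any non-URL, non-existent input as an `asset_id` is an assumption rather than documented behavior.

```bash
# Hypothetical helper: map the user's photo input to one of the branches above.
# $1 = photo input (empty or "skip" means no photo)
# $2 = "yes" if a CLI/API upload path is available
photo_branch() {
  input=$1
  can_upload=$2
  case "$input" in
    ""|skip)   echo "type-a-prompt"; return ;;
    https://*) echo "type-b-url"; return ;;
    http://*)  echo "ask-for-url-or-prompt-only"; return ;;  # connector needs HTTPS
    file://*|/*|./*|../*) ;;  # looks like a local path: handled below
    *)
      # Assumption: anything that is not an existing local file is an asset_id.
      if [ ! -e "$input" ]; then
        echo "type-b-asset-id"
        return
      fi
      ;;
  esac
  # Local file: never hand the raw path to the app connector.
  if [ "$can_upload" = "yes" ]; then
    echo "upload-then-type-b"
  else
    echo "ask-for-url-or-prompt-only"
  fi
}
```

Keeping the local-path check ahead of any app call is the point of the sketch: the raw path is only ever turned into an upload step or a question to the user.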

For agents and named characters, skip this entire step — go straight to Type A (prompt) creation.
@@ -258,15 +266,25 @@ Prompt limit is 1000 characters. Be descriptive — include style, features, exp

**Type B — From reference image:**

**App:** use the HeyGen app flow for photo avatar creation.
**CLI:** `heygen avatar create -d '{"type":"photo","name":"...","file":{"type":"url","url":"..."},"avatar_group_id":"..."}'`
**App:** use the HeyGen app flow for photo avatar creation only with an HTTPS URL or pre-uploaded `asset_id`.
**CLI:** `heygen avatar create -d '{"type":"photo","name":"...","file":{"type":"asset_id","asset_id":"..."},"avatar_group_id":"..."}'`

File options for Type B:
- `{ "type": "url", "url": "https://..." }` — public image URL
- `{ "type": "asset_id", "asset_id": "<id>" }` — from `heygen asset create --file <path>`
- `{ "type": "base64", "media_type": "image/png", "data": "<base64>" }` — inline

📖 **When to use each (URL vs asset_id vs base64), upload routing, and edge cases → [references/asset-routing.md](references/asset-routing.md)**
Do not pass local paths or `file://` URLs to the app connector. Upload local files to an `asset_id` first.

Raw API upload example:
```bash
ASSET_ID=$(curl -s -X POST "https://api.heygen.com/v3/assets" \
-H "X-Api-Key: $HEYGEN_API_KEY" \
-F "file=@/path/to/headshot.jpg" | jq -r '.data.asset_id')
```

The v3 upload endpoint accepts `multipart/form-data`, auto-detects MIME type from file bytes, and returns `data.asset_id`.
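
Once an `asset_id` is in hand, the Type B request body can be assembled without hand-editing JSON. The following is a minimal sketch under stated assumptions: `photo_avatar_payload` is a hypothetical helper, the avatar name is assumed to contain no newlines or control characters, and `avatar_group_id` is left out here.

```bash
# Hypothetical helper: print the JSON body for `heygen avatar create -d`
# (Type B, asset_id form).
# $1 = avatar name, $2 = asset_id returned by the upload step.
photo_avatar_payload() {
  # Escape backslashes and double quotes so the name stays a valid JSON string.
  name=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"type":"photo","name":"%s","file":{"type":"asset_id","asset_id":"%s"}}\n' \
    "$name" "$2"
}
```

Possible usage with the variable from the upload example: `heygen avatar create -d "$(photo_avatar_payload "My Avatar" "$ASSET_ID")"`.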

📖 **When to use each (URL vs asset_id), upload routing, and edge cases → [references/asset-routing.md](references/asset-routing.md)**

**Response:** Returns `avatar_item.id` (look ID) and `avatar_item.group_id` (character identity).

@@ -446,7 +464,7 @@ simply `cat AVATAR-AGENT.md` and get whatever the current agent's avatar is.
- Missing SOUL.md/IDENTITY.md → conversational onboarding, write AVATAR file from answers
- API fails → retry once, then ask user to check API key
- Voice match poor → show all available voices, let user browse
- Asset upload fails → skip reference image, try prompt-only creation
- Asset upload unavailable or fails → ask for an HTTPS URL or skip reference image and try prompt-only creation
- Existing avatar file with stale HeyGen IDs → offer to regenerate or keep

📖 **Known issues, retry patterns, broken voice previews, error → action mapping → [references/troubleshooting.md](references/troubleshooting.md)**
2 changes: 1 addition & 1 deletion plugins/heygen/skills/heygen-avatar/agents/openai.yaml
@@ -1,4 +1,4 @@
interface:
display_name: "HeyGen Avatar"
short_description: "Create reusable HeyGen avatar identities"
default_prompt: "Create a reusable HeyGen avatar for me from a photo or written description, then help me choose a matching voice."
default_prompt: "Create a reusable HeyGen avatar for me from a written description, then help me choose a matching voice."
24 changes: 18 additions & 6 deletions plugins/heygen/skills/heygen-avatar/references/asset-routing.md
@@ -7,7 +7,7 @@ When the user provides files, URLs, or references, route each asset to the right
| Path | What happens | When to use |
|------|-------------|-------------|
| **A: Contextualize → Prompt** | Read/analyze the asset, extract key info, bake into script. Video Agent never sees the original. | Reference material, auth-walled content, documents where the *information* matters more than the *visual*. |
| **B: Attach to API** | Upload the raw file via `files[]`. Video Agent analyzes, extracts graphics, uses as frames/B-roll. | Screenshots, branded assets, PDFs with important visual layouts, images the viewer should literally see. |
| **B: Attach to API** | Attach a file reference via `files[]` (`asset_id` or HTTPS URL). Video Agent analyzes, extracts graphics, uses as frames/B-roll. | Screenshots, branded assets, PDFs with important visual layouts, images the viewer should literally see. |
| **A+B: Both** | Contextualize for script quality AND attach for visual use. | Long docs where you need to summarize but Video Agent should also have the full source. |

## Classification Flow
@@ -50,20 +50,31 @@ When the user provides files, URLs, or references, route each asset to the right
- Weave naturally into the script. Don't dump. Integrate.

### Path B (Attach)
Upload to HeyGen:
Upload local files to HeyGen before passing them to avatar or video tools:

**App:** upload through the HeyGen app's asset flow when available.
**CLI:** `heygen asset create --file /path/to/file.png`
**Important:** the current HeyGen app connector does not upload local files. It accepts hosted HTTPS URLs or existing HeyGen `asset_id` values. Never pass `file://`, absolute local paths, or Codex attachment paths directly to app tools.

**CLI/API:** `heygen asset create --file /path/to/file.png` or `POST https://api.heygen.com/v3/assets`

Max 32MB per file. Returns JSON with the new `asset_id`.
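
A pre-flight size check can catch oversized files before the upload is attempted. This is a sketch, not documented CLI behavior: `check_upload_size` and `MAX_UPLOAD_BYTES` are hypothetical names, and reading "32MB" as 32 * 1024 * 1024 bytes is an assumption about how the limit is counted.

```bash
# Hypothetical pre-flight check for the documented 32MB per-file limit.
# Assumption: "32MB" means 32 * 1024 * 1024 bytes; the service may count differently.
MAX_UPLOAD_BYTES=$((32 * 1024 * 1024))

# Returns 0 if the file can be uploaded, 1 if it is too large, 2 if it is not a file.
check_upload_size() {
  [ -f "$1" ] || { echo "not a file: $1" >&2; return 2; }
  # Arithmetic expansion strips the padding some `wc` implementations add.
  size=$(($(wc -c < "$1")))
  if [ "$size" -gt "$MAX_UPLOAD_BYTES" ]; then
    echo "too large: $size bytes (limit $MAX_UPLOAD_BYTES)" >&2
    return 1
  fi
}
```

Possible usage: `check_upload_size /path/to/file.png && heygen asset create --file /path/to/file.png`.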

Or pass inline in `files[]`:
Raw API upload:
```bash
ASSET_ID=$(curl -s -X POST "https://api.heygen.com/v3/assets" \
-H "X-Api-Key: $HEYGEN_API_KEY" \
-F "file=@/path/to/file.png" | jq -r '.data.asset_id')
```

`POST /v3/assets` uses `multipart/form-data`, auto-detects MIME type from file bytes, and returns `data.asset_id`.

Then pass one of these media references:
```json
{"type": "url", "url": "https://example.com/image.png"}
{"type": "asset_id", "asset_id": "<from upload>"}
{"type": "base64", "data": "<base64>", "content_type": "image/png"}
```

If a local file is provided and no CLI/API upload path is available, ask the user for an HTTPS URL or continue without the reference image. Do not retry with the raw local path.

### Describe Asset Usage in Prompt
Be SPECIFIC:
- "Use the uploaded dashboard screenshot as B-roll when discussing analytics"
@@ -84,3 +95,4 @@ In the learning log entry, record:
- **URLs that fail:** Try the environment's standard web/content fetch capability. If login/paywall/404 → tell the user, ask for content directly. Never silently fabricate.
- **HTML URLs cannot go in `files[]`.** Video Agent rejects `text/html`. Web pages are ALWAYS Path A only.
- **Prefer download→upload→asset_id** over `files[]{url}`. HeyGen's servers are often blocked by CDN/WAF.
- **Local paths must become asset IDs first.** App tools reject local file references.
Original file line number Diff line number Diff line change
@@ -32,14 +32,16 @@ change). Only use Mode 1 (new character) for genuinely new identities.

### Photo avatar (from user's photo)

**App:** use the HeyGen app flow for photo avatar creation.
**App:** use the HeyGen app flow for photo avatar creation only when the photo is a hosted HTTPS URL or an existing HeyGen `asset_id`. The app connector does not upload local paths.

**Local file:** first run `heygen asset create --file <path>` or `POST https://api.heygen.com/v3/assets`, then use the returned `asset_id`.

**CLI:**
```bash
heygen avatar create -d '{
"type": "photo",
"name": "My Avatar",
"file": {"type": "url", "url": "https://example.com/headshot.jpg"},
"file": {"type": "asset_id", "asset_id": "<uploaded_asset_id>"},
"avatar_group_id": "<optional>"
}'
```
@@ -72,7 +74,7 @@ Optional: up to 3 `reference_images` to anchor the generated appearance.

### Video avatar / digital twin (from a short recording)

**App:** use the HeyGen app flow for digital-twin creation from video.
**App:** use the HeyGen app flow for digital-twin creation from video only when the video is a hosted HTTPS URL or an existing HeyGen `asset_id`. Upload local recordings to `asset_id` first.

**CLI:**
```bash
@@ -88,19 +90,18 @@ heygen avatar create -d '{

## File Input Formats

`file` accepts three forms:
`file` accepts these app-safe forms:

```jsonc
// Public URL (no auth, no paywall)
{ "type": "url", "url": "https://example.com/headshot.jpg" }

// Pre-uploaded asset (from `heygen asset create --file <path>`)
{ "type": "asset_id", "asset_id": "<id>" }

// Inline base64
{ "type": "base64", "data": "<base64>", "content_type": "image/png" }
```

Do not pass local paths or `file://` URLs to the app connector. The broader API/CLI may support additional encodings, but local files should be converted to `asset_id` first for this plugin flow.

For when each is appropriate, see
[`references/asset-routing.md`](asset-routing.md).

55 changes: 55 additions & 0 deletions plugins/heygen/skills/heygen-avatar/references/troubleshooting.md
@@ -69,6 +69,61 @@ Video Agent rejects `text/html` content type in the `files[]` array. Web pages (

---

## Local File Paths Rejected by App Connector

**Symptom:** Photo/avatar creation fails with an error saying the connector rejected a local photo path or only accepts HTTPS image URLs / existing HeyGen `asset_id` values.

**Root Cause:** The current HeyGen app connector does not expose asset upload. It cannot consume `file://`, absolute local paths, or Codex attachment paths directly.

**Fix:** Upload the local file with `heygen asset create --file <path>` or `POST https://api.heygen.com/v3/assets`, then call the app/CLI creation flow with `{ "type": "asset_id", "asset_id": "<uploaded_asset_id>" }`. If upload is unavailable, ask for an HTTPS URL or continue with prompt-only creation.

---

## App Auth Broken, CLI Auth Works

**Symptom:** App/MCP calls fail with token invalid/expired errors, while CLI commands work on the same machine.

**Fix:** Run:
```bash
command -v heygen
heygen auth status
```
If CLI auth is valid, continue in CLI mode for the current run.

---

## Authenticate Button Loop After Browser Success

**Symptom:** The user completes browser auth successfully and returns to Codex, but the chat still shows `Authenticate` and repeated clicks do not resolve it.

**Root Cause:** Connector/session state in the current chat did not refresh after OAuth callback.

**Fix:** Start a new chat session and reconnect the HeyGen app. Then rerun:
```bash
command -v heygen
heygen auth status
```

---

## Sandbox DNS/Network Failures in Codex

**Symptom:** CLI commands fail with DNS/network errors despite valid auth.

**Root Cause:** Network-restricted sandbox execution.

**Fix:** Rerun the same command with network approval/escalation.

---

## CLI Telemetry Noise in Sandboxed Runs

**Symptom:** Analytics/telemetry DNS warnings (for example PostHog) clutter command output.

**Fix:** If supported by the installed CLI version, disable analytics for agent runs to reduce noise. If not supported, ignore telemetry warnings unless command exit status indicates failure.

---

## Avatar Not Ready for Video Generation

**Symptom:** Video generation fails or produces errors immediately after creating a new avatar. The avatar exists in the HeyGen dashboard but videos referencing it fail.