ContextBuilder is a local graph viewer and context compiler for conversation JSON files. It reads a directory of dialog JSON files, builds a lineage graph of roots, branches, and merges, lets you select a thought direction, and compiles that selection into a Hyperprompt-backed Markdown context artifact ready for an external LLM or agent.
Engineering docs: docs/ARCHITECTURE.md — system design, module breakdown, data flow · docs/PROBLEMS.md — known issues and shortcomings
| Concern | Owner |
|---|---|
| Extracting conversations from browser HTML | ChatGPTDialogs |
| Graph structure, validation, selection, export, compile orchestration | ContextBuilder (this repo) |
| Running LLMs, executing prompts, agent workflows | External — ContextBuilder hands off a compiled Markdown artifact |
| Browser capture, cloud sync, semantic retrieval | Out of scope for v1 |
ContextBuilder does not execute models. Its output is a deterministic Markdown file you hand to an external agent.
Canonicalize imported conversations and start both the API and UI in one shot:
make quickstart DIALOG_DIR=/absolute/path/to/import_json OUTPUT_DIR=/absolute/path/to/canonical_json
Or start just the server pointing at an existing canonical directory:
make serve DIALOG_DIR=/absolute/path/to/canonical_json
Then open:
http://localhost:8000/
Or run the server directly (with a custom Hyperprompt binary path):
python3 viewer/server.py --port 8000 --dialog-dir /path/to/dialogs \
  --hyperprompt-binary /path/to/hyperprompt
make dev DIALOG_DIR=/path/to/dialogs
# API on :8001, React UI on :5173 → http://localhost:5173/
JSON conversations on disk
│
▼
ContextBuilder API GET /api/graph
(graph index + GET /api/conversation
validation) GET /api/checkpoint
│
▼
Select branch target POST /api/export
→ Export lineage nodes → export/{target}/nodes/*.md
→ Generate root.hc → export/{target}/root.hc
│
▼
Compile with Hyperprompt POST /api/compile
│
▼
compiled.md ← hand to external agent
- Graph index: ContextBuilder reads all JSON files in the dialog directory, validates and normalizes them, and builds an in-memory lineage graph.
- Export: You select a conversation or checkpoint as the compile target. ContextBuilder writes each ancestor conversation's messages as deterministic Markdown files under export/{target}/nodes/ and generates export/{target}/root.hc with exactly one depth-0 root node. Provenance, conversation labels, and node includes are emitted as children under that root in lineage order.
- Compile: Hyperprompt reads root.hc, resolves the included node files, and writes export/{target}/compiled.md. That file is the final context artifact.
{dialog-dir}/export/{target}/
├── nodes/
│ ├── {conv-id-1}/
│ │ ├── 0000_{msg-id}.md
│ │ ├── 0001_{msg-id}.md
│ │ └── ...
│ └── {conv-id-2}/
│ └── ...
├── provenance.md ← compile-target provenance included in compiled output
├── provenance.json ← machine-readable provenance for traceability/audit
├── root.hc ← Hyperprompt root file
├── compiled.md ← final compiled artifact
└── manifest.json ← compiler manifest (written by Hyperprompt)
Each conversation gets its own subdirectory under nodes/. Each message checkpoint is a separate Markdown file named {index}_{message_id}.md, with a YAML front-matter comment preserving conversation_id, message_id, role, and optional provenance fields.
The export directory is deterministically regenerated on each export. Prior contents are removed.
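As a minimal sketch of the naming scheme above, the helper below builds the relative path of one exported message file. The function name node_relpath is hypothetical, and the four-digit zero padding is inferred from the 0000_ and 0001_ entries in the layout rather than confirmed from the exporter source.

```python
from pathlib import PurePosixPath


def node_relpath(conversation_id: str, index: int, message_id: str) -> PurePosixPath:
    """Return nodes/{conversation_id}/{index}_{message_id}.md.

    Assumption: the index is zero-padded to four digits, as the layout suggests.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    return PurePosixPath("nodes") / conversation_id / f"{index:04d}_{message_id}.md"
```

Zero padding keeps a plain lexicographic directory listing in message order, which is one plausible reason for the fixed-width prefix.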
ContextBuilder calls the Hyperprompt compiler to produce the final Markdown artifact.
cd /path/to/Hyperprompt
swift build -c release
# binary at one of:
# .build/release/hyperprompt
# .build/arm64-apple-macosx/release/hyperprompt
python3 viewer/server.py \
  --dialog-dir /path/to/dialogs \
  --hyperprompt-binary /path/to/Hyperprompt/.build/release/hyperprompt
If --hyperprompt-binary is not provided (or uses the default value), ContextBuilder resolves Hyperprompt in this order:
- .build/release/hyperprompt
- .build/arm64-apple-macosx/release/hyperprompt
- .build/x86_64-apple-macosx/release/hyperprompt
- .build/*/release/hyperprompt (other architecture-specific layouts)
- deps/hyperprompt inside this repository
If no candidate exists, POST /api/compile returns 422 with compile.checked_paths so you can see exactly which paths were attempted. The export directory remains intact so you can inspect the generated .hc file.
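The resolution order above can be sketched as a first-match search that also records every path it tried. This is an illustration under stated assumptions, not the code in viewer/server.py: the function name resolve_hyperprompt is hypothetical, and the README does not say which directory the .build paths are relative to, so the sketch takes that directory as the search_root parameter.

```python
import glob
import os
from typing import List, Optional, Tuple


def resolve_hyperprompt(search_root: str, repo_root: str) -> Tuple[Optional[str], List[str]]:
    """Return (first existing candidate or None, paths checked up to that point)."""
    fixed = [
        os.path.join(search_root, ".build", "release", "hyperprompt"),
        os.path.join(search_root, ".build", "arm64-apple-macosx", "release", "hyperprompt"),
        os.path.join(search_root, ".build", "x86_64-apple-macosx", "release", "hyperprompt"),
    ]
    # Other architecture-specific layouts; sorted so the order is stable.
    pattern = os.path.join(search_root, ".build", "*", "release", "hyperprompt")
    extra = [p for p in sorted(glob.glob(pattern)) if p not in fixed]
    candidates = fixed + extra + [os.path.join(repo_root, "deps", "hyperprompt")]

    checked: List[str] = []
    for path in candidates:
        checked.append(path)
        if os.path.exists(path):
            return path, checked
    return None, checked
```

When nothing is found, the second return value plays the role that compile.checked_paths plays in the 422 response: a list of every location attempted.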
Use this runbook when you need a reproducible path from local JSON conversations to a compiled artifact ready for an external agent.
- Start from canonical JSON conversations (or run make canon first).
- Start the server:
  make serve DIALOG_DIR=/absolute/path/to/canonical_json
- If you need a custom Hyperprompt binary path, run the server directly:
  python3 viewer/server.py \
    --port 8000 \
    --dialog-dir /absolute/path/to/canonical_json \
    --hyperprompt-binary /absolute/path/to/hyperprompt
- Confirm API health by opening http://localhost:8000/viewer/index.html and checking that the graph loads.
- In the graph, select a conversation or checkpoint.
- Trigger compile from the checkpoint inspector, or call the API:
  curl -sS -X POST http://localhost:8000/api/compile \
    -H "Content-Type: application/json" \
    -d '{"conversation_id":"<target-conversation-id>","message_id":"<optional-checkpoint-id>"}'
- Record export_dir, hc_file, and compile.compiled_md from the response.
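For scripted runs, the curl call in the runbook can be reproduced with the Python standard library. The sketch below is a convenience wrapper, not part of ContextBuilder; the function names are hypothetical, and the request body mirrors the documented { conversation_id, message_id? } shape.

```python
import json
import urllib.request
from typing import Optional


def build_compile_request(base_url: str, conversation_id: str,
                          message_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /api/compile request."""
    body = {"conversation_id": conversation_id}
    if message_id is not None:
        body["message_id"] = message_id  # optional checkpoint anchor
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/compile",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compile_target(base_url: str, conversation_id: str,
                   message_id: Optional[str] = None) -> dict:
    """Send the request and decode the JSON response.

    urllib raises urllib.error.HTTPError for 4xx responses such as 404, 409, and 422.
    """
    request = build_compile_request(base_url, conversation_id, message_id)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling compile_target("http://localhost:8000", "<target-conversation-id>") against a running server should return the fields to record in the last runbook step.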
For the returned {export_dir}, verify all expected outputs:
- nodes/ exists and contains one subdirectory per exported lineage conversation, each holding deterministic Markdown files.
- root.hc exists, has exactly one depth-0 root node, and references node files in lineage order.
- compiled.md exists and is non-empty.
- manifest.json exists (when Hyperprompt compile succeeds).
Minimal shell verification:
ls -la "{export_dir}/nodes"
test -s "{export_dir}/root.hc"
test -s "{export_dir}/compiled.md"
test -s "{export_dir}/manifest.json"
- 404 from POST /api/compile: target conversation_id or message_id does not exist.
- 409 from POST /api/compile: target is blocked by integrity errors; inspect graph diagnostics and fix lineage/schema issues first.
- 422 with compile.error: Hyperprompt binary missing or compile-time failure; inspect compile.checked_paths, verify the --hyperprompt-binary path (if set), and rerun.
When compile fails, inspect root.hc and exported node Markdown files in {export_dir} before retrying.
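The output checks listed earlier in this runbook can also be run from Python, which is convenient inside a test or CI step. This is a rough sketch: verify_export is a hypothetical helper, and it only checks that files exist and are non-empty. It does not parse root.hc, so the single depth-0 root node still needs a manual look.

```python
from pathlib import Path
from typing import Dict


def verify_export(export_dir: str) -> Dict[str, bool]:
    """Report which expected export outputs are present and non-empty."""
    root = Path(export_dir)
    nodes = root / "nodes"

    def non_empty(name: str) -> bool:
        path = root / name
        return path.is_file() and path.stat().st_size > 0

    return {
        # At least one Markdown file somewhere under nodes/.
        "nodes": nodes.is_dir() and any(nodes.rglob("*.md")),
        "root.hc": non_empty("root.hc"),
        "compiled.md": non_empty("compiled.md"),
        "manifest.json": non_empty("manifest.json"),
    }
```

A result with root.hc true but compiled.md false points at the compile step rather than the export step.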
Before handing off to an external LLM/agent, package:
- compiled.md (primary context artifact).
- manifest.json (provenance and compile metadata).
- Target metadata used for compilation (conversation_id, optional message_id, and timestamp).
- Optional: root.hc and nodes/ for auditability/debugging.
Recommended handoff note template:
Context artifact: /absolute/path/to/export/<target>/compiled.md
Manifest: /absolute/path/to/export/<target>/manifest.json
Compile target: conversation_id=<id>, message_id=<id-or-none>
Compiled at: YYYY-MM-DDTHH:MM:SSZ
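If you produce handoff notes often, the template can be filled in programmatically. The helper below is a small sketch with a hypothetical name (handoff_note); it assumes compiled.md and manifest.json sit directly under the export directory, as in the layout shown earlier.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def handoff_note(export_dir: str, conversation_id: str,
                 message_id: Optional[str] = None,
                 compiled_at: Optional[datetime] = None) -> str:
    """Render the four-line handoff note with a UTC timestamp."""
    # Naive datetimes are interpreted as local time by astimezone().
    when = (compiled_at or datetime.now(timezone.utc)).astimezone(timezone.utc)
    base = Path(export_dir)
    return "\n".join([
        f"Context artifact: {base / 'compiled.md'}",
        f"Manifest: {base / 'manifest.json'}",
        f"Compile target: conversation_id={conversation_id}, message_id={message_id or 'none'}",
        f"Compiled at: {when.strftime('%Y-%m-%dT%H:%M:%SZ')}",
    ])
```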
The viewer opens in a graph-first mode:
- The main canvas is driven by GET /api/graph and renders workspace lineage.
- Root, branch, and merge conversations have distinct node states.
- Broken lineage is visible as broken edges and warning states, not silently hidden.
- Drag the canvas background to pan; click a node to open its transcript and inspector.
- The inspector shows lineage edges, integrity details, and a checkpoint inventory.
- Click Inspect checkpoint on a message to focus that checkpoint and review its workflow metadata.
- Branching from the active checkpoint is available from the checkpoint inspector.
- Compile actions are accessible from the checkpoint inspector once a target is selected.
- The sidebar file list remains available for file-level actions (open, delete, save).
Files produced by ChatGPTDialogs (or any compatible exporter):
{
"title": "Conversation Title",
"source_file": "/absolute/path/to/file.html",
"message_count": 42,
"messages": [
{ "role": "user", "content": "..." },
{ "role": "assistant", "content": "..." }
]
}
Imported files may omit conversation_id and lineage metadata. ContextBuilder normalizes them into canonical roots on first save.
Canonical conversations created or normalized by ContextBuilder:
{
"conversation_id": "conv-trust-social-root",
"title": "Trust Social Root Conversation",
"source_file": "/absolute/path/to/file.html",
"messages": [
{
"message_id": "msg-root-1",
"role": "user",
"content": "Outline the concept.",
"turn_id": "turn-root-1",
"source": "conversation-turn-1"
}
],
"lineage": {
"parents": []
}
}
- message_id is the canonical anchor for graph lineage and export artifacts.
- turn_id and source are preserved imported provenance when available.
{
"conversation_id": "conv-branding-branch",
"title": "Branding Branch",
"messages": [
{ "message_id": "msg-branch-1", "role": "user", "content": "Continue from the protocol naming checkpoint." }
],
"lineage": {
"parents": [
{
"conversation_id": "conv-trust-social-root",
"message_id": "msg-root-2",
"link_type": "branch"
}
]
}
}
{
"conversation_id": "conv-contextbuilder-merge",
"title": "Context Compiler Merge",
"messages": [
{ "message_id": "msg-merge-1", "role": "user", "content": "Combine graph selection with Hyperprompt compilation." }
],
"lineage": {
"parents": [
{ "conversation_id": "conv-trust-social-root", "message_id": "msg-root-2", "link_type": "merge" },
{ "conversation_id": "conv-branding-branch", "message_id": "msg-branch-2", "link_type": "merge" }
]
}
}
- Imported roots are valid inputs even when conversation_id is absent.
- Canonical files created by ContextBuilder must include conversation_id.
- lineage.parents is mandatory for canonical files; it is empty only for canonical roots.
- Parent references always require conversation_id, message_id, and link_type.
- link_type is branch for single-parent conversations and merge for multi-parent conversations.
- ContextBuilder preserves imported provenance (source_file, turn_id, source) in exports.
- Already-canonical payloads are preserved as-is.
- Imported roots with stable message identity get a deterministic conversation_id and lineage: { parents: [] } added on save.
- conversation_id is derived from source_file, title, and ordered message_id values.
- Payloads missing stable message identity or with inconsistent message_count are rejected.
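To illustrate what a deterministic derivation from source_file, title, and ordered message_id values can look like, here is one possible scheme. Everything beyond those three inputs is an assumption made for this sketch: the SHA-256 hash, the conv- prefix, the 16-character truncation, and the function name derive_conversation_id are not taken from viewer/schema.py.

```python
import hashlib
import json
from typing import Any, Dict


def derive_conversation_id(payload: Dict[str, Any]) -> str:
    """Hash source_file, title, and ordered message_id values into a stable ID."""
    # A KeyError here corresponds to a payload without stable message identity.
    message_ids = [message["message_id"] for message in payload["messages"]]
    material = json.dumps(
        {
            "source_file": payload.get("source_file", ""),
            "title": payload.get("title", ""),
            "message_ids": message_ids,  # list order is preserved by json.dumps
        },
        sort_keys=True,
        ensure_ascii=False,
    )
    digest = hashlib.sha256(material.encode("utf-8")).hexdigest()
    return f"conv-{digest[:16]}"  # prefix and length are illustrative only
```

Because the inputs are serialized in a fixed key order, saving the same imported file twice yields the same ID, while reordering messages yields a different one.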
ContextBuilder validates individual payloads and whole-workspace lineage before persisting or graph-indexing any conversation:
- Canonical conversations require a non-empty conversation_id, valid messages with stable message_id values, and a lineage.parents list.
- Duplicate message_id values within one conversation are rejected.
- Parent references must include conversation_id, message_id, and link_type.
- Single-parent conversations use branch links; multi-parent conversations use merge links.
- Duplicate parent references are rejected.
- Workspace validation surfaces: duplicate conversation_id across files, missing parent conversations, missing parent message_id values, and invalid JSON.
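The per-payload rules above can be sketched as a single function that returns a list of problems. This is not the validator in viewer/schema.py: the name validate_canonical and the error strings are hypothetical, and treating a duplicate parent as a repeated (conversation_id, message_id) pair is an assumption. Workspace-level checks, which need all files at once, are left out.

```python
from typing import Any, Dict, List


def validate_canonical(payload: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the payload passes."""
    errors: List[str] = []
    if not payload.get("conversation_id"):
        errors.append("missing conversation_id")

    seen_messages = set()
    for message in payload.get("messages", []):
        message_id = message.get("message_id")
        if not message_id:
            errors.append("message without message_id")
        elif message_id in seen_messages:
            errors.append(f"duplicate message_id: {message_id}")
        else:
            seen_messages.add(message_id)

    lineage = payload.get("lineage")
    parents = lineage.get("parents") if isinstance(lineage, dict) else None
    if not isinstance(parents, list):
        errors.append("lineage.parents must be a list")
        return errors

    # One parent means a branch; several parents mean a merge.
    expected = "branch" if len(parents) == 1 else "merge"
    seen_parents = set()
    for parent in parents:
        missing = [k for k in ("conversation_id", "message_id", "link_type") if not parent.get(k)]
        if missing:
            errors.append(f"parent reference missing: {', '.join(missing)}")
            continue
        key = (parent["conversation_id"], parent["message_id"])
        if key in seen_parents:
            errors.append(f"duplicate parent reference: {key[0]}/{key[1]}")
        seen_parents.add(key)
        if parent["link_type"] != expected:
            errors.append(f"link_type should be {expected}")
    return errors
```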
| Endpoint | Description |
|---|---|
| GET /api/files | Workspace listing with per-file validation metadata and graph snapshot |
| GET /api/graph | Full graph snapshot, summary counts, and integrity split (blocking / non-blocking) |
| GET /api/file?name=... | One file payload with its validation result |
| GET /api/conversation?conversation_id=... | One graph node with edges, integrity, and compile target metadata |
| GET /api/checkpoint?conversation_id=...&message_id=... | One checkpoint anchor with child edges and compile target metadata |
| GET /api/capabilities | Feature flags: spec_graph (true when --spec-dir is set) |
| GET /api/spec-graph | SpecGraph snapshot: nodes, edges, roots, gap metrics (requires --spec-dir) |
| GET /api/spec-node?id=... | Full YAML payload for one spec node (requires --spec-dir) |
| GET /api/spec-watch | SSE stream that fires a change event when YAML spec files change on disk |
| Endpoint | Body | Description |
|---|---|---|
| POST /api/file | { name, data, overwrite? } | Validate and save a conversation file |
| DELETE /api/file?name=... | (none) | Delete a conversation file |
| Endpoint | Body | Description |
|---|---|---|
| POST /api/export | { conversation_id, message_id? } | Export lineage nodes to disk and generate root.hc |
| POST /api/compile | { conversation_id, message_id? } | Export lineage nodes and invoke the Hyperprompt compiler |
Export response fields:
- export_dir: absolute path to the export directory
- hc_file: absolute path to root.hc
- provenance_md: absolute path to provenance.md
- provenance_json: absolute path to provenance.json
- node_count: total number of exported node Markdown files
- conversations: exported conversations with node directories and file lists
- compile_target: the compile target metadata used
Compile response fields (in addition to export fields):
- compile.compiled_md: absolute path to compiled.md
- compile.manifest_json: absolute path to manifest.json
- compile.provenance_md: absolute path to provenance.md
- compile.provenance_json: absolute path to provenance.json
- compile.error: present only on failure, with a human-readable description
- Unknown conversation_id → 404
- Conversation blocked by integrity errors → 409 with diagnostics
- Unknown checkpoint anchor → 404
- Hyperprompt binary missing or compile failure → 422 with compile.error
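A client can turn these status codes into next steps with a small dispatcher. The sketch below uses a hypothetical function name (describe_compile_failure) and assumes that the JSON body carries compile.error as a string and compile.checked_paths as a list of strings; the exact payload shape is not spelled out in this README.

```python
from typing import Any, Dict


def describe_compile_failure(status: int, body: Dict[str, Any]) -> str:
    """Map a failed POST /api/compile response to a short next-step message."""
    compile_info = body.get("compile") or {}
    if status == 404:
        return "Unknown conversation_id or checkpoint: check the target identifiers."
    if status == 409:
        return "Target blocked by integrity errors: inspect diagnostics and fix lineage/schema issues."
    if status == 422:
        detail = compile_info.get("error", "compile failed")
        paths = compile_info.get("checked_paths") or []
        suffix = f" Checked paths: {', '.join(paths)}" if paths else ""
        return f"Hyperprompt failure: {detail}.{suffix}"
    return f"Unexpected status {status}"
```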
GET /api/graph returns:
- graph.nodes: graph-safe conversations keyed by conversation_id
- graph.edges: resolved or broken parent links between checkpoints and child conversations
- graph.roots: canonical root conversation_id values with no parents
- graph.blocked_files: files excluded from the graph due to validation errors
- graph.diagnostics: aggregate counts for nodes, edges, roots, blocked files, and issues
- summary: counts for nodes, edges, roots, blocked files, total diagnostics, blocking issues, non-blocking issues
- integrity: diagnostics split into blocking and non_blocking
Each graph node includes:
- conversation_id, file_name, kind (root / branch / merge), title, source_file
- checkpoints: ordered messages, each with message_id, metadata, and outbound child edge IDs
- parent_edge_ids and child_edge_ids
- node-level diagnostics for broken parent references
Each graph edge includes:
- parent conversation and message_id
- child conversation and file
- link_type (branch or merge)
- status (resolved or broken)
- edge-specific diagnostics
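As a usage sketch for the snapshot described above, the function below pulls out edges whose status is broken. The name broken_edges is hypothetical. The README does not say whether graph.edges is a list or a mapping keyed by edge ID, so the sketch accepts both; only the status field is relied on.

```python
from typing import Any, Dict, Iterable, List


def broken_edges(snapshot: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return every edge in a GET /api/graph response whose status is 'broken'."""
    edges = snapshot.get("graph", {}).get("edges", [])
    # Accept either a list of edges or a mapping of edge ID to edge.
    items: Iterable[Dict[str, Any]] = edges.values() if isinstance(edges, dict) else edges
    return [edge for edge in items if edge.get("status") == "broken"]
```

An empty result is a quick signal that no lineage is broken before you pick a compile target.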
Returned by GET /api/conversation and GET /api/checkpoint:
| Field | Description |
|---|---|
| scope | conversation or checkpoint |
| target_conversation_id | Selected conversation |
| target_message_id | Selected message anchor (checkpoint scope only) |
| lineage_conversation_ids | Deterministic ancestor ordering ending at target |
| lineage_edge_ids | Parent edges included in the ancestry |
| lineage_paths | Root-to-target paths, preserving merge provenance |
| root_conversation_ids | True graph roots reachable from the target; empty when is_lineage_complete is false |
| merge_parent_conversation_ids | Merge parents visible in ancestry |
| unresolved_parent_edge_ids | Broken parent edges making lineage incomplete |
| is_lineage_complete | false when any ancestor edge is unresolved |
ContextBuilder/
├── viewer/
│ ├── server.py # HTTP server: routing, graph indexing, export, compile
│ ├── schema.py # Schema validation, normalization, and classification
│ ├── canonicalize.py # Batch import normalization CLI
│ ├── specgraph.py # SpecGraph YAML ingestion and graph construction
│ └── app/ # React + TypeScript UI (Vite)
│ ├── src/ # React components, graph canvas, data hooks
│ └── dist/ # Built UI assets (served by server.py)
├── tests/ # Python unit and integration tests
├── real_examples/ # Example dialog JSON fixtures
├── docs/ # Engineering documentation
│ ├── ARCHITECTURE.md # System architecture (top-level → detail)
│ ├── PROBLEMS.md # Known issues and shortcomings
│ ├── CANONICALIZATION.md
│ └── QUICKSTART.md
├── Makefile # Developer entrypoints
└── README.md
| Target | Purpose |
|---|---|
| make quickstart | Canonicalize + start API + start UI in one shot |
| make serve DIALOG_DIR=... | Start the combined API + static file server (port 8000) |
| make dev DIALOG_DIR=... | Start API on :8001 and React dev server on :5173 |
| make api | Start API only (uses $CANONICAL_DIR default) |
| make ui | Start React dev server only |
| make specspace-restart | Restart SpecSpace API + SpecSpace dev UI in detached screen sessions |
| make specspace-status | Show SpecSpace dev screen sessions and port listeners |
| make canonicalize DIALOG_DIR=... OUTPUT_DIR=... | Batch-normalize imported JSON to canonical form |
| make canon | Canonicalize with default input/output paths |
| make test | Run all Python tests |
| make lint | Syntax-check Python source files |
| make stop | Kill servers running on ports 8001 and 5173 |
- Python 3.11+
- Node.js 18+ (for the React UI)
- Hyperprompt compiler binary (for compile tests and runtime compile)
make test
make lint
See docs/ARCHITECTURE.md for a full top-down description of the system: data model, backend modules, frontend components, and the export/compile pipeline.
Quick orientation:
- viewer/schema.py: canonical schema rules and validation helpers used by the server and tests.
- viewer/canonicalize.py: batch normalization for imported root JSON files.
- viewer/specgraph.py: YAML spec node ingestion and SpecGraph construction.
- viewer/server.py: all HTTP handlers, graph indexing, export, and compile orchestration.
- viewer/app/src/: React graph canvas, inspector panels, compile affordances.
- tests/: unit tests for schema, graph indexing, API handlers, export pipeline, and compile integration.
- Add schema rules to viewer/schema.py.
- Update build_graph_snapshot in viewer/server.py to handle the new kind.
- Add fixture files under real_examples/ or tests/fixtures/.
- Add regression tests in tests/.