[Hackathon] feat: AI workflow documentation tool with history, editing, and diff by zyratlo · Pull Request #5088 · apache/texera

zyratlo · 2026-05-16T01:02:32Z

Demo Video

https://drive.google.com/file/d/1PcP_N_hucajoNYRuUK9bxhqcg7jfvic9/view?usp=sharing

What changes were proposed in this PR?

Adds a new AI Workflow Documentation tool to the workspace, accessible from the existing menu bar via the document-text icon. The tool lives in a right-side nz-drawer that doesn't block the canvas, and supports three main flows: generating documentation with an LLM, writing it manually, and managing a per-workflow history of reports.

Generating a report

Click Generate Documentation on the intro page. The backend agent-service calls the LLM with a structured context of the current workflow (operator IDs, display names, types, descriptions, properties, UDF source code, compiled output schemas, link graph, and author comment-box notes) and a prompt instructing the model to produce sections for Purpose / Inputs / Pipeline Stages / Outputs / Caveats.
The prompt requires the LLM to format every operator mention as a markdown link of the form [Display Name](texera:op:OPERATOR_ID). Clicking one of these links in the rendered doc pans the canvas to center the operator and highlights it.
A centered spinner and stopwatch give feedback during the wait. A Cancel button unsubscribes the request and cleans up the temp agent via the service's finalize handler.

Writing your own

Write Your Own opens the doc view in edit mode with a blank text area. Saving creates a new history entry tagged written: true.
An Edit button on any existing entry swaps the rendered markdown for a text area pre-filled with the current content. Saving updates the entry in place, marks it edited: true, refreshes its timestamp, and moves it to the top of the history. Source labels in the timestamp adjust accordingly ("Generated" → blue, "Updated" → blue + orange edited pill, "Written" → purple).
Both the editor and any in-progress draft are persisted, so closing the drawer mid-edit and reopening it lands the user back in the editor with their unsaved text intact. Navigating away from the entry (Back, viewing another report) pops a "Discard unsaved changes?" confirm.

Per-workflow history

Every generated/written/edited entry is recorded in a list on the intro page. Rows show the title (or timestamp fallback), a "Latest" badge on the newest, plus icons that mirror the report's origin (blue thunderbolt for AI, purple edit pencil for hand-written).
Each row supports Rename (inline editable title field, with the rename also exposed inside the doc view's header), Duplicate (carries title with a (copy) suffix), and Delete (with popconfirm). The list scrolls internally so long histories don't blow out the drawer.
A Compare mode lets the user pick any two entries and opens a side-by-side diff. The diff is line-level via diff.diffLines, but each (removed, added) pair runs through diffWordsWithSpace so only the actually-changed words light up — entirely rewritten lines are still uniformly colored. The compare modal shows entry titles (or "Base"/"Head") plus −N / +M totals.
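
A minimal sketch of the pairing step described above, for readers who want to see how adjacent removed/added chunks could be matched before the word-level pass. The `Change` shape mirrors the objects returned by the diff package's line diff (`value`, `added`, `removed`); `DiffRow`, `pairChanges`, `countLines`, and `lineTotals` are hypothetical names, and the sketch assumes a rewritten block arrives as a removed chunk immediately followed by an added chunk.

```typescript
// Assumed shape of one chunk from a line-level diff (value plus added/removed flags).
interface Change {
  value: string;
  added?: boolean;
  removed?: boolean;
}

// Hypothetical row model for a side-by-side view.
type DiffRow =
  | { kind: "unchanged"; text: string }
  | { kind: "removed"; text: string }
  | { kind: "added"; text: string }
  | { kind: "modified"; before: string; after: string };

// Pair each removed chunk with the added chunk that directly follows it, so a
// word-level diff can later run on (before, after) instead of coloring whole lines.
function pairChanges(changes: readonly Change[]): DiffRow[] {
  const rows: DiffRow[] = [];
  for (let i = 0; i < changes.length; i++) {
    const current = changes[i];
    if (current === undefined) continue;
    const next: Change | undefined = changes[i + 1];
    if (current.removed && next !== undefined && next.added) {
      rows.push({ kind: "modified", before: current.value, after: next.value });
      i++; // the added chunk has been consumed as the "after" side
    } else if (current.removed) {
      rows.push({ kind: "removed", text: current.value });
    } else if (current.added) {
      rows.push({ kind: "added", text: current.value });
    } else {
      rows.push({ kind: "unchanged", text: current.value });
    }
  }
  return rows;
}

// Count lines in a chunk, ignoring the empty piece after a trailing newline.
function countLines(text: string): number {
  if (text.length === 0) return 0;
  const pieces = text.split("\n").length;
  return text.charAt(text.length - 1) === "\n" ? pieces - 1 : pieces;
}

// Totals for a "-N / +M" style header.
function lineTotals(changes: readonly Change[]): { removed: number; added: number } {
  let removed = 0;
  let added = 0;
  for (const change of changes) {
    if (change.removed) removed += countLines(change.value);
    else if (change.added) added += countLines(change.value);
  }
  return { removed, added };
}
```

Each `modified` row would then be handed to the word-level diff, while `removed` and `added` rows keep a uniform color.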

Persistence

History, last-viewed-page-per-workflow, and the in-progress editing state are all serialized to localStorage under a single versioned key (texera.workflowDoc.v1). The constructor revives Date fields on load; every mutation writes back. DocEntry gets a stable id (crypto.randomUUID() with a timestamp fallback) so the editing-state restore can match entries across JSON round-trips. Quota / availability failures degrade silently to the previous in-memory behavior.
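
As a rough illustration of the round-trip described above, the sketch below saves and loads history under the versioned key, revives `generatedAt` into a `Date`, and swallows storage failures. Only the key name and the `DocEntry` field names come from this PR; `KeyValueStore`, `saveHistory`, and `loadHistory` are hypothetical, and the store is passed in (rather than touching `localStorage` directly) purely so the sketch can run outside a browser.

```typescript
const STORAGE_KEY = "texera.workflowDoc.v1";

// Minimal subset of the Web Storage API that the sketch relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface DocEntry {
  id: string;
  markdown: string;
  generatedAt: Date;
  title?: string;
  edited?: boolean;
  written?: boolean;
}

// JSON turns Date into an ISO string, so the stored shape differs from DocEntry.
type StoredEntry = Omit<DocEntry, "generatedAt"> & { generatedAt: string };
type HistoryByWorkflow = Record<string, DocEntry[]>;

// Returns false instead of throwing when storage is unavailable or over quota.
function saveHistory(store: KeyValueStore, history: HistoryByWorkflow): boolean {
  try {
    store.setItem(STORAGE_KEY, JSON.stringify(history));
    return true;
  } catch {
    return false;
  }
}

// Returns an empty history when nothing is stored or the payload is unreadable.
function loadHistory(store: KeyValueStore): HistoryByWorkflow {
  try {
    const raw = store.getItem(STORAGE_KEY);
    if (raw === null) return {};
    const parsed = JSON.parse(raw) as Record<string, StoredEntry[]>;
    const revived: HistoryByWorkflow = {};
    for (const workflowId of Object.keys(parsed)) {
      const entries = parsed[workflowId] || [];
      revived[workflowId] = entries.map(entry => ({
        ...entry,
        generatedAt: new Date(entry.generatedAt), // revive ISO string to Date
      }));
    }
    return revived;
  } catch {
    return {};
  }
}
```

In a browser, `window.localStorage` satisfies `KeyValueStore`, so the same functions could be called with it unchanged.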

Files

New:
- frontend/src/app/workspace/service/workflow-doc/workflow-doc.service.ts
- frontend/src/app/workspace/component/workflow-doc-panel/
- frontend/src/app/workspace/component/workflow-doc-diff/
- agent-service documentation prompt + endpoint.
Modified:
- agent-service/src/agent/texera-agent.ts
- agent-service/src/server.ts
- frontend/src/app/workspace/component/menu/{menu.component.ts,menu.component.html}
- frontend/src/app/workspace/component/workflow-editor/workflow-editor.component.ts
- frontend/src/app/workspace/service/workflow-graph/model/workflow-graph.ts
- frontend/src/app/app.module.ts (registers NzDrawerModule)
- frontend/package.json (adds diff and file-saver types where needed).

Any related issues, documentation, discussions?

See Discussion #5059

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Anthropic Claude Opus 4.7)

Future Work

Bidirectional report ↔ workflow editing — keep the report and the canvas in sync as both sides change. Editing an operator's description in the report could propagate back to the operator's properties on the canvas; adding or renaming an operator on the canvas could update the corresponding section in the latest report.
Include runtime information in the doc — augment the LLM context with execution metadata (per-operator runtime stats, row counts, errors, last execution time) so generated reports describe not just what the pipeline does structurally but how it actually behaves on the current data.
Richer compare view — collapse unchanged sections with git-style "show N unchanged lines" expanders; make texera:op: links clickable inside the diff to reuse the canvas pan/highlight flow; add per-section change summaries ("Inputs: unchanged, Pipeline Stages: 3 lines changed"); offer a unified-vs-side-by-side toggle and an "ignore whitespace" mode.
Floating, draggable, resizable panel — replace the right-side drawer with a movable, resizable window, so the tool no longer obscures operator details, the run button, or other right-side UI.

Adds a one-click "Document workflow" action to the workflow editor menu. Clicking the button sends the current workflow content to agent-service, which makes a single LLM call using WORKFLOW_DOCUMENTATION_PROMPT and returns structured markdown covering purpose, inputs, pipeline stages, outputs, and caveats. The result renders in a modal panel with a copy-to-clipboard action. Backend: new documentWorkflow() method on TexeraAgent, POST /agents/:id/document-workflow endpoint, and prompt in prompts.ts. Frontend: WorkflowDocService (creates a temporary agent, calls the endpoint, cleans up), WorkflowDocPanelComponent (renders markdown via MarkdownService), and a menu button that drives the flow. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…data, schemas, and author notes Replaces the raw JSON prompt input with a structured context assembled by buildDocumentationContext(). For each operator the context now includes: the human-readable type description from WorkflowSystemMetadata, inlined Python/R UDF code, and per-port output schemas from compileWorkflowAsync(). Data-flow links are expressed using display names. Comment boxes are included as author notes. Compilation failures are handled gracefully. Updates WORKFLOW_DOCUMENTATION_PROMPT to expect the richer structured input and instructs the model to use schema and code information when describing inputs, outputs, and pipeline stages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

First click on the document button opens cached documentation instantly if it exists; subsequent opens skip the LLM call. The modal now shows the generation timestamp and a Re-generate button that reruns the LLM in-place without closing and reopening the dialog. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Make Re-generate the primary call-to-action and demote Copy Markdown to a secondary outlined button. Move the generation timestamp to the left of the header and add a divider so the actions read as a toolbar above the content. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Clicking the document button now opens an intro view that explains what the tool produces before any LLM call is made. The intro exposes "Generate Documentation" (or "Generate New" + "View Latest" when a cached doc exists) and transitions to the doc view, which keeps the existing copy and re-generate actions and adds a back button to return to the intro. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add a click-to-navigate flow from the AI-generated workflow doc to the canvas. The LLM now emits operator references as markdown links of the form [Display Name](texera:op:OPERATOR_ID); the doc panel intercepts clicks on those links, validates the ID against the live graph, and fires a new centerOnOperator event on WorkflowGraph. WorkflowEditor subscribes, pans the JointJS paper to center the target operator, and highlights it using the existing highlight stream. The agent context now exposes each operator's ID alongside its display name so the model can produce valid links, and operator-reference links get a distinctive pill-style render so users notice they are clickable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
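
The click interception described in this commit could look roughly like the sketch below: recognize the `texera:op:` scheme, extract the operator ID, and only act when the ID exists in the live graph. The function names and the `hasOperator` predicate are hypothetical stand-ins; only the link format comes from the commit message.

```typescript
const OPERATOR_LINK_PREFIX = "texera:op:";

// Returns the operator ID encoded in an href, or null for any other link.
function parseOperatorLink(href: string): string | null {
  if (href.indexOf(OPERATOR_LINK_PREFIX) !== 0) return null;
  const id = href.slice(OPERATOR_LINK_PREFIX.length).trim();
  return id.length > 0 ? id : null;
}

// Resolves a clicked href to an operator ID only when the graph still contains it.
// `hasOperator` is an assumed lookup against the live workflow graph.
function resolveOperatorLink(href: string, hasOperator: (id: string) => boolean): string | null {
  const id = parseOperatorLink(href);
  return id !== null && hasOperator(id) ? id : null;
}
```

A click handler would call `resolveOperatorLink` with the anchor's href and fire the centering event only when the result is non-null, leaving ordinary links to the browser.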

The Copy and Re-generate buttons are meaningless before any markdown exists, and showing the Re-generate spinner during the initial run was confusing because the user clicked "Generate", not "Re-generate". Hide the action group entirely until rawMarkdown is populated; the centered loading spinner is the only progress indicator on the first run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace the centered modal with a 520px-wide drawer (NzDrawerService, nzMask: false) so the canvas stays visible and interactive while the documentation is open. Clicking an operator link in the report now pans the canvas and highlights the operator without the user having to close anything first. Side changes wired in to support drawer hosting:
- WorkflowDocService now persists the last-viewed panel (intro vs doc) keyed by workflow ID, so reopening the drawer returns to whichever view the user was on when they closed it.
- WorkflowDocPanelComponent reads its initial view from the modal data and routes every view transition through a setView helper that fires an onViewChange callback back to the service.
- Inject ChangeDetectorRef and call detectChanges() after async state updates: the drawer renders content through a CDK overlay portal which does not propagate Angular's default change detection the way the modal did, so the spinner was previously stuck on success.
- Drop the hard-coded max-height: 60vh on .doc-content and switch the panel to flex layout so it fills the drawer body instead of leaving a dead band of whitespace below the content.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

LLM doc generation can take 30+ seconds and previously the only feedback was an indeterminate spinner. Add a m:ss elapsed-time pill under the spinner that ticks every second so the user knows the request is still in flight. The timer starts on generate() and stops on success or failure; uses tabular-nums so the digits do not jitter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace the single-entry doc cache with a per-workflow append-only history. Each Re-generate prepends a new DocEntry, and the intro page now renders the full history as a clickable list with a "Latest" badge on the most recent entry. Clicking a row loads that report into the doc view; "Back" returns to the intro where the list now includes the new entry. Reopen behavior is unchanged: if the last view was doc, the drawer opens to the most recent entry; otherwise it opens to the intro. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Generate New on the intro page is the single entry point to a fresh report now that the intro shows the full history. The in-place Re-generate button just duplicated that action, so remove it; the doc view header is now back + timestamp + Copy Markdown. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Show a Cancel button below the stopwatch in the loading state. Clicking it unsubscribes the in-flight HttpClient observable (which cancels the underlying XHR); the service's finalize handler then deletes the temp agent on the server so no orphan is left behind. After cancel the panel falls back to the most recent history entry, or to the intro view if no history exists, and shows a confirmation notification. Also scope .doc-loading's 32px icon rule to direct-child <i> so it no longer enlarges the icon inside the cancel button. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add a trash-icon button on each history row that prompts a popconfirm before removing the entry from both the panel's local copy and the service's canonical history. If the entry being viewed is the one deleted, the panel falls back to the next most recent entry or the intro view. generateDocumentation now emits the canonical DocEntry instead of a plain markdown string so service-side and panel-side history share references and reference-equality deletion works. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Track the active NzDrawerRef on MenuComponent and bail early if it already exists. Clear the ref via afterClose so subsequent clicks reopen the drawer normally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Browsers throttle setInterval (and rxjs interval) when the tab is backgrounded, so the stopwatch counter would lag or freeze. Derive the displayed elapsed time from wall-clock subtraction instead and let the interval just trigger re-renders. When the tab regains focus, the next tick reads Date.now() and the display catches up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
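
A small sketch of the wall-clock approach from this commit: the displayed value is always derived from the start time and the current time, so a throttled timer can only delay a repaint, not skew the count. `formatElapsed` is a hypothetical name; the m:ss format follows the earlier stopwatch commit.

```typescript
// Formats the time between two epoch-millisecond readings as m:ss.
function formatElapsed(startedAtMs: number, nowMs: number): string {
  const totalSeconds = Math.max(0, Math.floor((nowMs - startedAtMs) / 1000));
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes}:${seconds < 10 ? "0" : ""}${seconds}`;
}
```

The interval callback would only need to call `formatElapsed(startedAt, Date.now())` and trigger a re-render; it never increments a counter of its own.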

Add a compare mode to the history list: clicking "Compare" turns each row into a multi-select with checkboxes. Once two reports are selected, "Compare selected" opens an NzModal that sits on top of the drawer showing a side-by-side diff of the raw markdown. The diff uses diff.diffLines for line-level chunks and diff.diffWordsWithSpace within each removed/added pair so only the actually-changed words are highlighted instead of the entire line. Header bar shows base/head timestamps and -removed/+added line counts. This feature is intentionally an MVP and has plenty of room to grow:
- collapse large unchanged sections (git-style "show N unchanged lines" expanders) so long reports stay scannable
- make texera:op:OPERATOR_ID links clickable within the diff to reuse the existing canvas pan/highlight flow
- keyboard j/k navigation between change hunks; sticky section header while scrolling
- a "Reports are functionally identical (whitespace only)" hint when line-diff finds no semantic changes
- toggle between side-by-side and unified layouts; toggle for ignore-whitespace mode
- semantic awareness of markdown links so a re-emitted operator ID doesn't get flagged when only the surrounding prose changed
The current implementation diffs raw markdown text, so reformatting (wrapping, list reorder) shows up as changes even when the meaning is the same. Acceptable for a demo; the items above would tighten it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add an Edit mode to the doc view that swaps the rendered markdown for a textarea. Save overwrites the entry in place: updates markdown and generatedAt, marks the entry edited=true, and the timestamp label flips from "Generated" to "Updated" with an orange "edited" pill. Navigating away from an active edit with unsaved changes (Back, View another entry) pops an NzModal confirming "Discard unsaved changes?" before proceeding; nzOnOk forces detectChanges because the modal runs in a CDK portal outside the drawer's CD context. Editing state persists across drawer close/reopen via a new editingStates map keyed by workflow ID on WorkflowDocService. The panel pushes updates on every keystroke (ngModelChange) and clears the state on save/cancel/discard. Deleting the entry being edited also clears the orphan editing state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add a "Write Your Own" button on the intro page next to "Generate New" that drops the user into the doc view in edit mode with an empty textarea. Saving creates a new history entry marked written=true (the timestamp label flips to "Written" and a purple "written" badge sits next to it, distinct from the orange "edited" badge on LLM docs that were later modified by the user). Drafting persists across drawer close/reopen alongside existing editing-state persistence: DocEditingState.entry is now nullable so an in-progress draft survives just like in-progress edits of existing entries. Cancelling a draft always returns to the intro, regardless of whether history exists. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Each history row now uses an icon that mirrors the action that produced it: blue thunderbolt for AI-generated reports, purple edit pencil for user-written ones. Hover tooltip says "AI-generated" or "Written by you" to remove ambiguity. The existing written/edited badges in the doc-view header remain to reinforce the source once an entry is open. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Drop the standalone "written" pill and color the entire timestamp line purple for written entries and blue for AI-generated/updated ones, so the source signal lives where the user is already reading. The orange "edited" pill still appears beside the timestamp on LLM docs that the user later modified. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Workflow doc history, last-view, and editing state now survive page refresh and browser restart. All three are serialized under a single versioned key (texera.workflowDoc.v1) and written after every mutation on WorkflowDocService. The constructor loads on startup and revives generatedAt ISO strings back to Date objects. To make the editing-state restore work across sessions, DocEntry gains a stable id (crypto.randomUUID with a timestamp fallback) and DocEditingState.entry becomes entryId. The panel restores the in-progress entry by looking it up in history by id; isViewingEntry also switches to id-based equality so it survives a JSON round-trip cleanly. localStorage availability and quota failures are swallowed; the feature degrades to the previous in-memory behavior rather than crashing. No cross-tab sync in v1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
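
One way the stable id could be produced is sketched below: prefer `randomUUID` when the runtime exposes it, otherwise fall back to a timestamp-based string. `UuidSource`, `defaultUuidSource`, and `newEntryId` are hypothetical names, and the random suffix on the fallback is an added assumption (the commit only says "timestamp fallback") meant to reduce collisions within the same millisecond.

```typescript
// Anything that may or may not offer randomUUID, e.g. the global crypto object.
interface UuidSource {
  randomUUID?: () => string;
}

// Looks up the global crypto object without assuming it is declared or present.
function defaultUuidSource(): UuidSource {
  const globals = globalThis as unknown as { crypto?: UuidSource };
  return globals.crypto || {};
}

// Returns a UUID when available, otherwise a timestamp-based id.
function newEntryId(source: UuidSource = defaultUuidSource()): string {
  if (typeof source.randomUUID === "function") {
    // Called as a method so `this` stays bound to the crypto object.
    return source.randomUUID();
  }
  // Assumption: the random suffix is not in the PR; it only guards against
  // two ids generated within the same millisecond.
  return `${Date.now()}-${Math.random().toString(36).slice(2, 10)}`;
}
```

Because entries are matched by this id after a JSON round-trip, it only needs to be unique within the stored history rather than globally.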

Cancelling an in-flight LLM generation previously fell back to the most recent history entry if any existed, but the user clicked Generate from the intro and expects to return there. Mirror the "Write Your Own" cancel behavior and always go back to the intro view on cancel. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The drawer's CDK overlay is anchored to the page, not to MenuComponent, so navigating away from the workspace left it hanging on top of the destination route. Close docDrawerRef in MenuComponent.ngOnDestroy so the drawer disappears with the rest of the workspace UI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Persisting lastView across workflow open/close cycles felt wrong — the user expects a fresh start on returning, not to resume mid-doc. Reset the workflow's lastView to "intro" in MenuComponent.ngOnDestroy so the next open lands on the intro page. In-progress editing state is still preserved so an accidental navigation does not silently destroy a draft or edit; the editing-state branch in the menu still routes straight back to doc view in that case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add a copy icon next to the delete icon on each history row. Clicking it calls a new WorkflowDocService.duplicateEntry that creates a fresh DocEntry with a new id and timestamp, the source markdown verbatim, and the original entry's source flags (written/edited) preserved so the duplicate keeps the same icon and badge in the list. The duplicate lands at the top of history with a "Report duplicated" notification; the user stays on the intro so they can immediately duplicate, view, or delete other entries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add an optional title field to DocEntry. Each history row gets a pencil icon that swaps the row content for an inline text input; the doc-view header gains a title row above the timestamp with the same inline-edit affordance. Enter or blur saves, Escape cancels, empty titles unset the field rather than storing the empty string. Renames persist through WorkflowDocService.renameEntry and survive page refresh. Knock-on tweaks:
- duplicateEntry carries the source title with a "(copy)" suffix
- diff modal header shows each report's title (falling back to "Base" / "Head") so comparing renamed reports reads naturally
- doc-history-time gets ellipsis truncation so long titles don't blow out the row
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
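
The title rules in this commit are small enough to sketch as two pure functions. Both names are hypothetical, and two details are assumptions rather than facts from the PR: that whitespace-only input counts as empty, and that duplicating an untitled entry leaves the copy untitled.

```typescript
// Empty (or, by assumption, whitespace-only) input unsets the title
// instead of storing an empty string.
function normalizeTitle(input: string): string | undefined {
  const trimmed = input.trim();
  return trimmed.length > 0 ? trimmed : undefined;
}

// A duplicate carries the source title with a "(copy)" suffix.
// Assumption: an untitled source produces an untitled duplicate.
function duplicateTitle(sourceTitle: string | undefined): string | undefined {
  return sourceTitle === undefined ? undefined : `${sourceTitle} (copy)`;
}
```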

Make the intro section fill the drawer body via flex layout so the history list can claim leftover vertical space and scroll on its own (overflow-y: auto + flex: 1 + min-height: 0). The fixed pieces — icon, title, description, features list, tip, and action buttons — stay put no matter how many reports are in history. Bump the list's outer border from #f0f0f0 to #bfbfbf so it reads as a defined container rather than fading into the drawer background. Same gray used by the chevron and rename icons for visual consistency. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Saving an edit now reorders the entry to position 0 in both the service's canonical history and the panel's local history copy, so the "Latest" badge follows the most-recently-touched report. Renames do not reorder since they only change metadata. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
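
The reorder-on-save behavior can be illustrated with a pure function over the history array. This is a sketch rather than the service's code: `saveEdit` is a hypothetical name, and it returns a new array instead of mutating in place, which keeps the example easy to check.

```typescript
interface DocEntry {
  id: string;
  markdown: string;
  generatedAt: Date;
  edited?: boolean;
  written?: boolean;
}

// Applies an edit to one entry and moves it to position 0.
// Entries that were not edited keep their relative order.
function saveEdit(history: readonly DocEntry[], entryId: string, markdown: string, now: Date): DocEntry[] {
  let target: DocEntry | undefined;
  const others: DocEntry[] = [];
  for (const entry of history) {
    if (target === undefined && entry.id === entryId) {
      target = entry;
    } else {
      others.push(entry);
    }
  }
  if (target === undefined) return others; // unknown id: return an unchanged copy
  const updated: DocEntry = { ...target, markdown, generatedAt: now, edited: true };
  return [updated, ...others];
}
```

A rename would skip this path entirely, since it only changes metadata and should not affect ordering.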

Add an Angular animations trigger on .doc-panel keyed on the current view. intro => doc slides the doc view in from the right with a fade; doc => intro slides the intro in from the left with a fade. 250ms ease-out — snappy enough not to feel like a delay. Initial drawer open has no transition for void => * so the inner animation does not fight the drawer's own slide-in. The doc view's children now live inside a .doc-view-container div (replacing the bare ng-container) so the animation has a real host element to query :enter on. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace the Copy Markdown button with a Download button that writes the rendered report's raw markdown as a .md file via file-saver. The filename is built as "<Workflow Name> - <Report Title>.md", or "<Workflow Name> - workflow-doc-<ISO timestamp>.md" when the entry has no custom title; filesystem-illegal characters are sanitized to underscores and the total length is capped at 160. The copied/check toggle state is gone since downloads do not need that confirmation pulse. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
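
The filename rule above could be sketched as follows. The name pattern, the underscore replacement, and the 160-character cap come from this commit; the exact set of characters treated as illegal, and the choice to count the `.md` extension toward the cap, are assumptions, as is the function name `buildDocFilename`.

```typescript
const MAX_FILENAME_LENGTH = 160;
const EXTENSION = ".md";
// Assumption: characters rejected by common filesystems, plus control characters.
const ILLEGAL_FILENAME_CHARS = /[\\/:*?"<>|\u0000-\u001f]/g;

// Builds "<Workflow Name> - <Report Title>.md", falling back to an ISO timestamp
// label when the entry has no custom title.
function buildDocFilename(workflowName: string, title: string | undefined, generatedAt: Date): string {
  const hasTitle = title !== undefined && title.trim().length > 0;
  const label = hasTitle ? (title as string).trim() : `workflow-doc-${generatedAt.toISOString()}`;
  const base = `${workflowName} - ${label}`.replace(ILLEGAL_FILENAME_CHARS, "_");
  // Truncate the base so the full name, extension included, stays within the cap.
  return base.slice(0, MAX_FILENAME_LENGTH - EXTENSION.length) + EXTENSION;
}
```

Note that the colons in an ISO timestamp are themselves sanitized, so the fallback name stays valid on filesystems that reject `:`.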

zyratlo and others added 30 commits May 14, 2026 17:56

fix(agent-service): guard comment box comments against non-string values

28b64b7

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions Bot assigned zyratlo May 16, 2026

github-actions Bot added the dependencies, frontend, and agent-service labels May 16, 2026

zyratlo changed the title ~~feat: AI workflow documentation tool with history, editing, and diff~~ [Hackathon] feat: AI workflow documentation tool with history, editing, and diff May 16, 2026

zyratlo marked this pull request as ready for review May 16, 2026 05:05

zyratlo wants to merge 31 commits into apache:main from zyratlo:ryan-zhang-hackathon
