
[Hackathon] feat: AI workflow documentation tool with history, editing, and diff #5088

Open
zyratlo wants to merge 31 commits into apache:main from zyratlo:ryan-zhang-hackathon

Conversation

@zyratlo
Contributor

@zyratlo zyratlo commented May 16, 2026

Demo Video

https://drive.google.com/file/d/1PcP_N_hucajoNYRuUK9bxhqcg7jfvic9/view?usp=sharing

What changes were proposed in this PR?

Adds a new AI Workflow Documentation tool to the workspace, accessible from the existing menu bar via the document-text icon. The tool lives in a right-side nz-drawer that doesn't block the canvas, and supports three main flows: generating documentation with an LLM, writing it manually, and managing a per-workflow history of reports.

Generating a report

  • Click Generate Documentation on the intro page. The backend agent-service calls the LLM with a structured context of the current workflow (operator IDs, display names, types, descriptions, properties, UDF source code, compiled output schemas, link graph, and author comment-box notes) and a prompt instructing the model to produce sections for Purpose / Inputs / Pipeline Stages / Outputs / Caveats.
  • The prompt requires the LLM to format every operator mention as a markdown link of the form [Display Name](texera:op:OPERATOR_ID). Clicking one of these links in the rendered doc pans the canvas to center the operator and highlights it.
  • A centered spinner and stopwatch give feedback during the wait. A Cancel button unsubscribes the request and cleans up the temp agent via the service's finalize handler.
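
A minimal sketch of how a click handler might recognize operator links, assuming the texera:op: link scheme mentioned under Future Work and a set of live operator IDs supplied by the caller. The names parseOperatorLink and resolveOperatorClick are hypothetical and are not taken from this PR.

```typescript
// Hypothetical helpers: the PR does not show the panel's actual click handler.
const OPERATOR_LINK_PREFIX = "texera:op:";

/** Returns the operator ID encoded in an href, or null if it is not an operator link. */
function parseOperatorLink(href: string): string | null {
  if (!href.startsWith(OPERATOR_LINK_PREFIX)) {
    return null;
  }
  const id = href.slice(OPERATOR_LINK_PREFIX.length).trim();
  return id.length > 0 ? id : null;
}

/** Resolves a clicked href to an operator ID only if that operator still exists on the canvas. */
function resolveOperatorClick(href: string, liveOperatorIds: ReadonlySet<string>): string | null {
  const id = parseOperatorLink(href);
  return id !== null && liveOperatorIds.has(id) ? id : null;
}
```

Validating against the live graph before panning means a link to an operator that was deleted after the report was generated is simply ignored.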

Writing your own

  • Write Your Own opens the doc view in edit mode with a blank text area. Saving creates a new history entry tagged written: true.
  • An Edit button on any existing entry swaps the rendered markdown for a text area pre-filled with the current content. Saving updates the entry in place, marks it edited: true, refreshes its timestamp, and moves it to the top of the history. Source labels in the timestamp adjust accordingly ("Generated" → blue, "Updated" → blue + orange edited pill, "Written" → purple).
  • Both the editor and any in-progress draft are persisted, so closing the drawer mid-edit and reopening it lands the user back in the editor with their unsaved text intact. Navigating away from the entry (Back, viewing another report) pops a "Discard unsaved changes?" confirm.
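
The save-in-place behavior above could be sketched roughly as follows. The DocEntry shape is assembled from field names that appear in this PR (id, markdown, generatedAt, edited, written, title); saveEdit is a hypothetical name, and this version returns a new array rather than mutating the service's canonical history.

```typescript
// Assumed entry shape, pieced together from the field names in the PR description.
interface DocEntry {
  id: string;
  markdown: string;
  generatedAt: Date;
  edited?: boolean;
  written?: boolean;
  title?: string;
}

/**
 * Applies an edit to one entry: replaces its markdown, refreshes the timestamp,
 * marks it edited, and moves it to the front of the history.
 * Returns an unchanged copy when the entry is not found.
 */
function saveEdit(history: readonly DocEntry[], entryId: string, markdown: string, now: Date): DocEntry[] {
  const target = history.find(entry => entry.id === entryId);
  if (target === undefined) {
    return [...history];
  }
  const updated: DocEntry = { ...target, markdown, generatedAt: now, edited: true };
  return [updated, ...history.filter(entry => entry.id !== entryId)];
}
```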

Per-workflow history

  • Every generated/written/edited entry is recorded in a list on the intro page. Rows show the title (or timestamp fallback), a "Latest" badge on the newest, plus icons that mirror the report's origin (blue thunderbolt for AI, purple edit pencil for hand-written).
  • Each row supports Rename (inline editable title field, with the rename also exposed inside the doc view's header), Duplicate (carries title with a (copy) suffix), and Delete (with popconfirm). The list scrolls internally so long histories don't blow out the drawer.
  • A Compare mode lets the user pick any two entries and opens a side-by-side diff. The diff is line-level via diff.diffLines, but each (removed, added) pair runs through diffWordsWithSpace so only the actually-changed words light up — entirely rewritten lines are still uniformly colored. The compare modal shows entry titles (or "Base"/"Head") plus −N / +M totals.
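
A sketch of the pairing step between the line-level and word-level passes. LineChange is a local stand-in for the chunk objects returned by diff.diffLines (only the value, added, and removed fields are used), and pairChanges is a hypothetical name. The sketch assumes a removed chunk is immediately followed by its added counterpart.

```typescript
// Local stand-in for the chunks produced by a line-level diff.
interface LineChange {
  value: string;
  added?: boolean;
  removed?: boolean;
}

interface DiffRow {
  kind: "unchanged" | "removed" | "added" | "modified";
  left: string;  // text shown in the base column
  right: string; // text shown in the head column
}

/** Counts lines in a chunk, ignoring the empty piece after a trailing newline. */
function countLines(text: string): number {
  if (text.length === 0) {
    return 0;
  }
  const parts = text.split("\n");
  return text.endsWith("\n") ? parts.length - 1 : parts.length;
}

/** Groups adjacent removed/added chunks into "modified" rows and tallies line totals. */
function pairChanges(changes: readonly LineChange[]): { rows: DiffRow[]; removed: number; added: number } {
  const rows: DiffRow[] = [];
  let removed = 0;
  let added = 0;
  for (let i = 0; i < changes.length; i++) {
    const current = changes[i];
    if (current === undefined) {
      continue;
    }
    const next = changes[i + 1];
    if (current.removed) {
      removed += countLines(current.value);
      if (next !== undefined && next.added) {
        added += countLines(next.value);
        rows.push({ kind: "modified", left: current.value, right: next.value });
        i++; // the added chunk has been consumed as part of this pair
      } else {
        rows.push({ kind: "removed", left: current.value, right: "" });
      }
    } else if (current.added) {
      added += countLines(current.value);
      rows.push({ kind: "added", left: "", right: current.value });
    } else {
      rows.push({ kind: "unchanged", left: current.value, right: current.value });
    }
  }
  return { rows, removed, added };
}
```

Only the "modified" rows would then be handed to diffWordsWithSpace for word-level highlighting; the removed and added totals correspond to the −N / +M figures in the modal header.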

Persistence

  • History, last-viewed-page-per-workflow, and the in-progress editing state are all serialized to localStorage under a single versioned key (texera.workflowDoc.v1). The constructor revives Date fields on load; every mutation writes back. DocEntry gets a stable id (crypto.randomUUID() with a timestamp fallback) so the editing-state restore can match entries across JSON round-trips. Quota / availability failures degrade silently to the previous in-memory behavior.
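
A rough sketch of the save/load round trip under the versioned key, with Date revival and silent failure handling. The storage is passed in as a small interface so the example runs outside a browser; KeyValueStore, StoredEntry, DocHistory, saveHistory, and loadHistory are hypothetical names, and the persisted shape here (history only) is a simplification of what the PR describes.

```typescript
interface DocEntry {
  id: string;
  markdown: string;
  generatedAt: Date;
}

// JSON form of an entry: the Date has become an ISO string.
interface StoredEntry {
  id: string;
  markdown: string;
  generatedAt: string;
}

// Minimal subset of the Web Storage API (localStorage satisfies it in a browser).
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORAGE_KEY = "texera.workflowDoc.v1";

type DocHistory = Record<string, DocEntry[]>; // workflow ID -> entries

/** Writes the history; quota or availability errors are swallowed. */
function saveHistory(store: KeyValueStore, history: DocHistory): void {
  try {
    store.setItem(STORAGE_KEY, JSON.stringify({ history }));
  } catch {
    // Degrade to in-memory behavior.
  }
}

/** Reads the history and revives generatedAt strings into Date objects. */
function loadHistory(store: KeyValueStore): DocHistory {
  try {
    const raw = store.getItem(STORAGE_KEY);
    if (raw === null) {
      return {};
    }
    const parsed = JSON.parse(raw) as { history?: Record<string, StoredEntry[]> };
    const stored: Record<string, StoredEntry[]> = parsed.history ?? {};
    const result: DocHistory = {};
    for (const workflowId of Object.keys(stored)) {
      const entries = stored[workflowId] ?? [];
      result[workflowId] = entries.map(entry => ({ ...entry, generatedAt: new Date(entry.generatedAt) }));
    }
    return result;
  } catch {
    return {};
  }
}
```

Bumping the version suffix in the key is one way to ignore data written by an incompatible earlier format without a migration step.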

Files

  • New:
    • frontend/src/app/workspace/service/workflow-doc/workflow-doc.service.ts
    • frontend/src/app/workspace/component/workflow-doc-panel/
    • frontend/src/app/workspace/component/workflow-doc-diff/
    • agent-service documentation prompt + endpoint.
  • Modified:
    • agent-service/src/agent/texera-agent.ts
    • agent-service/src/server.ts
    • frontend/src/app/workspace/component/menu/{menu.component.ts,menu.component.html}
    • frontend/src/app/workspace/component/workflow-editor/workflow-editor.component.ts
    • frontend/src/app/workspace/service/workflow-graph/model/workflow-graph.ts
    • frontend/src/app/app.module.ts (registers NzDrawerModule)
    • frontend/package.json (adds diff and file-saver types where needed).

Any related issues, documentation, discussions?

See Discussion #5059

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Anthropic Claude Opus 4.7)

Future Work

  • Bidirectional report ↔ workflow editing — keep the report and the canvas in sync as both sides change. Editing an operator's description in the report could propagate back to the operator's properties on the canvas; adding or renaming an operator on the canvas could update the corresponding section in the latest report.
  • Include runtime information in the doc — augment the LLM context with execution metadata (per-operator runtime stats, row counts, errors, last execution time) so generated reports describe not just what the pipeline does structurally but how it actually behaves on the current data.
  • Richer compare view — collapse unchanged sections with git-style "show N unchanged lines" expanders; make texera:op: links clickable inside the diff to reuse the canvas pan/highlight flow; add per-section change summaries ("Inputs: unchanged, Pipeline Stages: 3 lines changed"); offer a unified-vs-side-by-side toggle and an "ignore whitespace" mode.
  • Floating, draggable, resizable panel — replace the right-side drawer with a movable, resizable window, so the tool no longer obscures operator details, the run button, or other right-side UI.

zyratlo and others added 30 commits May 14, 2026 17:56
Adds a one-click "Document workflow" action to the workflow editor menu.
Clicking the button sends the current workflow content to agent-service,
which makes a single LLM call using WORKFLOW_DOCUMENTATION_PROMPT and
returns structured markdown covering purpose, inputs, pipeline stages,
outputs, and caveats. The result renders in a modal panel with a
copy-to-clipboard action.

Backend: new documentWorkflow() method on TexeraAgent, POST
/agents/:id/document-workflow endpoint, and prompt in prompts.ts.
Frontend: WorkflowDocService (creates a temporary agent, calls the
endpoint, cleans up), WorkflowDocPanelComponent (renders markdown via
MarkdownService), and a menu button that drives the flow.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…data, schemas, and author notes

Replaces the raw JSON prompt input with a structured context assembled by
buildDocumentationContext(). For each operator the context now includes:
the human-readable type description from WorkflowSystemMetadata, inlined
Python/R UDF code, and per-port output schemas from compileWorkflowAsync().
Data-flow links are expressed using display names. Comment boxes are
included as author notes. Compilation failures are handled gracefully.

Updates WORKFLOW_DOCUMENTATION_PROMPT to expect the richer structured
input and instructs the model to use schema and code information when
describing inputs, outputs, and pipeline stages.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
First click on the document button opens cached documentation instantly
if it exists; subsequent opens skip the LLM call. The modal now shows the
generation timestamp and a Re-generate button that reruns the LLM in-place
without closing and reopening the dialog.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make Re-generate the primary call-to-action and demote Copy Markdown to a
secondary outlined button. Move the generation timestamp to the left of the
header and add a divider so the actions read as a toolbar above the content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Clicking the document button now opens an intro view that explains what the
tool produces before any LLM call is made. The intro exposes "Generate
Documentation" (or "Generate New" + "View Latest" when a cached doc exists)
and transitions to the doc view, which keeps the existing copy and
re-generate actions and adds a back button to return to the intro.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a click-to-navigate flow from the AI-generated workflow doc to the
canvas. The LLM now emits operator references as markdown links of the
form [Display Name](texera:op:OPERATOR_ID); the doc panel intercepts
clicks on those links, validates the ID against the live graph, and
fires a new centerOnOperator event on WorkflowGraph. WorkflowEditor
subscribes, pans the JointJS paper to center the target operator, and
highlights it using the existing highlight stream.

The agent context now exposes each operator's ID alongside its display
name so the model can produce valid links, and operator-reference links
get a distinctive pill-style render so users notice they are clickable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Copy and Re-generate buttons are meaningless before any markdown
exists, and showing the Re-generate spinner during the initial run was
confusing because the user clicked "Generate", not "Re-generate". Hide
the action group entirely until rawMarkdown is populated; the centered
loading spinner is the only progress indicator on the first run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the centered modal with a 520px-wide drawer (NzDrawerService,
nzMask: false) so the canvas stays visible and interactive while the
documentation is open. Clicking an operator link in the report now pans
the canvas and highlights the operator without the user having to close
anything first.

Side changes wired in to support drawer hosting:
- WorkflowDocService now persists the last-viewed panel (intro vs doc)
  keyed by workflow ID, so reopening the drawer returns to whichever
  view the user was on when they closed it.
- WorkflowDocPanelComponent reads its initial view from the modal data
  and routes every view transition through a setView helper that fires
  an onViewChange callback back to the service.
- Inject ChangeDetectorRef and call detectChanges() after async state
  updates: the drawer renders content through a CDK overlay portal
  which does not propagate Angular's default change detection the way
  the modal did, so the spinner was previously stuck on success.
- Drop the hard-coded max-height: 60vh on .doc-content and switch the
  panel to flex layout so it fills the drawer body instead of leaving
  a dead band of whitespace below the content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LLM doc generation can take 30+ seconds and previously the only feedback
was an indeterminate spinner. Add a m:ss elapsed-time pill under the
spinner that ticks every second so the user knows the request is still
in flight. The timer starts on generate() and stops on success or
failure; uses tabular-nums so the digits do not jitter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the single-entry doc cache with a per-workflow append-only
history. Each Re-generate prepends a new DocEntry, and the intro page
now renders the full history as a clickable list with a "Latest" badge
on the most recent entry. Clicking a row loads that report into the doc
view; "Back" returns to the intro where the list now includes the new
entry.

Reopen behavior is unchanged: if the last view was doc, the drawer
opens to the most recent entry; otherwise it opens to the intro.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Generate New on the intro page is the single entry point to a fresh
report now that the intro shows the full history. The in-place
Re-generate button just duplicated that action, so remove it; the doc
view header is now back + timestamp + Copy Markdown.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Show a Cancel button below the stopwatch in the loading state.
Clicking it unsubscribes the in-flight HttpClient observable (which
cancels the underlying XHR); the service's finalize handler then
deletes the temp agent on the server so no orphan is left behind.
After cancel the panel falls back to the most recent history entry,
or to the intro view if no history exists, and shows a confirmation
notification.

Also scope .doc-loading's 32px icon rule to direct-child <i> so it
no longer enlarges the icon inside the cancel button.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a trash-icon button on each history row that prompts a popconfirm
before removing the entry from both the panel's local copy and the
service's canonical history. If the entry being viewed is the one
deleted, the panel falls back to the next most recent entry or the
intro view.

generateDocumentation now emits the canonical DocEntry instead of a
plain markdown string so service-side and panel-side history share
references and reference-equality deletion works.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Track the active NzDrawerRef on MenuComponent and bail early if it
already exists. Clear the ref via afterClose so subsequent clicks
reopen the drawer normally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Browsers throttle setInterval (and rxjs interval) when the tab is
backgrounded, so the stopwatch counter would lag or freeze. Derive
the displayed elapsed time from wall-clock subtraction instead and
let the interval just trigger re-renders. When the tab regains
focus, the next tick reads Date.now() and the display catches up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
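
The wall-clock approach in the commit above can be sketched as a pure formatting function. formatElapsed is a hypothetical name; the m:ss format comes from the earlier stopwatch commit.

```typescript
/**
 * Formats the elapsed time between two epoch-millisecond readings as m:ss.
 * The value is derived from the clock on every call, so a throttled or
 * skipped timer tick cannot make the display drift.
 */
function formatElapsed(startedAtMs: number, nowMs: number): string {
  const totalSeconds = Math.max(0, Math.floor((nowMs - startedAtMs) / 1000));
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes}:${seconds < 10 ? "0" : ""}${seconds}`;
}
```

A component would record Date.now() when generation starts and call formatElapsed(start, Date.now()) from its interval callback, so the interval only triggers re-renders.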
Add a compare mode to the history list: clicking "Compare" turns each
row into a multi-select with checkboxes. Once two reports are selected,
"Compare selected" opens an NzModal that sits on top of the drawer
showing a side-by-side diff of the raw markdown. The diff uses
diff.diffLines for line-level chunks and diff.diffWordsWithSpace within
each removed/added pair so only the actually-changed words are
highlighted instead of the entire line. Header bar shows base/head
timestamps and -removed/+added line counts.

This feature is intentionally an MVP and has plenty of room to grow:
- collapse large unchanged sections (git-style "show N unchanged lines"
  expanders) so long reports stay scannable
- make texera:op:OPERATOR_ID links clickable within the diff to reuse
  the existing canvas pan/highlight flow
- keyboard j/k navigation between change hunks; sticky section header
  while scrolling
- a "Reports are functionally identical (whitespace only)" hint when
  line-diff finds no semantic changes
- toggle between side-by-side and unified layouts; toggle for
  ignore-whitespace mode
- semantic awareness of markdown links so a re-emitted operator ID
  doesn't get flagged when only the surrounding prose changed

The current implementation diffs raw markdown text, so reformatting
(wrapping, list reorder) shows up as changes even when the meaning is
the same. Acceptable for a demo; the items above would tighten it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add an Edit mode to the doc view that swaps the rendered markdown for
a textarea. Save overwrites the entry in place: updates markdown and
generatedAt, marks the entry edited=true, and the timestamp label
flips from "Generated" to "Updated" with an orange "edited" pill.

Navigating away from an active edit with unsaved changes (Back, View
another entry) pops an NzModal confirming "Discard unsaved changes?"
before proceeding; nzOnOk forces detectChanges because the modal runs
in a CDK portal outside the drawer's CD context.

Editing state persists across drawer close/reopen via a new
editingStates map keyed by workflow ID on WorkflowDocService. The
panel pushes updates on every keystroke (ngModelChange) and clears
the state on save/cancel/discard. Deleting the entry being edited
also clears the orphan editing state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a "Write Your Own" button on the intro page next to "Generate New"
that drops the user into the doc view in edit mode with an empty
textarea. Saving creates a new history entry marked written=true (the
timestamp label flips to "Written" and a purple "written" badge sits
next to it, distinct from the orange "edited" badge on LLM docs that
were later modified by the user).

Drafting persists across drawer close/reopen alongside existing
editing-state persistence: DocEditingState.entry is now nullable so an
in-progress draft survives just like in-progress edits of existing
entries. Cancelling a draft always returns to the intro, regardless of
whether history exists.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each history row now uses an icon that mirrors the action that
produced it: blue thunderbolt for AI-generated reports, purple edit
pencil for user-written ones. Hover tooltip says "AI-generated" or
"Written by you" to remove ambiguity. The existing written/edited
badges in the doc-view header remain to reinforce the source once
an entry is open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop the standalone "written" pill and color the entire timestamp line
purple for written entries and blue for AI-generated/updated ones, so
the source signal lives where the user is already reading. The orange
"edited" pill still appears beside the timestamp on LLM docs that the
user later modified.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Workflow doc history, last-view, and editing state now survive page
refresh and browser restart. All three are serialized under a single
versioned key (texera.workflowDoc.v1) and written after every mutation
on WorkflowDocService. The constructor loads on startup and revives
generatedAt ISO strings back to Date objects.

To make the editing-state restore work across sessions, DocEntry gains
a stable id (crypto.randomUUID with a timestamp fallback) and
DocEditingState.entry becomes entryId. The panel restores the
in-progress entry by looking it up in history by id; isViewingEntry also
switches to id-based equality so it survives a JSON round-trip cleanly.

localStorage availability and quota failures are swallowed — the
feature degrades to the previous in-memory behavior rather than
crashing. No cross-tab sync in v1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cancelling an in-flight LLM generation previously fell back to the most
recent history entry if any existed, but the user clicked Generate from
the intro and expects to return there. Mirror the "Write Your Own"
cancel behavior and always go back to the intro view on cancel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The drawer's CDK overlay is anchored to the page, not to MenuComponent,
so navigating away from the workspace left it hanging on top of the
destination route. Close docDrawerRef in MenuComponent.ngOnDestroy so
the drawer disappears with the rest of the workspace UI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Persisting lastView across workflow open/close cycles felt wrong — the
user expects a fresh start on returning, not to resume mid-doc. Reset
the workflow's lastView to "intro" in MenuComponent.ngOnDestroy so the
next open lands on the intro page. In-progress editing state is still
preserved so an accidental navigation does not silently destroy a
draft or edit; the editing-state branch in the menu still routes
straight back to doc view in that case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a copy icon next to the delete icon on each history row. Clicking
it calls a new WorkflowDocService.duplicateEntry that creates a fresh
DocEntry with a new id and timestamp, the source markdown verbatim,
and the original entry's source flags (written/edited) preserved so
the duplicate keeps the same icon and badge in the list. The
duplicate lands at the top of history with a "Report duplicated"
notification; the user stays on the intro so they can immediately
duplicate, view, or delete other entries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
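
A sketch of what duplication might look like as a pure function, assuming the DocEntry fields named in this PR. The new id and timestamp are passed in by the caller so the example stays deterministic; the "(copy)" title suffix follows the behavior stated in the PR description. This is an illustration, not the body of WorkflowDocService.duplicateEntry.

```typescript
// Assumed entry shape, pieced together from the field names in the PR.
interface DocEntry {
  id: string;
  markdown: string;
  generatedAt: Date;
  edited?: boolean;
  written?: boolean;
  title?: string;
}

/**
 * Creates a copy with a fresh id and timestamp. Markdown and the
 * written/edited flags are carried over so the duplicate keeps the same
 * icon and badge; a title, if present, gains a " (copy)" suffix.
 */
function duplicateDocEntry(source: DocEntry, newId: string, now: Date): DocEntry {
  const copy: DocEntry = { ...source, id: newId, generatedAt: now };
  if (source.title !== undefined) {
    copy.title = `${source.title} (copy)`;
  }
  return copy;
}
```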
Add an optional title field to DocEntry. Each history row gets a pencil
icon that swaps the row content for an inline text input; the doc-view
header gains a title row above the timestamp with the same inline-edit
affordance. Enter or blur saves, Escape cancels, empty titles unset the
field rather than storing the empty string. Renames persist through
WorkflowDocService.renameEntry and survive page refresh.

Knock-on tweaks:
- duplicateEntry carries the source title with a "(copy)" suffix
- diff modal header shows each report's title (falling back to
  "Base" / "Head") so comparing renamed reports reads naturally
- doc-history-time gets ellipsis truncation so long titles don't
  blow out the row

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
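
The rule that empty titles unset the field rather than storing an empty string could be sketched like this. applyRename is a hypothetical name, and trimming surrounding whitespace before the emptiness check is an assumption that the commit message does not state.

```typescript
// Assumed entry shape; only the optional title matters for this sketch.
interface DocEntry {
  id: string;
  markdown: string;
  generatedAt: Date;
  title?: string;
}

/**
 * Returns a renamed copy of the entry. A blank title removes the field
 * entirely so callers can fall back to the timestamp label.
 */
function applyRename(entry: DocEntry, rawTitle: string): DocEntry {
  const trimmed = rawTitle.trim(); // assumption: whitespace-only counts as empty
  const renamed: DocEntry = { ...entry };
  if (trimmed.length > 0) {
    renamed.title = trimmed;
  } else {
    delete renamed.title;
  }
  return renamed;
}
```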
Make the intro section fill the drawer body via flex layout so the
history list can claim leftover vertical space and scroll on its own
(overflow-y: auto + flex: 1 + min-height: 0). The fixed pieces — icon,
title, description, features list, tip, and action buttons — stay put
no matter how many reports are in history.

Bump the list's outer border from #f0f0f0 to #bfbfbf so it reads as a
defined container rather than fading into the drawer background. Same
gray used by the chevron and rename icons for visual consistency.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Saving an edit now reorders the entry to position 0 in both the
service's canonical history and the panel's local history copy, so
the "Latest" badge follows the most-recently-touched report. Renames
do not reorder since they only change metadata.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add an Angular animations trigger on .doc-panel keyed on the current
view. intro => doc slides the doc view in from the right with a fade;
doc => intro slides the intro in from the left with a fade. 250ms
ease-out — snappy enough not to feel like a delay. Initial drawer
open has no transition for void => * so the inner animation does not
fight the drawer's own slide-in.

The doc view's children now live inside a .doc-view-container div
(replacing the bare ng-container) so the animation has a real host
element to query :enter on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the Copy Markdown button with a Download button that writes
the rendered report's raw markdown as a .md file via file-saver. The
filename is built as "<Workflow Name> - <Report Title>.md", or
"<Workflow Name> - workflow-doc-<ISO timestamp>.md" when the entry
has no custom title; filesystem-illegal characters are sanitized to
underscores and the total length is capped at 160. The copied/check
toggle state is gone since downloads do not need that confirmation
pulse.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
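
A sketch of the filename rule described in the commit above. The set of characters treated as filesystem-illegal and the choice to count the .md extension toward the 160-character cap are assumptions; buildDownloadFilename is a hypothetical name.

```typescript
const MAX_FILENAME_LENGTH = 160;
const EXTENSION = ".md";

/**
 * Builds "<Workflow Name> - <Report Title>.md", falling back to
 * "workflow-doc-<ISO timestamp>" when the entry has no title.
 */
function buildDownloadFilename(workflowName: string, title: string | undefined, generatedAt: Date): string {
  const trimmedTitle = title === undefined ? "" : title.trim();
  const label = trimmedTitle.length > 0 ? trimmedTitle : `workflow-doc-${generatedAt.toISOString()}`;
  // Assumed illegal set: characters that Windows rejects in filenames.
  const sanitized = `${workflowName} - ${label}`.replace(/[\\/:*?"<>|]/g, "_");
  return sanitized.slice(0, MAX_FILENAME_LENGTH - EXTENSION.length) + EXTENSION;
}
```

Note that the ISO timestamp contains colons, so the fallback name is also changed by the sanitizing step.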
@github-actions github-actions Bot added dependencies Pull requests that update a dependency file frontend Changes related to the frontend GUI agent-service labels May 16, 2026
@zyratlo zyratlo changed the title feat: AI workflow documentation tool with history, editing, and diff [Hackathon] feat: AI workflow documentation tool with history, editing, and diff May 16, 2026
@zyratlo zyratlo marked this pull request as ready for review May 16, 2026 05:05

Labels

agent-service dependencies Pull requests that update a dependency file frontend Changes related to the frontend GUI


1 participant