[Hackathon] feat: Workflow performance profiler + agent-driven optimization#5098
Open
PG1204 wants to merge 14 commits into
Demo Video
https://drive.google.com/file/d/1rRaCWynJkJE6WtomWQiceh9KCH0Qc48M/view?usp=drive_link
What changes were proposed in this PR?
A user runs a workflow today and sees a few numeric stat badges on the canvas. They have no visual signal for which operator is slow, why it's slow, or what to do about it. This PR closes that loop end-to-end and then lets the AI agent participate in it.
Before / after
The story
Turn on the profiler (gauge icon in the run bar). The canvas paints itself: the Python UDF that takes most of the wall-clock time turns red, everything else stays green. Hover over the red operator and a tooltip shows its runtime, throughput, and idle ratio. The property panel adds a "Profiler" section listing the fired hints (`RUNTIME_OUTLIER`, `LOW_PARALLELISM_HOT_OP`, …) with plain-English messages.

Hints that map to mechanical fixes also appear as ghost suggestions on the canvas: a "Bump workers" tag floats next to hot single-worker operators, and an "Insert Filter" ghost sits on edges where the rule engine sees an over-producing upstream. Click Apply and the change lands, with a "Run now" prompt so you can verify.
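To make the hint mechanism concrete, here is a minimal sketch of how rules like `RUNTIME_OUTLIER` and `LOW_PARALLELISM_HOT_OP` could be detected from per-operator stats. The `OperatorStats` shape and the 1.5-sigma threshold are illustrative assumptions, not the PR's actual rule engine:

```typescript
interface OperatorStats {
  operatorId: string;
  runtimeMs: number;
  workers: number;
  idleRatio: number; // 0..1, fraction of time the operator sat idle
}

interface Hint {
  operatorId: string;
  kind: "RUNTIME_OUTLIER" | "LOW_PARALLELISM_HOT_OP";
  message: string;
}

function detectHints(stats: OperatorStats[]): Hint[] {
  const hints: Hint[] = [];
  const runtimes = stats.map((s) => s.runtimeMs);
  const mean = runtimes.reduce((a, b) => a + b, 0) / runtimes.length;
  const std = Math.sqrt(
    runtimes.reduce((a, r) => a + (r - mean) ** 2, 0) / runtimes.length
  );

  for (const s of stats) {
    // Flag operators whose runtime sits far above the mean of this run.
    if (s.runtimeMs > mean + 1.5 * std) {
      hints.push({
        operatorId: s.operatorId,
        kind: "RUNTIME_OUTLIER",
        message: `${s.operatorId} takes ${s.runtimeMs} ms, well above the run mean of ${mean.toFixed(0)} ms`,
      });
    }
    // A hot operator stuck on one worker maps to the mechanical "Bump workers" fix.
    if (s.workers === 1 && s.runtimeMs > mean) {
      hints.push({
        operatorId: s.operatorId,
        kind: "LOW_PARALLELISM_HOT_OP",
        message: `${s.operatorId} is hot but runs on a single worker`,
      });
    }
  }
  return hints;
}
```

Hints that fire here would drive both the property-panel list and the ghost suggestions on the canvas.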

Want to compare runs? Download a profiler report and re-upload it later, or open the popover dropdown and pick directly from past executions — the existing delta heatmap and side-panel UI render from either source.
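The delta heatmap behind that comparison can be sketched as a per-operator relative-change computation. The flat `operatorId -> runtimeMs` report shape is an assumption for illustration; the PR's actual report format may carry more fields:

```typescript
type ProfilerReport = Record<string, number>; // operatorId -> runtimeMs

// Relative runtime change per operator: positive means the current run is slower.
function runtimeDelta(
  baseline: ProfilerReport,
  current: ProfilerReport
): Record<string, number> {
  const delta: Record<string, number> = {};
  for (const [op, runtime] of Object.entries(current)) {
    const base = baseline[op];
    if (base === undefined || base === 0) continue; // operator absent from baseline: no delta
    delta[op] = (runtime - base) / base;
  }
  return delta;
}
```

For example, `runtimeDelta({ udf: 1000, scan: 100 }, { udf: 1500, scan: 90 })` yields `{ udf: 0.5, scan: -0.1 }`, which a heatmap can map to red and green respectively. Because the function only needs two reports, it works identically whether the baseline comes from an uploaded JSON file or a past execution picked from the dropdown.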

Open the agent chat and ask "is anything slow?" The agent calls `getProfilerSummary` and `getOptimizationHints`, then surfaces a structured proposal that renders inline as an Apply / Reject card. The agent never mutates the workflow itself; the frontend's Apply button is the only mutation path. Multi-step optimizations come back as a numbered plan card with per-step Apply plus an "Apply All" button.

The canvas ghosts themselves get smarter when the agent is available: clicking "Insert Filter" calls a `proposeFilterPredicate` endpoint that reads the upstream schema and downstream context to fill in real `{attribute, condition, value}` rows instead of the rule-based `is not null` placeholder. Similarly, "Bump workers" calls `proposeWorkerCount` to pick a number based on runtime and idle ratio. Both fall back to the static defaults on any miss, so the feature works with or without the agent running.

On the backend, a new `ProfilerScoring.scala` helper mirrors the frontend's three scoring formulas so any future server-side use (persisted stats, scheduler decisions) stays consistent with the UI. No call sites yet; it is purely future-use infrastructure.

Any related issues, documentation, discussions?
Related to the Apache Texera Agent Hackathon (#5059).
How was this PR tested?
258/258 frontend Vitest tests pass across 12 spec files; 147/147 agent-service Bun tests pass across 10 spec files; both `tsc --noEmit` runs are clean; `ng build` succeeds. The Scala spec for `ProfilerScoring` was not run locally: the amber sbt project hits a pre-existing `AddMetaInfLicenseFiles` not-found plugin error unrelated to this PR, so CI is the canonical validator.
Manual end-to-end: built a CSVScan → Filter → heavy-Python-UDF → Visualize workflow, then:
- confirmed the heatmap reds the UDF and toggled all three views;
- uploaded a JSON report and confirmed the delta heatmap;
- picked a past execution from the new dropdown and got the same result;
- asked the agent "is anything slow?" and confirmed the orange Apply/Reject card lands the change on the canvas;
- asked "what can we do to make this faster?" and confirmed the blue multi-step plan card renders with per-step Apply plus Apply All;
- clicked the Insert-Filter and Bump-Workers ghosts both with and without the agent running, confirming the fallback path.
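The agent-or-fallback behavior exercised in the last step above can be sketched as follows. The endpoint path, request body, and response shape for `proposeWorkerCount` are assumptions; only the fall-back-on-any-miss behavior mirrors the PR description:

```typescript
const DEFAULT_WORKER_BUMP = 2; // hypothetical static default used when the agent is unavailable

async function suggestedWorkerCount(
  operatorId: string,
  runtimeMs: number,
  idleRatio: number,
  agentUrl?: string // undefined when the agent service is not running
): Promise<number> {
  if (agentUrl) {
    try {
      const res = await fetch(`${agentUrl}/proposeWorkerCount`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ operatorId, runtimeMs, idleRatio }),
      });
      if (res.ok) {
        const { workers } = await res.json();
        // Accept only a sane answer; anything else counts as a miss.
        if (Number.isInteger(workers) && workers > 0) return workers;
      }
    } catch {
      // Network error or bad JSON: fall through to the static default.
    }
  }
  return DEFAULT_WORKER_BUMP;
}
```

The key property is that every failure mode (no agent, HTTP error, malformed response) converges on the same static default, so the ghost suggestion always resolves to something applicable.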
Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Opus 4.7)