
[Hackathon] Interactive and Explainable Result Pane#5099

Open
tanishqgandhi1908 wants to merge 1 commit into apache:main from tanishqgandhi1908:feat/interactive-result-pane



@tanishqgandhi1908 tanishqgandhi1908 commented May 16, 2026

Video submission:

https://youtu.be/15F18ZQe-OA

Motivation

Texera's result pane has historically been a static, page-by-page table viewer with a default page size of five rows. Users could glance at operator outputs and search by column name, but they could not interact with the data the way they would in a modern spreadsheet tool — no row-level filtering, no sorting, no full-data search, and no way to see, at a glance, what an operator actually did to its input.

That meant every debugging or exploration session looked roughly the same:

  1. Click an operator.
  2. Click through paginated pages.
  3. Switch to the upstream operator.
  4. Click through its pages.
  5. Mentally diff the two tables.

It worked, but it was slow, fiddly, and easy to get wrong on wide or large tables.

This PR rethinks the result pane around two ideas:

  1. Treat the result pane like a spreadsheet, not a static table. Sort, filter, search, reorder columns, hide columns, pin columns — all without leaving the operator. Make it work at scale by pushing the heavy lifting down to Iceberg.
  2. Make every operator self-explanatory. Right above the data, show the user what changed compared to the upstream operator: row delta, column delta, schema diff. So instead of mentally diffing two tables, you see the diff inline.

Interactive result pane with pagination

Screenshot 2026-05-16 at 11 51 41 AM

Search rows

Screenshot 2026-05-16 at 11 52 13 AM

Search with text in columns

Screenshot 2026-05-16 at 11 53 00 AM

What's changed since the last operator

Screenshot 2026-05-16 at 11 55 17 AM

What changed (story version)

Phase 1 — From nz-table to ag-grid Community

The old nz-table view rendered every column to the DOM and capped at five rows per page. That worked for toy data but felt cramped, didn't sort or filter, and couldn't survive a 200-column table.

We swapped it for ag-grid Community (MIT-licensed, Apache-compatible) using the Infinite Row Model wired into Texera's existing WebSocket pagination protocol via a custom IDatasource. Out of the box, the user now gets:

  • Sort + per-column filter menus
  • Column reorder via drag
  • Column hide/show via a toggle dropdown
  • Column pin (left/right) via header context menu
  • DOM column virtualization so 200-column tables render smoothly
  • Pagination with auto-fit page size — resize the dock and the page size adjusts to the visible space

The grid is themed against Texera's existing Ant Design palette (no garish ag-grid defaults), and the per-column stats (Min / Max / Non-Null / category %) that lived in the old header are restored via a custom header component — same data, better layout.
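
As a rough illustration of the custom IDatasource described above, the sketch below maps the grid's row-range requests onto page requests. It is not the PR's code: `GetRowsParams` and `Datasource` are local stand-ins that mirror only the parts of ag-grid's `IGetRowsParams` / `IDatasource` used here, and `FetchPage` and `createPagedDatasource` are hypothetical names for whatever wraps Texera's WebSocket pagination.

```typescript
// Local stand-ins for the subset of ag-grid's datasource contract used below.
interface GetRowsParams {
  startRow: number; // first row index requested (inclusive)
  endRow: number;   // last row index requested (exclusive)
  successCallback(rows: unknown[], lastRow?: number): void;
  failCallback(): void;
}

interface Datasource {
  getRows(params: GetRowsParams): void;
}

// Hypothetical page fetcher; in the PR this role is played by the
// WebSocket pagination protocol.
type FetchPage = (
  pageIndex: number,
  pageSize: number
) => Promise<{ rows: unknown[]; totalRows: number }>;

function createPagedDatasource(fetchPage: FetchPage, pageSize: number): Datasource {
  return {
    getRows(params) {
      // Assumes the grid's cache block size equals pageSize, so startRow
      // always falls on a page boundary.
      const pageIndex = Math.floor(params.startRow / pageSize);
      fetchPage(pageIndex, pageSize)
        .then(({ rows, totalRows }) => params.successCallback(rows, totalRows))
        .catch(() => params.failCallback());
    },
  };
}
```

Passing the total row count as the second argument to `successCallback` lets the grid know where the data ends, which is what allows it to size its pager without loading every row.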

Phase 2 — Backend pushdown

Spreadsheet UX is only useful if it scales. Texera stores operator results as Iceberg / Parquet, which can prune entire data files by partition + min/max stats and push predicates into the Parquet reader. We extended the protocol and the storage layer to take advantage of that:

  • ResultPaginationRequest now carries optional filters, sorts, and rowSearch fields.
  • VirtualDocument gains getRangeWithQuery + countWithQuery (defaulted to safe fallbacks so non-Iceberg documents keep working).
  • A new IcebergPredicateBuilder translates the wire-format ColumnFilter into Iceberg Expressions with type-aware value parsing per column type (no silent string-coercion bugs).
  • IcebergDocument implements both methods: predicate pushdown for ops Iceberg supports natively, residual evaluation in memory for contains / endsWith / rowSearch, and an in-memory sort capped at storage.result.sort.max-rows (default 100k).
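
The split between pushed-down and residual filters, and the type-aware value parsing, can be sketched as follows. The actual backend is not shown in this description, so this is a TypeScript illustration of the logic only: the `ColumnFilter` shape, the `ColumnType` names, and the helpers `splitFilters` and `parseValue` are assumptions, while the operator lists follow the commit message.

```typescript
type FilterOp =
  | "eq" | "ne" | "lt" | "le" | "gt" | "ge"
  | "startsWith" | "isNull" | "isNotNull" | "in"
  | "contains" | "endsWith";

// Hypothetical wire shape; the real ColumnFilter may carry different fields.
interface ColumnFilter {
  column: string;
  op: FilterOp;
  value?: string;
}

// Hypothetical simplified set of column types.
type ColumnType = "string" | "integer" | "double" | "boolean";

// Operators the description says are evaluated in memory after the scan.
const RESIDUAL_OPS: FilterOp[] = ["contains", "endsWith"];

function splitFilters(filters: ColumnFilter[]): {
  pushed: ColumnFilter[];
  residual: ColumnFilter[];
} {
  const pushed: ColumnFilter[] = [];
  const residual: ColumnFilter[] = [];
  for (const f of filters) {
    (RESIDUAL_OPS.indexOf(f.op) !== -1 ? residual : pushed).push(f);
  }
  return { pushed, residual };
}

// Parse a raw filter value according to the column type and fail loudly
// instead of silently comparing a number column against a string.
function parseValue(raw: string, type: ColumnType): string | number | boolean {
  const text = raw.trim();
  switch (type) {
    case "string":
      return raw;
    case "integer":
      if (!/^[+-]?\d+$/.test(text)) throw new Error(`not an integer: ${raw}`);
      return Number(text);
    case "double": {
      const n = Number(text);
      if (text === "" || !Number.isFinite(n)) throw new Error(`not a number: ${raw}`);
      return n;
    }
    case "boolean":
      if (text !== "true" && text !== "false") throw new Error(`not a boolean: ${raw}`);
      return text === "true";
  }
}
```

Rejecting an unparseable value up front is one way to get the "no silent string-coercion" behaviour: the caller sees an error rather than a filter that quietly matches nothing.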

When sort is requested but the matched count exceeds the cap, the backend returns rows in scan order with a sortSkipped flag, and the UI shows a friendly banner explaining how to narrow the filter. (Iceberg cannot push ORDER BY into the Parquet reader — sort is the one place we have to spend JVM heap.)
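
The capped-sort rule above reduces to a small decision, sketched here in TypeScript for illustration. `sortWithCap` and `SortedPage` are hypothetical names; only the `sortSkipped` flag and the idea of a configurable row cap come from the description.

```typescript
interface SortedPage<T> {
  rows: T[];
  sortSkipped: boolean;
}

// Sort in memory only when the matched set fits under the cap; otherwise
// return the rows in scan order and flag that the sort was skipped.
function sortWithCap<T>(
  matched: T[],
  compare: (a: T, b: T) => number,
  maxRows: number
): SortedPage<T> {
  if (matched.length > maxRows) {
    return { rows: matched, sortSkipped: true };
  }
  // Copy before sorting so the caller's scan-order array is left untouched.
  return { rows: matched.slice().sort(compare), sortSkipped: false };
}
```

With a cap of 100,000 rows, a filter that matches 5,000 rows is sorted normally, while one that matches 2 million rows comes back unsorted with `sortSkipped` set, which is the signal for the banner.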

Phase 3 — Full-data row search

A debounced Search rows... input above the grid sends a rowSearch string down to the backend, which compiles it into a multi-column contains predicate over all string columns. This is the first real "search inside the data" experience in the result pane — the existing column-name search continues to work alongside it.
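
A minimal sketch of the multi-column contains predicate, written in TypeScript for illustration. The helper name `matchesRowSearch` is hypothetical, and treating the match as case-insensitive and an empty query as "match everything" are assumptions; the description does not say how the backend handles either.

```typescript
type Row = Record<string, unknown>;

// True when any of the given string columns contains the search text.
function matchesRowSearch(row: Row, stringColumns: string[], query: string): boolean {
  const needle = query.trim().toLowerCase();
  if (needle === "") return true; // assumption: empty search matches every row
  return stringColumns.some((col) => {
    const value = row[col];
    // Non-string and null values never match.
    return typeof value === "string" && value.toLowerCase().indexOf(needle) !== -1;
  });
}
```

For a row `{ name: "Ada Lovelace", city: "London", age: 36 }` searched over the `name` and `city` columns, the query `love` matches, while `36` does not, because `age` is not a string column.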

Phase 4 — The transformation diff

This is the most ambitious idea: every operator, at a glance, tells you what it did.

A compact strip above the grid renders:

  • Left pill: upstream operator name with its row count and column count (taken from the frontend's per-operator cache — no extra backend calls).
  • Middle: row delta (e.g. ↓ -149 rows (-99.3%), color-coded green/red/neutral) and column delta (e.g. +2 -1 ⇄1 cols or 5 cols · unchanged).
  • Right pill: current operator.

Click the strip and it expands inline (no popup) into a detail drawer with:

  • A two-row Before / After bar visualisation of row counts (scaled relative to the larger side, with the actual numbers right-aligned for clarity).
  • Coloured tag lists for Removed, Added, Type-changed, and Kept columns.

For source operators with no input, the strip shows a friendly ▶ Source operator chip. For multi-input operators (joins, unions), it collapses to ⛙ Combined from N inputs and defers the pairwise diff for a future iteration. All of this is computed from the data the frontend already maintains in WorkflowResultService, with zero new backend round trips.
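
The single-input diff can be computed from two small summaries, as in the TypeScript sketch below. `OperatorSummary`, `TransformationDiff`, and `diffOperators` are hypothetical names, and treating the four column categories as disjoint (a type-changed column is not also counted as kept) is an assumption.

```typescript
interface Column {
  name: string;
  type: string;
}

// Hypothetical per-operator summary, standing in for the frontend's cache.
interface OperatorSummary {
  rowCount: number;
  columns: Column[];
}

interface TransformationDiff {
  rowDelta: number;
  rowDeltaPercent: number | null; // null when the upstream has zero rows
  added: string[];
  removed: string[];
  typeChanged: string[];
  kept: string[];
}

function diffOperators(upstream: OperatorSummary, current: OperatorSummary): TransformationDiff {
  const upstreamTypes = new Map<string, string>();
  for (const c of upstream.columns) upstreamTypes.set(c.name, c.type);
  const currentNames = new Map<string, string>();
  for (const c of current.columns) currentNames.set(c.name, c.type);

  const added: string[] = [];
  const typeChanged: string[] = [];
  const kept: string[] = [];
  for (const c of current.columns) {
    const before = upstreamTypes.get(c.name);
    if (before === undefined) added.push(c.name);
    else if (before !== c.type) typeChanged.push(c.name);
    else kept.push(c.name);
  }
  const removed = upstream.columns
    .filter((c) => !currentNames.has(c.name))
    .map((c) => c.name);

  const rowDelta = current.rowCount - upstream.rowCount;
  // Percentage relative to the upstream row count, rounded to one decimal.
  const rowDeltaPercent =
    upstream.rowCount === 0 ? null : Math.round((rowDelta / upstream.rowCount) * 1000) / 10;

  return { rowDelta, rowDeltaPercent, added, removed, typeChanged, kept };
}
```

An upstream of 150 rows reduced to 1 row gives a row delta of -149 and -99.3%, matching the example shown in the strip.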

Layout — bottom dock instead of floating modal

The result panel itself was a draggable floating popup. We turned it into a fixed bottom dock: full viewport width, top-edge resize handle for height, no drag-to-move, no "return to corner" widget. Clicking a row no longer opens a modal — instead an inline row inspector slides in below the grid with a JSON tree view, prev/next/close, and visual selection on the corresponding row in the grid.

Architectural notes

  • Frontend memory is bounded regardless of dataset size — ag-grid's row + column virtualization keeps DOM at ~20–30 row nodes; the page cache evicts LRU at ~2 000 rows.
  • The frontend page cache is populated on response for the unfiltered fast path, so paging back and forth costs zero WS round-trips after the first visit.
  • Wire format stays backward-compatible: columnOffset / columnLimit / columnSearch are kept on ResultPaginationRequest with their defaults for the Python SDK and any external callers. The new frontend simply stops setting them; the bare-minimum payload also avoids a Jackson edge case where JS Number.MAX_SAFE_INTEGER overflows Scala's Int.
  • Filter / sort / rowSearch fields are elided from the wire when empty, so the no-query path is byte-identical to the pre-PR shape.
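
One way to bound the frontend page cache by row count with LRU eviction is sketched below. This is an illustration rather than the PR's implementation: `PageCache` and its methods are hypothetical names, and it relies on a JavaScript `Map` iterating keys in insertion order.

```typescript
class PageCache<T> {
  // Insertion order doubles as recency order: oldest entry first.
  private pages = new Map<number, T[]>();
  private rowCount = 0;

  constructor(private readonly maxRows: number) {}

  get(pageIndex: number): T[] | undefined {
    const rows = this.pages.get(pageIndex);
    if (rows === undefined) return undefined;
    // Re-insert to mark this page as most recently used.
    this.pages.delete(pageIndex);
    this.pages.set(pageIndex, rows);
    return rows;
  }

  put(pageIndex: number, rows: T[]): void {
    const existing = this.pages.get(pageIndex);
    if (existing !== undefined) {
      this.rowCount -= existing.length;
      this.pages.delete(pageIndex);
    }
    this.pages.set(pageIndex, rows);
    this.rowCount += rows.length;
    // Evict least recently used pages until within budget, but always
    // keep the page that was just inserted.
    while (this.rowCount > this.maxRows && this.pages.size > 1) {
      const oldestKey = this.pages.keys().next().value as number;
      this.rowCount -= (this.pages.get(oldestKey) as T[]).length;
      this.pages.delete(oldestKey);
    }
  }

  get cachedRows(): number {
    return this.rowCount;
  }
}
```

Constructed with a budget of about 2,000 rows, such a cache keeps recently visited pages available for back-and-forth paging while older pages are dropped first.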

🤖 Generated with Claude Code

…rmation diff

Upgrade the operator result pane from a static 5-row nz-table into an
interactive, full-dataset spreadsheet view with row-level filtering, sorting,
search, and a per-operator transformation summary.

Frontend
- Replace nz-table with ag-grid Community (MIT) using the Infinite Row Model;
  pagination with auto-fit page size; column reorder/hide/pin/resize.
- Custom header component restores inline column stats (Min / Max / Non-Null /
  category %).
- Per-cell renderer keeps image preview + hover-only download icon.
- Row inspector docks inline below the grid (replaces the prior popup modal):
  shows the row as a JSON tree with prev/next/close.
- Row search input above the grid, debounced 250 ms.
- Transformation diff strip above the grid: upstream-vs-current row delta and
  column diff (added / removed / kept / type-changed) with an expandable
  detail drawer.
- Result panel itself is now a fixed bottom dock (top-edge resize only) —
  no more floating popup.

Backend
- Extend ResultPaginationRequest with optional filters / sorts / rowSearch.
- New VirtualDocument query methods: getRangeWithQuery, countWithQuery
  (defaulted to no-op fallback).
- IcebergPredicateBuilder compiles ColumnFilter into Iceberg Expressions with
  type-aware value parsing (eq/ne/lt/le/gt/ge/startsWith/isNull/isNotNull/in
  pushed down; contains/endsWith handled as residual).
- IcebergDocument.getRangeWithQuery / countWithQuery: pushdown + residual
  filter + rowSearch + in-memory sort with a configurable cap
  (storage.result.sort.max-rows, default 100k); responses include a
  sortSkipped flag so the UI can prompt the user to narrow the filter.
- PaginatedResultEvent carries totalNumTuples and sortSkipped for the
  filtered/sorted path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions (bot) added the feature, engine, dependencies, frontend, and common labels on May 16, 2026