[Hackathon] feat: Workflow Hub — Community Gallery for Sharing and Forking Workflows#5113

Open
EmilySun621 wants to merge 4 commits into apache:main from EmilySun621:hackathon/workflow-hub

@EmilySun621

Discover, fork, and share data science workflows. A community-powered gallery where researchers publish their workflows and others can learn from, star, and fork them — like GitHub for data science pipelines.

The Problem

A new student joins Dr. Sarah's lab. She needs to build a diabetes prediction pipeline but has no idea where to start. There's no way to discover and reuse other people's workflows in Texera today.

The Solution

Workflow Hub — a browsable, searchable gallery of community workflows. Star your favorites, fork them into your workspace, publish your own.

What's New

Hub List Page (Sidebar → Hub → Workflow Hub)

  • Full-text search by name, tag, or description
  • Category filters: Biomedical, NLP, CV, Finance, EDA, Education, Tabular
  • Sort by: Trending, Most Stars, Most Forks, Recent
  • Featured workflows highlighted in a 3-column grid with DAG preview
  • Workflows generated by custom agents show an agent badge
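The search, category, and sort behavior listed above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `HubEntry` shape, its field names, and `filterAndSort` are all assumptions. "Trending" is left out because the description does not say how it is scored.

```typescript
// Hypothetical hub entry shape; field names are assumptions based on the
// fields this PR description mentions (title, description, category, tags, counts).
interface HubEntry {
  title: string;
  description: string;
  category: string;
  tags: string[];
  stars: number;
  forks: number;
  publishedAt: string; // ISO date string
}

type SortKey = "stars" | "forks" | "recent";

// Case-insensitive substring search over title, description, and tags,
// combined with an optional category filter and a descending sort.
function filterAndSort(
  entries: HubEntry[],
  query: string,
  category: string | null,
  sortBy: SortKey
): HubEntry[] {
  const q = query.trim().toLowerCase();
  const matches = entries.filter(entry => {
    if (category !== null && entry.category !== category) return false;
    if (q === "") return true;
    const haystack = [entry.title, entry.description, ...entry.tags]
      .join(" ")
      .toLowerCase();
    return haystack.includes(q);
  });

  const comparators: Record<SortKey, (a: HubEntry, b: HubEntry) => number> = {
    stars: (a, b) => b.stars - a.stars,
    forks: (a, b) => b.forks - a.forks,
    recent: (a, b) => Date.parse(b.publishedAt) - Date.parse(a.publishedAt),
  };
  // filter() already returned a fresh array, so sorting it does not
  // mutate the caller's list.
  return matches.sort(comparators[sortBy]);
}
```

For example, `filterAndSort(entries, "", "Biomedical", "stars")` would return only the Biomedical entries, highest star count first.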

Workflow Detail Page

  • Author avatar, name, publish date
  • Full description
  • DAG preview showing operators as connected boxes
  • Tags
  • Fork button, Star toggle, stats panel (stars, forks, views, operators)
  • Agent config card with "Import Agent Config" button (when applicable)

Fork — One-Click Reuse

Click Fork → new workflow created as "[Fork] Original Title" → opens in workspace with all operators pre-configured → user adds their own data and runs. Fork creates a real workflow in Texera's backend, not a localStorage copy.

Star — Save Favorites

Toggle star on any workflow. The star count updates immediately, and the starred state is kept in localStorage, so it persists across sessions in the same browser.
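A star toggle backed by localStorage might look like the sketch below. It is an illustration under assumptions, not the service code in this PR: the storage key, the `KeyValueStore` interface, and `toggleStar` are hypothetical names. The store is injected so the logic can run outside a browser; in the app, `window.localStorage` satisfies the same interface.

```typescript
// Minimal subset of the Web Storage API that the sketch needs.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in for localStorage, useful in tests.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

// Hypothetical key name; the real key used by the service is not shown in this PR text.
const STARRED_KEY = "texera.workflowHub.starred";

// Read the set of starred entry IDs, tolerating missing or corrupt data.
function loadStarred(store: KeyValueStore): Set<string> {
  const raw = store.getItem(STARRED_KEY);
  if (raw === null) return new Set<string>();
  try {
    const parsed: unknown = JSON.parse(raw);
    if (!Array.isArray(parsed)) return new Set<string>();
    return new Set<string>(parsed.filter((x): x is string => typeof x === "string"));
  } catch {
    return new Set<string>();
  }
}

// Flip the starred state for one entry and return the count to display.
// baseStars is the entry's count excluding the current user's star.
function toggleStar(
  store: KeyValueStore,
  entryId: string,
  baseStars: number
): { starred: boolean; stars: number } {
  const starred = loadStarred(store);
  if (starred.has(entryId)) {
    starred.delete(entryId);
  } else {
    starred.add(entryId);
  }
  store.setItem(STARRED_KEY, JSON.stringify(Array.from(starred)));
  const isStarred = starred.has(entryId);
  return { starred: isStarred, stars: baseStars + (isStarred ? 1 : 0) };
}
```

Keeping only the set of starred IDs, rather than a mutable counter, means that toggling twice always returns to the original count.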

Publish — Share Your Work

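Click "Publish Workflow" on the Hub page → select one of your saved workflows → add title, description, category, tags → operators auto-extracted from workflow content → published workflow appears in the Hub. Per the commit notes, the hub entry is written to localStorage, so in this version it is visible only in the browser that published it.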
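One way to derive an operator chain from workflow content is a topological walk over the operators and links. The sketch below assumes a simplified content shape. The `WorkflowContentLike` type and its field names are assumptions for illustration, and the PR's actual extraction logic may differ.

```typescript
// Simplified, hypothetical view of a workflow's content.
interface OperatorLike {
  operatorID: string;
  operatorType: string;
}
interface LinkLike {
  source: { operatorID: string };
  target: { operatorID: string };
}
interface WorkflowContentLike {
  operators: OperatorLike[];
  links: LinkLike[];
}

// Return operator types in upstream-to-downstream order (Kahn's algorithm).
// Operators that sit on a cycle are omitted; links to unknown operators are skipped.
function extractOperatorChain(content: WorkflowContentLike): string[] {
  const indegree = new Map<string, number>();
  const downstream = new Map<string, string[]>();
  const typeOf = new Map<string, string>();
  for (const op of content.operators) {
    indegree.set(op.operatorID, 0);
    downstream.set(op.operatorID, []);
    typeOf.set(op.operatorID, op.operatorType);
  }
  for (const link of content.links) {
    const from = link.source.operatorID;
    const to = link.target.operatorID;
    if (!indegree.has(from) || !indegree.has(to)) continue;
    downstream.get(from)!.push(to);
    indegree.set(to, indegree.get(to)! + 1);
  }

  // Start from operators with no incoming links (the sources).
  const queue = content.operators
    .filter(op => indegree.get(op.operatorID) === 0)
    .map(op => op.operatorID);
  const chain: string[] = [];
  while (queue.length > 0) {
    const id = queue.shift()!;
    chain.push(typeOf.get(id)!);
    for (const next of downstream.get(id)!) {
      const remaining = indegree.get(next)! - 1;
      indegree.set(next, remaining);
      if (remaining === 0) queue.push(next);
    }
  }
  return chain;
}
```

The result is a flat list of operator types that a hub entry could store for its preview, independent of the order in which operators were saved.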

15 Seed Workflows — Never Empty

The Hub ships with curated community workflows:

Biomedical: Diabetes Prediction (CRISP-DM), Heart Disease, Breast Cancer, COVID-19 Clinical Trials
NLP: Sentiment Analysis, News Topic Classification
Finance: Credit Card Fraud, Stock Price Regression
CV: MNIST Digits
EDA: Movie Recommendation EDA, Air Quality
Education: UCI Iris Beginner, Titanic Survival, Wine Quality
Tabular: Census Income Prediction

Each seed includes title, author, description, category, tags, operator list, and star/fork/view counts.

Demo Flow

  1. Sidebar → "Workflow Hub" → 15 community workflows
  2. Filter "Biomedical" → see diabetes, heart disease, cancer workflows
  3. Click "Diabetes Prediction (CRISP-DM)"
  4. See description, DAG preview, 142 stars, 38 forks
  5. Click "Fork to My Workflows"
  6. "[Fork] Diabetes Prediction (CRISP-DM)" opens in workspace with operators
  7. Configure CSV source with your own data → Run
  8. Click "Publish Workflow" → share your own workflow back to the Hub

[Screenshots: four images captured 2026-05-16, 12:15 to 12:16 PM]

Files Changed

New files:

  • dashboard/component/user/workflow-hub/workflow-hub-list/ — Hub list page with search, sort, categories, cards
  • dashboard/component/user/workflow-hub/workflow-hub-detail/ — Detail page with DAG preview, fork, star, stats
  • dashboard/component/user/workflow-hub/workflow-hub-publish-dialog/ — Publish modal
  • dashboard/service/workflow-hub/workflow-hub.service.ts — Service with seed data, localStorage CRUD, star/fork logic

Modified files (additive only):

  • app-routing.module.ts — 2 new routes
  • app-routing.constant.ts — route constants
  • dashboard.component.html/ts — "Workflow Hub" sidebar link

Testing

  • Angular typecheck: clean
  • Seed data renders on first visit
  • Search filters by name/tag/description
  • Category chips filter correctly
  • Sort works (trending/stars/forks/recent)
  • Star toggle persists
  • Fork creates real workflow in backend
  • Forked workflow opens in editor with operators
  • Publish extracts operators from workflow content

Emily Sun and others added 4 commits May 15, 2026 21:55
This bundles the feature work that built up on this branch:

- Custom agents: dashboard CRUD page and editor dialog (48px icon tile,
  chip-style guardrails, model selector). Each custom agent now carries a
  LiteLLM model_name (Opus 4.7 / Haiku 4.5) that is passed through to the
  agent-service so different agents can use different models.

- Conversation history is scoped per (workflowId, agentId): switching
  agent or workflow yields a different conversation list. localStorage
  key: texera.workflowConversations.v1.{workflowId}.{agentId}.

- Time machine: workflow snapshot list, revert, and agent-tagged
  checkpoints. New workflow-history-tool in agent-service backs the
  "undo my last change" flow; amber gains a WorkflowSnapshotResource;
  sql/updates/23.sql adds the snapshot table.

- Operator-aware custom-agent prompts: the system prompt now injects the
  full operator catalog with a "prefer built-in operators over Python
  UDFs" rule, sourced from WorkflowSystemMetadata at request time.

- LiteLLM: added the claude-opus-4.7 entry alongside claude-haiku-4.5
  and gpt-5-mini in bin/litellm-config.yaml.

- Agent panel rewritten around the (conversation list / chat) two-view
  model with subscription-managed list reloads and per-step persistence.
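
The per-(workflowId, agentId) scoping of conversation history can be illustrated with a small key builder. Only the key format comes from the note above; the function name and the parameter types are assumptions.

```typescript
// Prefix taken from the key format described above.
const CONVERSATION_KEY_PREFIX = "texera.workflowConversations.v1";

// Build the storage key for one (workflow, agent) pair.
// The ID types are assumptions; the commit note does not state them.
function conversationKey(workflowId: number, agentId: string): string {
  return `${CONVERSATION_KEY_PREFIX}.${workflowId}.${agentId}`;
}
```

Because both IDs are part of the key, switching either the workflow or the agent reads a different localStorage entry, which is what yields a separate conversation list.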

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds a community gallery under /dashboard/hub/workflow-hub where users
browse, star, fork, and publish data science workflows. Backed by 15
seed entries and localStorage for stars/forks/views/publishes so the
page is never empty.

- List page: search, sort (trending/stars/forks/recent), category
  chips, featured grid, DAG-chain preview cards, agent badges.
- Detail page: SVG DAG preview, stats panel, fork-to-my-workflows
  (uses WorkflowPersistService.duplicateWorkflow when a backend wid is
  attached, otherwise falls back to a local stub), star toggle, and an
  optional 'Agent Included' card.
- Publish dialog: pulls the user's workflows via the persist service,
  derives operator chain from workflow content, writes a hub entry to
  localStorage.
- Sidebar: 'Workflow Hub' link added to the Hub submenu.

Seed entries don't have a workflowId, so the previous code only
incremented a localStorage counter and navigated to /dashboard/user/workflow
without actually writing to the backend — the forked workflow never showed
up in the Workflows page. Now the seed path calls
WorkflowPersistService.createWorkflow with empty content named
"[Fork] <title>", waits for the backend to return the new wid, and routes
straight into the new workflow's workspace. The duplicate-workflow path
for real-wid entries is unchanged.

The previous fix used createWorkflow with empty content, so forking a seed
entry produced a workflow with the right name but zero operators — the
"Executions doesn't exist" 403 the user saw was just the workspace trying
to load nonexistent executions for an empty workflow.

Now seed entries carry a sampleOperators field listing REAL Texera operator
types from the running backend's metadata (verified against the 163
operators the deployed build exposes). When the user forks:

1. Wait for OperatorMetadataService to publish the schema list.
2. For each known sampleOperators type, build a proper OperatorPredicate via
   WorkflowUtilService.getNewOperatorPredicate (which fills in ports, default
   properties, and the correct operatorVersion).
3. Connect consecutive operators by their first output→input ports.
4. Lay them out in a horizontal chain (200px apart).
5. POST to /workflow/create with the populated WorkflowContent and navigate
   to the new wid.

Any sampleOperators not present in the running build land in a single
comment box at the top of the canvas so the user can see what was intended.

For real (published) hub entries with a workflowId, the path is still
WorkflowPersistService.duplicateWorkflow — unchanged.
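
The chain-building steps above (keep known operator types, link consecutive operators, space them 200px apart, set unknown types aside) can be sketched as below. This is a simplified stand-in: the real path builds each operator via WorkflowUtilService.getNewOperatorPredicate, whereas this sketch uses hypothetical types, a made-up ID scheme, and assumed port IDs (`output-0`, `input-0`).

```typescript
interface Point { x: number; y: number; }
interface ChainOperator { operatorID: string; operatorType: string; }
interface ChainPort { operatorID: string; portID: string; }
interface ChainLink { linkID: string; source: ChainPort; target: ChainPort; }
interface ChainContent {
  operators: ChainOperator[];
  operatorPositions: Record<string, Point>;
  links: ChainLink[];
  unknownTypes: string[]; // the PR surfaces these in a comment box on the canvas
}

// Build a left-to-right chain from a list of operator types.
// knownTypes stands in for the schema list published by the metadata service.
function buildOperatorChain(
  sampleOperators: string[],
  knownTypes: ReadonlySet<string>,
  spacingPx: number = 200
): ChainContent {
  const operators: ChainOperator[] = [];
  const operatorPositions: Record<string, Point> = {};
  const links: ChainLink[] = [];
  const unknownTypes: string[] = [];

  for (const operatorType of sampleOperators) {
    if (!knownTypes.has(operatorType)) {
      unknownTypes.push(operatorType);
      continue;
    }
    // Hypothetical ID scheme; the real predicate builder assigns its own IDs.
    const operatorID = `${operatorType}-operator-${operators.length}`;
    operatorPositions[operatorID] = { x: operators.length * spacingPx, y: 0 };

    // Connect the previous operator's first output to this operator's first input.
    if (operators.length > 0) {
      const previous = operators[operators.length - 1];
      links.push({
        linkID: `link-${links.length}`,
        source: { operatorID: previous.operatorID, portID: "output-0" },
        target: { operatorID, portID: "input-0" },
      });
    }
    operators.push({ operatorID, operatorType });
  }
  return { operators, operatorPositions, links, unknownTypes };
}
```

Skipping unknown types before linking keeps the chain connected: the operator after a skipped type is linked to the last operator that was actually created.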

github-actions bot added the engine, ddl-change, frontend, dev, common, and agent-service labels on May 16, 2026