feat: GitHub issue intelligence pipeline and dashboard #11
Draft
sophiecarreras wants to merge 2 commits into
Adds the demand-side-ai issue intelligence app on top of the starter kit:
- Backend pipeline: GitHub fetch -> OpenAI embeddings -> Anthropic classification -> HDBSCAN clustering -> SnapshotReport in B2
- Frontend: intelligence dashboard, snapshot list, issue list, cluster drill-down
- CLI: intel:ingest, intel:reprocess, intel:list, intel:show
- Docs: intelligence.md, intelligence-dashboard.md, storage-layout.md, RUNBOOK.md, exec plan

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…M JSON parsing
- Replace OpenAI embeddings with sentence-transformers (all-MiniLM-L6-v2): eliminates the OpenAI dependency; embeddings run locally at zero API cost
- Fix config env_file path: use Path(__file__).parents[4] so .env is found correctly regardless of the working directory when the CLI runs
- Add extra="ignore" to Settings to tolerate intelligence-only keys
- Remove model_dump_json(default=str): Pydantic v2 handles datetime natively
- Update default LLM model to claude-haiku-4-5-20251001
- Extract JSON from LLM responses before parsing to handle markdown code fences that the new model wraps around structured output

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
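The last item in the commit above (extracting JSON from a fenced LLM reply before parsing) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `extract_json` and the brace-scanning fallback are assumptions made for the example.

```python
import json
import re
from typing import Any

# Build the fence marker programmatically so this example stays readable
# inside a fenced block of its own.
FENCE = "`" * 3

# Matches an opening fence (optionally tagged "json"), captures the body,
# and stops at the next closing fence.
_FENCE_RE = re.compile(
    FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE
)


def extract_json(text: str) -> Any:
    """Parse JSON from an LLM reply that may be wrapped in a markdown fence.

    Hypothetical helper for illustration only.
    """
    match = _FENCE_RE.search(text)
    candidate = (match.group(1) if match else text).strip()
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        # Assumed fallback: try the outermost {...} span in case the model
        # surrounded the object with prose.
        start, end = candidate.find("{"), candidate.rfind("}")
        if start == -1 or end <= start:
            raise
        return json.loads(candidate[start : end + 1])


if __name__ == "__main__":
    reply = "Here you go:\n" + FENCE + 'json\n{"category": "bug"}\n' + FENCE
    print(extract_json(reply))
```

Stripping the fence first and only then falling back to a brace scan keeps the common case strict, so malformed output still raises `json.JSONDecodeError` instead of being silently accepted.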
Implements backblaze-labs/demand-side-ai#176
What
Transforms the starter kit into a complete B2 Issue Intelligence sample app: a pipeline ingests GitHub issues, embeds and clusters them, classifies each with an LLM, and surfaces a dashboard showing backlog themes, category distribution, activity over time, and spec quality — all backed by Backblaze B2.
Why
Demonstrates B2 as a multi-role data store in a real AI/data pipeline: raw data lake (issue snapshots), derived-artifact store (embeddings, classifications, clusters), historical archive (append-only runs), and dashboard backend (report payloads). This is more compelling than a file-upload demo because artifacts accumulate naturally over time and the intelligence has real operational value.
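To make the "append-only runs" idea concrete, here is a small sketch of how per-run object keys might be derived so that each snapshot writes under a fresh prefix and never overwrites an earlier one. Every key name, the prefix scheme, and the file formats below are hypothetical; the actual layout is the one described in docs/features/storage-layout.md.

```python
from datetime import datetime, timezone


def snapshot_prefix(repo: str, started_at: datetime) -> str:
    """Return a per-run prefix (hypothetical scheme).

    The UTC timestamp sorts lexicographically in chronological order,
    so listing the bucket yields runs oldest to newest.
    """
    stamp = started_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"snapshots/{repo}/{stamp}"


def artifact_keys(prefix: str) -> dict[str, str]:
    """Map each role named in the PR description to an assumed object key."""
    return {
        "raw_issues": f"{prefix}/raw/issues.json",                        # raw data lake
        "embeddings": f"{prefix}/derived/embeddings.parquet",             # derived artifact
        "classifications": f"{prefix}/derived/classifications.json",      # derived artifact
        "clusters": f"{prefix}/derived/clusters.json",                    # derived artifact
        "report": f"{prefix}/report.json",                                # dashboard payload
    }


if __name__ == "__main__":
    run = snapshot_prefix(
        "example-org/example-repo",
        datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
    )
    for role, key in artifact_keys(run).items():
        print(role, "->", key)
```

Because a new run only ever adds keys under a new prefix, the historical archive falls out of the layout itself rather than needing a separate copy step.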
Changes
Backend: intelligence pipeline
- services/api/app/types/: Issue, ClassificationResult, Cluster, SnapshotReport Pydantic models
- services/api/app/config/intelligence.py: pipeline settings (repo, model, cost rates)
- services/api/app/repo/github_issues.py: GitHub REST adapter with pagination and rate-limit backoff
- services/api/app/repo/embedding_client.py: local embeddings via sentence-transformers/all-MiniLM-L6-v2 (no API key needed)
- services/api/app/repo/llm_client.py: generic call_llm(system, user) wrapping Anthropic with retry
- services/api/app/repo/intelligence_storage.py: all B2 read/write for raw + derived snapshot artifacts
- services/api/app/service/: pipeline stages (ingestion, embeddings, classification, clustering, analysis, snapshots)
- services/api/app/service/prompts/: versioned LLM prompt templates for classification and cluster labeling
- services/api/app/runtime/routes_intelligence.py: REST API (POST /snapshots, GET /snapshots, reports, issues, clusters, activity)
- services/api/app/runtime/cli_intelligence.py: CLI entry points for ingest, reprocess, list, show
- services/api/main.py: registers intelligence router
- services/api/requirements.txt: adds anthropic, sentence-transformers, scikit-learn, numpy, pyarrow
Frontend: intelligence dashboard
- apps/web/src/components/intelligence/: 10 components (ClusterGrid, ClusterCard, CategoryBreakdown, ActivityTimeline, B2RoleDistribution, SpecDepthHistogram, IssueRow, IssueDetailPanel, SnapshotPicker, RunSnapshotButton)
- apps/web/src/app/intelligence/: routes for overview, snapshot list, snapshot detail, issue list, cluster drill-down
- apps/web/src/app/page.tsx: root redirects to intelligence overview (per AGENTS.md §2)
- apps/web/src/lib/api-client.ts: intelligence API functions
- apps/web/src/lib/queries.ts: TanStack Query hooks with 3s polling for running snapshots
- packages/shared/src/types.ts: shared TypeScript types mirroring Pydantic models
- apps/web/src/components/layout/app-sidebar.tsx: Intelligence and Snapshots nav entries
Docs
- docs/features/intelligence.md: pipeline stages, cost model with worked example (~$0.19 for 169 issues), source adapter interface
- docs/features/intelligence-dashboard.md: component catalog, data hooks, UX states, privacy rules
- docs/features/storage-layout.md: B2 append-only snapshot layout with design rationale
- docs/features/dashboard.md: updated to note root dashboard replaced by intelligence overview
- docs/RUNBOOK.md: operational guide (rate limits, LLM parse failures, snapshot cleanup, cluster re-labeling)
- docs/exec-plans/intelligence-v0.md: execution plan and acceptance criteria
- AGENTS.md: §2a documents app-specific rules for agents
- ARCHITECTURE.md, README.md: updated for this app
Verification
Stack: Next.js 16 · FastAPI · sentence-transformers (local embeddings) · Anthropic claude-haiku-4-5 · HDBSCAN · pyarrow · Backblaze B2