Skip to content

feat(web): binary file attachments for Ask#1375

Draft
whoisthey wants to merge 7 commits into
whoisthey/text-file-attachmentsfrom
whoisthey/binary-file-attachments
Draft

feat(web): binary file attachments for Ask#1375
whoisthey wants to merge 7 commits into
whoisthey/text-file-attachmentsfrom
whoisthey/binary-file-attachments

Conversation

@whoisthey

@whoisthey whoisthey commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Adds support for binary (image) file attachments in Ask Sourcebot, building on the inline-text attachment work in the base branch. Users can attach PNG/JPEG/WebP/GIF images (drag-and-drop or file picker) to a chat message; the bytes are uploaded to app-mediated blob storage and sent to vision-capable models as native image content. Unlike text attachments, image bytes never travel in the messages JSON — they're referenced by id and served through an access-controlled route.

This is an enterprise (ee) feature, gated by the ask entitlement.

What's included

Data model & storage

  • New Attachment model + ChatAttachment join table + AttachmentStatus (PENDINGCOMMITTED) enum, with migration. Attachments are uploaded before any chat exists and linked to chats via ChatAttachment, keeping access purely chat-derived.
  • StorageBackend abstraction with a LocalFsStorageBackend (bytes under DATA_CACHE_DIR/attachments); an S3 driver is planned as a follow-up.
  • New blob variant on the AttachmentData discriminated union (references stored bytes by id; bytes stay out of message JSON).

Upload & serving

  • POST /api/ee/chat/attachments: authenticated (no anonymous uploads), entitlement-gated, validates content type by magic bytes (not client MIME/extension; SVG intentionally excluded), enforces a server-side size cap, and returns the attachmentId. Images upload on select.
  • Commit-on-send: commitMessageAttachments links referenced blobs to the chat and flips PENDING → COMMITTED, rejecting forged/unauthorized ids before the agent runs.
  • GET /api/ee/chat/{chatId}/attachments/{attachmentId}: serves bytes only when the caller can view the chat and a ChatAttachment link exists; sets X-Content-Type-Options: nosniff and a header-safe Content-Disposition.

Agent / model integration

  • Server-authoritative image capability resolution (from the models.dev catalog); the client signal is never trusted.
  • resolveLatestTurnImages loads bytes from storage for the latest user turn (only for blobs linked to that chat) and buildUserModelMessage attaches native image content parts. Image bytes are only sent on the turn they were added; a short marker is left when images are dropped (older turn or a text-only model).

Client UI

  • Image attach affordance is gated on the selected model's image capability; the file picker / drag-overlay accept includes image types only when supported.
  • Pending-image tray pill with thumbnail, upload status (uploading / uploaded / error), on-hover preview, and a full viewer dialog. Submit is blocked while uploads are in flight.
  • Preview cache so a just-sent image renders its preview instantly instead of 404ing on the serving route during the brief pre-commit window.

Lifecycle & cleanup

  • Orphan pruner (backend worker) periodically deletes PENDING (uploaded-but-never-sent) blobs older than a TTL, along with their bytes.
  • Deleting a chat sweeps blobs left with zero links; duplicating a chat copies the attachment links (metadata only, no byte copy).

Config / observability

  • New env vars: SOURCEBOT_CHAT_ATTACHMENT_MAX_IMAGE_BYTES (default 10 MiB) and SOURCEBOT_CHAT_ATTACHMENT_ORPHAN_TTL_HOURS (default 24, 0 disables). Documented in environment-variables.mdx.
  • New analytics: chat_attachment_uploaded, chat_attachment_degraded.

Summary by CodeRabbit

  • New Features

    • Added image attachments to chat, including upload, preview, and viewing support.
    • Chats can now include downloadable attachment files with proper access controls.
    • Attachment-related limits and settings are now configurable in environment variables.
  • Bug Fixes

    • Improved handling of attachment uploads and cleanup to reduce orphaned files.
    • Image attachments now respect model support, with clearer behavior when images aren’t available.
  • Documentation

    • Updated configuration docs with the new attachment environment settings.

@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

Adds end-to-end image attachment support to the chat feature. This includes a new Attachment/ChatAttachment DB schema, a local filesystem storage backend, upload (POST) and serve (GET) API routes, server-side attachment commit/validation helpers, multimodal image loading in the AI agent, a client-side upload/preview/transition flow in ChatBox and MessageAttachments, and a backend AttachmentPruner for orphaned PENDING blobs.

Changes

Chat Image Attachment Feature

Layer / File(s) Summary
DB schema, migration, env vars, and storage backend
packages/db/prisma/schema.prisma, packages/db/prisma/migrations/.../migration.sql, packages/shared/src/env.server.ts, packages/web/src/features/chat/attachments/storage.ts, packages/web/src/features/chat/attachments/attachmentPreviewCache.ts, docs/docs/configuration/environment-variables.mdx
Adds AttachmentStatus enum, Attachment and ChatAttachment Prisma models with cascading FKs and indexes, two new server env vars (SOURCEBOT_CHAT_ATTACHMENT_MAX_IMAGE_BYTES, SOURCEBOT_CHAT_ATTACHMENT_ORPHAN_TTL_HOURS), the StorageBackend interface and LocalFsStorageBackend with path-traversal guard, and a client-side object-URL preview cache.
Shared attachment types, validation, and filename utils
packages/web/src/features/chat/types.ts, packages/web/src/features/chat/attachments/validation.ts, packages/web/src/features/chat/attachments/filename.ts, packages/web/src/features/chat/constants.ts, packages/web/src/lib/posthogEvents.ts
Adds blobAttachmentSchema/BlobAttachment to the attachment discriminated union, magic-byte image validation (detectImageMediaType, validateImageAttachment), sanitizeFilename, per-image byte/count limits, MIME allowlist constants, and two new PostHog event schemas.
Server-side attachment lifecycle helpers
packages/web/src/features/chat/utils.server.ts, packages/web/src/features/chat/utils.ts
Adds resolveChatAccess (ownership/sharing/canView), commitMessageAttachments (validate PENDING blobs, link to chat, transition status in a transaction), deleteOrphanedAttachments (best-effort storage + DB delete), and getMessageTextBytes for inline text budgeting.
POST /api/ee/chat/attachments upload route
packages/web/src/app/api/(server)/ee/chat/attachments/route.ts
Enforces content-length guard, authenticates caller, validates image via magic bytes, uploads to storage, creates a PENDING DB row with rollback on DB failure, emits chat_attachment_uploaded analytics, and returns attachment metadata.
GET /api/ee/chat/[chatId]/attachments/[attachmentId] serve route
packages/web/src/app/api/(server)/ee/chat/[chatId]/attachments/[attachmentId]/route.ts
Verifies chat view permission and ChatAttachment link, stat-checks storage, streams bytes with Content-Type, byte-accurate Content-Length, RFC 5987 Content-Disposition, nosniff, and private cache headers.
Chat POST route: commit attachments, model image gating
packages/web/src/app/api/(server)/ee/chat/route.ts
Enforces per-turn inline text byte limit, calls commitMessageAttachments for the latest message, resolves model image capabilities, emits chat_attachment_degraded analytics when the model does not support images, and passes supportsImages into createMessageStream.
models.dev catalog cold-start blocking
packages/web/src/features/chat/modelsDevCatalog.server.ts, packages/web/src/features/chat/modelCapabilities.server.ts, packages/web/src/features/chat/modelCapabilities.server.test.ts
loadCatalog gains awaitWhenEmpty option that blocks with a Promise.race timeout; resolveModelCapabilities uses it to avoid falling back to text-only on cold starts; test updated to match.
Agent multimodal message building
packages/web/src/ee/features/chat/agent.ts
Adds resolveLatestTurnImages to load image blob bytes from storage for the latest user turn, buildUserModelMessage to construct ModelMessage content with conditional image parts and omission notes, and extends createMessageStream with supportsImages.
Client attachment utils: image types, upload, file reading
packages/web/src/features/chat/attachmentUtils.ts
Splits PendingAttachment into text/image variants, updates accept/dropzone helpers to gate on includeImages, rewrites readFilesAsAttachments with image count/size validation, and adds uploadImageAttachment to POST to the upload route.
ChatBox: image-aware upload, submit gating, preview stashing
packages/web/src/features/chat/components/chatBox/chatBox.tsx
Derives supportsImages from model modalities, immediately uploads image attachments, gates submission with an "uploading" reason, stashes preview URLs before onSubmit, revokes object URLs on removal, filters redirect-safe text-only attachments, and shows a capability warning.
ChatBox UI components: tray, button, viewer, dropzone
packages/web/src/features/chat/components/chatBox/attachmentTray.tsx, ...attachmentButton.tsx, ...attachmentViewerDialog.tsx, ...chatPaneDropzone.tsx
AttachmentTray renders image items with HoverCard thumbnails and upload/error overlays; AttachmentButton gates accept attribute and tooltip on acceptImages; AttachmentViewerDialog conditionally renders img vs pre; dropzone allows images and drops "Text files only" from rejection toast.
MessageAttachments: served image probing and preview transition
packages/web/src/ee/features/chat/components/chatThread/messageAttachments.tsx, ...chatThreadListItem.tsx
Accepts chatId, probes the serve route with bounded retries until the committed image loads, releases the local preview URL on success, and renders image blobs as hoverable thumbnails with the viewer receiving imageSrc.
Chat actions: attachment cleanup on delete/duplicate
packages/web/src/features/chat/actions.ts
getChatInfo uses resolveChatAccess; deleteChat fetches chatAttachment IDs then calls deleteOrphanedAttachments; duplicateChat copies chatAttachment rows to the new chat.
Backend AttachmentPruner scheduler
packages/backend/src/attachmentPruner.ts, packages/backend/src/index.ts
Batching pruner that periodically unlinks PENDING attachment blobs older than the configured TTL and deletes matching DB rows; disabled when TTL is 0; wired into backend startup and shutdown.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant ChatBox
  participant UploadRoute as POST /chat/attachments
  participant Storage as LocalFsStorageBackend
  participant DB as Prisma

  Client->>ChatBox: select image file
  ChatBox->>UploadRoute: POST formData(file)
  UploadRoute->>UploadRoute: validateImageAttachment (magic bytes)
  UploadRoute->>Storage: put(storageKey, buffer)
  UploadRoute->>DB: attachment.create(PENDING)
  UploadRoute-->>ChatBox: { attachmentId, filename, mediaType, sizeBytes }
  ChatBox->>ChatBox: status = uploaded, store previewUrl

  Client->>ChatBox: submit message
  ChatBox->>DB: commitMessageAttachments (PENDING→COMMITTED, create ChatAttachment)
  ChatBox->>ChatBox: setAttachmentPreviewUrl(attachmentId, objectUrl)

  Client->>MessageAttachments: renders BlobAttachment
  loop probe until served
    MessageAttachments->>Storage: GET /chat/:chatId/attachments/:attachmentId
    Storage-->>MessageAttachments: image bytes
  end
  MessageAttachments->>MessageAttachments: releaseAttachmentPreviewUrl → show served URL
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • sourcebot-dev/sourcebot#888: Both PRs modify duplicateChat in packages/web/src/features/chat/actions.ts; this PR adds attachment link copying while #888 made earlier structural changes to the same function.
  • sourcebot-dev/sourcebot#1209: Both PRs modify createMessageStream in packages/web/src/ee/features/chat/agent.ts and its CreateMessageStreamResponseProps contract.
  • sourcebot-dev/sourcebot#907: Both PRs extend PosthogEventMap in packages/web/src/lib/posthogEvents.ts with new chat-related event schemas.

Suggested reviewers

  • msukkari
  • brendan-kellam
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: binary file attachments for Ask in the web app.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch whoisthey/binary-file-attachments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

whoisthey and others added 5 commits June 27, 2026 12:10
- enforce per-message image count and per-turn text budget server-side
  (reject forged/oversized requests; mirror image-count cap client-side)
- serving route: 404 on missing bytes via storage.stat() before headers,
  and set Content-Length from the on-disk size
- allow SOURCEBOT_CHAT_ATTACHMENT_ORPHAN_TTL_HOURS=0 to disable the sweep
  (.positive -> .nonnegative, matching the pruner and docs)
- block only on the first (cold) models.dev fetch when resolving model
  capabilities, bounded by a short budget, so images aren't silently
  dropped right after a process start; airgapped pays one short wait
- release preview object URLs once the served image loads to bound memory
- carry only text attachments across the login/upgrade redirect

Co-authored-by: Cursor <cursoragent@cursor.com>
@whoisthey

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 28, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 17

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/backend/src/attachmentPruner.ts`:
- Around line 38-40: The startup call to pruneOrphanedAttachments() is firing
without any await or error handling, which can surface as an unhandled rejection
and crash the worker during initialization. Update the startup path in
attachmentPruner so the first prune is either awaited from an async
initialization flow or wrapped with explicit catch/log handling, while keeping
the recurring setIntervalAsync scheduling intact.
- Around line 55-80: The orphan cleanup in attachmentPruner’s batch loop deletes
files based only on the initial findMany result, so a PENDING attachment can be
unlinked even if it becomes linked or non-orphaned before deletion. Re-check
each attachment’s current state in the same batch before calling unlink, ideally
by verifying the row still matches the orphan criteria in this method before
file removal and before deleteMany. Use the existing attachmentPruner loop, the
db.attachment queries, and the unlink call to keep only still-orphaned
attachments eligible for byte deletion.

In
`@packages/db/prisma/migrations/20260627000032_add_chat_attachments/migration.sql`:
- Around line 42-43: The Attachment foreign key on uploadedById currently
cascades deletes from User, which causes committed attachments and their
ChatAttachment links to disappear when an uploader is removed. Update the
migration and the corresponding Prisma model for Attachment/uploadedById so
deletion of a User does not delete Attachment rows; use a non-cascading delete
behavior that preserves historical attachments for existing chats.

In
`@packages/web/src/app/api/`(server)/ee/chat/[chatId]/attachments/[attachmentId]/route.ts:
- Line 101: The attachment response in the route handler for chat attachments is
being cached too aggressively via the Cache-Control header. Update the
response-building logic in the attachment route so access-controlled content is
not reused for an hour by the browser; use a no-store/no-cache style policy (or
equivalent short-lived revalidation) for the attachment fetch path, especially
where the route checks authorization before returning the file.

In `@packages/web/src/app/api/`(server)/ee/chat/attachments/route.ts:
- Around line 18-24: The attachment size check in the chat upload route is still
bypassable because it only relies on content-length before calling
req.formData() and file.arrayBuffer() in the route handler. Update the
attachment handling flow in the route’s upload path to enforce an authoritative
limit after parsing by rejecting oversized File objects before buffering their
contents, and keep the existing maxImageBytes check as a secondary guard. Use
the existing attachment route logic and the file size handling around
req.formData(), file.arrayBuffer(), and maxImageBytes to ensure the request is
rejected even when content-length is missing or chunked.

In `@packages/web/src/app/api/`(server)/ee/chat/route.ts:
- Line 79: Reject empty message arrays in the chat route before accessing
latestMessage, since messages: [] currently passes validation and leaves
latestMessage undefined. Update the request handling in the route’s
message-processing logic around latestMessage to explicitly validate that
messages has at least one entry and return a typed 400 for empty arrays before
any downstream dereference of parts.
- Around line 98-104: Validate the model in the chat route before calling
commitMessageAttachments. In the route handler for the chat API, move the
languageModelConfig check ahead of the attachment commit so a bad model request
returns 400 before blobs are linked/flipped. Update the flow around
latestMessage, commitMessageAttachments, and languageModelConfig so attachments
are only persisted after the model request is confirmed.

In `@packages/web/src/ee/features/chat/agent.ts`:
- Around line 139-143: The omission note in agent.ts uses the wrong reason when
the latest user turn has images but all image reads fail. Update the logic
around the imageBlobs.length > 0 branch so the reason distinguishes between
unsupported image models, images added on a different turn, and images that were
added on this turn but could not be loaded. Use the existing identifiers
isLatestUserTurn, supportsImages, and imageBlobs to select the correct message
before appending to baseText.

In `@packages/web/src/features/chat/actions.ts`:
- Around line 306-322: Snapshot the source chat’s attachment links before
persisting the duplicate chat so the copy isn’t affected by a concurrent
delete/cascade. In the chat-duplication flow in actions.ts, read originalLinks
from prisma.chatAttachment.findMany using originalChat.id before creating
newChat, then create newChat and use the saved links in
prisma.chatAttachment.createMany. Keep the fix localized to the duplication
logic that handles originalChat, newChat, and chatAttachment.
- Around line 196-214: The attachment snapshot in the chat delete flow can miss
links created concurrently, causing orphaned blobs after deletion. Update the
delete path in actions.ts around the linkedAttachments fetch and
prisma.chat.delete call so orphan cleanup is based on a post-delete or
transaction-safe view of attachments, and ensure deleteOrphanedAttachments runs
with all attachmentIds still associated with the chat at the moment of deletion.
Use the existing deleteOrphanedAttachments helper and
prisma.chatAttachment/prisma.chat delete logic to keep the cleanup atomic or
re-read after delete before sweeping.

In `@packages/web/src/features/chat/attachments/filename.ts`:
- Around line 1-15: The sanitizeFilename helper only removes control characters
and whitespace, but it still allows markup-significant characters that can break
the <attachment filename="..."> boundary. Update sanitizeFilename to also strip
or escape characters like double quotes, angle brackets, and ampersands while
keeping the basename and existing fallback behavior intact.

In `@packages/web/src/features/chat/components/chatBox/chatBox.tsx`:
- Around line 281-298: The submit gating in chatBox should also catch image
attachments that are failed or malformed, not just `uploading`, because
`attachmentData` later drops those silently and can still allow an empty-text
send. Update the submit-disabled logic in the `chatBox` submit-state helper to
treat non-sendable image attachments the same as uploading ones, using the same
image attachment status checks that feed `attachmentData` so the UI blocks
submit whenever an image won’t actually be included.

In `@packages/web/src/features/chat/constants.ts`:
- Around line 21-24: ATTACHMENT_MAX_IMAGE_BYTES is hard-coded in constants.ts
even though the upload limit is configurable on the server, so the client can
diverge from the authoritative value. Update the client-side early-rejection
logic to source the max image bytes from the same configurable setting used by
the upload route (env.SOURCEBOT_CHAT_ATTACHMENT_MAX_IMAGE_BYTES) rather than a
fixed 10 MiB constant, and keep the existing chat attachment constants
synchronized with the server-facing limit.

In `@packages/web/src/features/chat/modelsDevCatalog.server.ts`:
- Around line 123-129: The cold-start gating in ModelsDevCatalog.server.ts only
uses hasAttempted derived from catalogFetchedAt and lastFailedAt, so it stays
false while inFlightFetch is still pending and allows repeated short waits.
Update the awaitWhenEmpty path to track the cold-start wait attempt separately
from fetch settlement, and use that flag in the condition around Promise.race so
only one process-wide COLD_START_BLOCK_BUDGET_MS wait can occur. Keep the
existing inFlightFetch and cachedCatalog behavior, but mark the short-wait as
attempted as soon as it is started.

In `@packages/web/src/features/chat/utils.server.ts`:
- Around line 222-242: The orphan cleanup in the attachment deletion flow needs
a final safety check before removing rows, because a concurrent relink can
happen after the initial lookup. Update the logic in the utility that computes
orphanedIds and calls prisma.attachment.deleteMany so the delete is conditional
on the attachment still having no chatAttachment references at delete time,
rather than deleting by bare id from the earlier snapshot.
- Around line 159-199: The attachment claim flow in the chat utility is
validating and then committing outside a single atomic check, so two concurrent
sends can both attach the same upload. Move the PENDING/ownership check into the
commit path in the `createMany`/`updateMany` transaction inside the chat
attachment helper in `utils.server.ts`, and ensure the `attachment` update only
succeeds when the row is still `AttachmentStatus.PENDING` and belongs to the
expected `userId`/`orgId`. If the conditional update affects fewer rows than
`idsToCommit`, treat it as an invalid request and do not create any
`chatAttachment` links.

In `@packages/web/src/lib/posthogEvents.ts`:
- Around line 207-218: The PostHog event schema for chat attachment events
currently leaves source optional, which allows indistinguishable cross-surface
emissions from the upload flow. Update the type definitions in posthogEvents.ts
for chat_attachment_uploaded and chat_attachment_degraded so source is required,
or alternatively rename these events to use the wa_ prefix if they are truly
web-only; make sure the emitting call sites match the chosen contract,
especially the new upload route, and keep the schema aligned with the intended
event origin.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 79aa38e4-8079-483a-9868-4da770dbc165

📥 Commits

Reviewing files that changed from the base of the PR and between cb8181d and ac42b73.

📒 Files selected for processing (31)
  • docs/docs/configuration/environment-variables.mdx
  • packages/backend/src/attachmentPruner.ts
  • packages/backend/src/index.ts
  • packages/db/prisma/migrations/20260627000032_add_chat_attachments/migration.sql
  • packages/db/prisma/schema.prisma
  • packages/shared/src/env.server.ts
  • packages/web/src/app/api/(server)/ee/chat/[chatId]/attachments/[attachmentId]/route.ts
  • packages/web/src/app/api/(server)/ee/chat/attachments/route.ts
  • packages/web/src/app/api/(server)/ee/chat/route.ts
  • packages/web/src/ee/features/chat/agent.ts
  • packages/web/src/ee/features/chat/components/chatThread/chatThreadListItem.tsx
  • packages/web/src/ee/features/chat/components/chatThread/messageAttachments.tsx
  • packages/web/src/features/chat/actions.ts
  • packages/web/src/features/chat/attachmentUtils.ts
  • packages/web/src/features/chat/attachments/attachmentPreviewCache.ts
  • packages/web/src/features/chat/attachments/filename.ts
  • packages/web/src/features/chat/attachments/storage.ts
  • packages/web/src/features/chat/attachments/validation.ts
  • packages/web/src/features/chat/components/chatBox/attachmentButton.tsx
  • packages/web/src/features/chat/components/chatBox/attachmentTray.tsx
  • packages/web/src/features/chat/components/chatBox/attachmentViewerDialog.tsx
  • packages/web/src/features/chat/components/chatBox/chatBox.tsx
  • packages/web/src/features/chat/components/chatBox/chatPaneDropzone.tsx
  • packages/web/src/features/chat/constants.ts
  • packages/web/src/features/chat/modelCapabilities.server.test.ts
  • packages/web/src/features/chat/modelCapabilities.server.ts
  • packages/web/src/features/chat/modelsDevCatalog.server.ts
  • packages/web/src/features/chat/types.ts
  • packages/web/src/features/chat/utils.server.ts
  • packages/web/src/features/chat/utils.ts
  • packages/web/src/lib/posthogEvents.ts

Comment on lines +38 to +40
// Run immediately on startup, then every hour.
this.pruneOrphanedAttachments();
this.interval = setIntervalAsync(() => this.pruneOrphanedAttachments(), ONE_HOUR_MS);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Handle the startup prune promise.

Line 39 kicks off pruneOrphanedAttachments() without awaiting or catching it. Any DB/filesystem failure there becomes an unhandled rejection, and this backend already exits on unhandledRejection, so the first prune can take the worker down during startup.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/backend/src/attachmentPruner.ts` around lines 38 - 40, The startup
call to pruneOrphanedAttachments() is firing without any await or error
handling, which can surface as an unhandled rejection and crash the worker
during initialization. Update the startup path in attachmentPruner so the first
prune is either awaited from an async initialization flow or wrapped with
explicit catch/log handling, while keeping the recurring setIntervalAsync
scheduling intact.

Comment on lines +55 to +80
const batch = await this.db.attachment.findMany({
where: {
status: AttachmentStatus.PENDING,
createdAt: { lt: cutoff },
},
select: { id: true, storageKey: true },
take: BATCH_SIZE,
});

if (batch.length === 0) {
break;
}

await Promise.all(batch.map(async (attachment) => {
try {
await unlink(path.join(this.attachmentsDir, attachment.storageKey));
} catch (error) {
if ((error as NodeJS.ErrnoException)?.code !== 'ENOENT') {
logger.warn(`Failed to delete bytes for orphaned attachment ${attachment.id}: ${error}`);
}
}
}));

const result = await this.db.attachment.deleteMany({
where: { id: { in: batch.map((attachment) => attachment.id) } },
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Re-check that a batch is still orphaned before deleting bytes.

This loop selects rows once as old PENDING attachments, but then Line 70 deletes bytes before any second status/link check and Line 78 deletes by id alone. If a user sends the message while this batch is in flight, you can unlink a now-committed attachment; even if the row deletion later fails on a new FK, the committed image is already broken.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/backend/src/attachmentPruner.ts` around lines 55 - 80, The orphan
cleanup in attachmentPruner’s batch loop deletes files based only on the initial
findMany result, so a PENDING attachment can be unlinked even if it becomes
linked or non-orphaned before deletion. Re-check each attachment’s current state
in the same batch before calling unlink, ideally by verifying the row still
matches the orphan criteria in this method before file removal and before
deleteMany. Use the existing attachmentPruner loop, the db.attachment queries,
and the unlink call to keep only still-orphaned attachments eligible for byte
deletion.

Comment on lines +42 to +43
-- AddForeignKey
ALTER TABLE "Attachment" ADD CONSTRAINT "Attachment_uploadedById_fkey" FOREIGN KEY ("uploadedById") REFERENCES "User"("id") ON DELETE CASCADE ON UPDATE CASCADE;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Keep committed attachments when the uploader is deleted.

Line 43's ON DELETE CASCADE removes the Attachment row, and Line 49 then cascades that into ChatAttachment. Deleting a user will therefore strip images out of every surviving chat they uploaded to. This FK should preserve committed attachment rows instead of deleting them.

Suggested direction
-    "uploadedById" TEXT NOT NULL,
+    "uploadedById" TEXT,

-ALTER TABLE "Attachment" ADD CONSTRAINT "Attachment_uploadedById_fkey" FOREIGN KEY ("uploadedById") REFERENCES "User"("id") ON DELETE CASCADE ON UPDATE CASCADE;
+ALTER TABLE "Attachment" ADD CONSTRAINT "Attachment_uploadedById_fkey" FOREIGN KEY ("uploadedById") REFERENCES "User"("id") ON DELETE SET NULL ON UPDATE CASCADE;

Update the Prisma model alongside the migration so historical chats can still resolve their attachments after uploader deletion.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@packages/db/prisma/migrations/20260627000032_add_chat_attachments/migration.sql`
around lines 42 - 43, The Attachment foreign key on uploadedById currently
cascades deletes from User, which causes committed attachments and their
ChatAttachment links to disappear when an uploader is removed. Update the
migration and the corresponding Prisma model for Attachment/uploadedById so
deletion of a User does not delete Attachment rows; use a non-cascading delete
behavior that preserves historical attachments for existing chats.

// Never let the browser sniff a different (potentially executable)
// content type from the bytes.
'X-Content-Type-Options': 'nosniff',
'Cache-Control': 'private, max-age=3600',

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Do not cache access-controlled attachments for an hour.

Line 101's private, max-age=3600 lets the browser reuse the image after logout, share revocation, or a visibility change without rechecking this route. For sensitive chat attachments, that weakens the access control you enforce above.

Suggested fix
-            'Cache-Control': 'private, max-age=3600',
+            'Cache-Control': 'private, no-store',
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
'Cache-Control': 'private, max-age=3600',
'Cache-Control': 'private, no-store',
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@packages/web/src/app/api/`(server)/ee/chat/[chatId]/attachments/[attachmentId]/route.ts
at line 101, The attachment response in the route handler for chat attachments
is being cached too aggressively via the Cache-Control header. Update the
response-building logic in the attachment route so access-controlled content is
not reused for an hour by the browser; use a no-store/no-cache style policy (or
equivalent short-lived revalidation) for the attachment fetch path, especially
where the route checks authorization before returning the file.

Comment on lines +18 to +24
// Reject obviously-oversized bodies before reading them into memory. The
// multipart envelope adds some overhead beyond the raw bytes, so allow a
// 1 MiB slack on top of the image cap; the exact byte cap is re-checked
// against the decoded buffer below.
const maxImageBytes = env.SOURCEBOT_CHAT_ATTACHMENT_MAX_IMAGE_BYTES;
const contentLength = Number(req.headers.get('content-length') ?? 0);
if (contentLength > maxImageBytes + 1024 * 1024) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

The size guard is bypassable before formData() buffers the body.

This only trusts content-length. If the request arrives without that header (or via chunked transfer), the route still calls req.formData() and then file.arrayBuffer(), so an oversized upload is fully buffered in memory before any authoritative limit is enforced.

Immediate mitigation
     const maxImageBytes = env.SOURCEBOT_CHAT_ATTACHMENT_MAX_IMAGE_BYTES;
-    const contentLength = Number(req.headers.get('content-length') ?? 0);
-    if (contentLength > maxImageBytes + 1024 * 1024) {
+    const rawContentLength = req.headers.get('content-length');
+    const contentLength = rawContentLength ? Number(rawContentLength) : NaN;
+    if (!Number.isFinite(contentLength)) {
+        return serviceErrorResponse({
+            statusCode: StatusCodes.LENGTH_REQUIRED,
+            errorCode: ErrorCode.INVALID_REQUEST_BODY,
+            message: 'Content-Length is required for attachment uploads.',
+        } satisfies ServiceError);
+    }
+    if (contentLength > maxImageBytes + 1024 * 1024) {
         return serviceErrorResponse({
             statusCode: StatusCodes.REQUEST_TOO_LONG,
             errorCode: ErrorCode.INVALID_REQUEST_BODY,
             message: `Attachment exceeds the ${Math.round(maxImageBytes / (1024 * 1024))}MB limit.`,
         } satisfies ServiceError);

Also applies to: 41-52

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/web/src/app/api/`(server)/ee/chat/attachments/route.ts around lines
18 - 24, The attachment size check in the chat upload route is still bypassable
because it only relies on content-length before calling req.formData() and
file.arrayBuffer() in the route handler. Update the attachment handling flow in
the route’s upload path to enforce an authoritative limit after parsing by
rejecting oversized File objects before buffering their contents, and keep the
existing maxImageBytes check as a secondary guard. Use the existing attachment
route logic and the file size handling around req.formData(),
file.arrayBuffer(), and maxImageBytes to ensure the request is rejected even
when content-length is missing or chunked.

Comment on lines +21 to +24
// Client-side image size cap, used for early rejection before upload. The
// server enforces the authoritative cap via SOURCEBOT_CHAT_ATTACHMENT_MAX_IMAGE_BYTES
// (this mirrors its default).
export const ATTACHMENT_MAX_IMAGE_BYTES = 10 * 1024 * 1024; // 10MB per image

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy lift

Do not hard-code a client limit that the server exposes as configurable.

ATTACHMENT_MAX_IMAGE_BYTES is fixed at 10 MiB here, but the upload route enforces env.SOURCEBOT_CHAT_ATTACHMENT_MAX_IMAGE_BYTES and the docs advertise that as configurable. Any non-default deployment will get incorrect client behavior: the picker can reject files the server would accept, or allow files that later fail on upload.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/web/src/features/chat/constants.ts` around lines 21 - 24,
ATTACHMENT_MAX_IMAGE_BYTES is hard-coded in constants.ts even though the upload
limit is configurable on the server, so the client can diverge from the
authoritative value. Update the client-side early-rejection logic to source the
max image bytes from the same configurable setting used by the upload route
(env.SOURCEBOT_CHAT_ATTACHMENT_MAX_IMAGE_BYTES) rather than a fixed 10 MiB
constant, and keep the existing chat attachment constants synchronized with the
server-facing limit.

Comment on lines +123 to +129
const hasAttempted = catalogFetchedAt > 0 || lastFailedAt > 0;
if (awaitWhenEmpty && cachedCatalog === null && inFlightFetch && !hasAttempted) {
return Promise.race([
inFlightFetch,
new Promise<ModelsDevCatalog | null>((resolve) =>
setTimeout(() => resolve(cachedCatalog), COLD_START_BLOCK_BUDGET_MS)),
]);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Track the cold-start wait attempt separately.

hasAttempted stays false until the fetch settles. If the fetch outlives COLD_START_BLOCK_BUDGET_MS, each new awaitWhenEmpty caller can still wait another 2.5s, contrary to the “at most one short wait per process” behavior.

Suggested fix
 let cachedCatalog: ModelsDevCatalog | null = null;
 let inFlightFetch: Promise<ModelsDevCatalog | null> | null = null;
+let hasColdStartBlockAttempted = false;
@@
-const hasAttempted = catalogFetchedAt > 0 || lastFailedAt > 0;
-if (awaitWhenEmpty && cachedCatalog === null && inFlightFetch && !hasAttempted) {
+const hasAttempted = catalogFetchedAt > 0 || lastFailedAt > 0;
+if (awaitWhenEmpty && cachedCatalog === null && inFlightFetch && !hasAttempted && !hasColdStartBlockAttempted) {
+    hasColdStartBlockAttempted = true;
     return Promise.race([
         inFlightFetch,
         new Promise<ModelsDevCatalog | null>((resolve) =>
             setTimeout(() => resolve(cachedCatalog), COLD_START_BLOCK_BUDGET_MS)),
     ]);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const hasAttempted = catalogFetchedAt > 0 || lastFailedAt > 0;
if (awaitWhenEmpty && cachedCatalog === null && inFlightFetch && !hasAttempted) {
return Promise.race([
inFlightFetch,
new Promise<ModelsDevCatalog | null>((resolve) =>
setTimeout(() => resolve(cachedCatalog), COLD_START_BLOCK_BUDGET_MS)),
]);
let cachedCatalog: ModelsDevCatalog | null = null;
let inFlightFetch: Promise<ModelsDevCatalog | null> | null = null;
let hasColdStartBlockAttempted = false;
...
const hasAttempted = catalogFetchedAt > 0 || lastFailedAt > 0;
if (awaitWhenEmpty && cachedCatalog === null && inFlightFetch && !hasAttempted && !hasColdStartBlockAttempted) {
hasColdStartBlockAttempted = true;
return Promise.race([
inFlightFetch,
new Promise<ModelsDevCatalog | null>((resolve) =>
setTimeout(() => resolve(cachedCatalog), COLD_START_BLOCK_BUDGET_MS)),
]);
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/web/src/features/chat/modelsDevCatalog.server.ts` around lines 123 -
129, The cold-start gating in ModelsDevCatalog.server.ts only uses hasAttempted
derived from catalogFetchedAt and lastFailedAt, so it stays false while
inFlightFetch is still pending and allows repeated short waits. Update the
awaitWhenEmpty path to track the cold-start wait attempt separately from fetch
settlement, and use that flag in the condition around Promise.race so only one
process-wide COLD_START_BLOCK_BUDGET_MS wait can occur. Keep the existing
inFlightFetch and cachedCatalog behavior, but mark the short-wait as attempted
as soon as it is started.

Comment on lines +159 to +199
const [attachments, existingLinks] = await Promise.all([
prisma.attachment.findMany({ where: { id: { in: ids }, orgId } }),
prisma.chatAttachment.findMany({ where: { chatId, attachmentId: { in: ids } } }),
]);

const attachmentById = new Map(attachments.map((attachment) => [attachment.id, attachment]));
const alreadyLinkedIds = new Set(existingLinks.map((link) => link.attachmentId));

const idsToCommit: string[] = [];
for (const id of ids) {
// Already linked to this chat (idempotent re-send): nothing to do.
if (alreadyLinkedIds.has(id)) {
continue;
}

const attachment = attachmentById.get(id);
if (
!attachment ||
attachment.uploadedById !== userId ||
attachment.status !== AttachmentStatus.PENDING
) {
return {
statusCode: StatusCodes.BAD_REQUEST,
errorCode: ErrorCode.INVALID_REQUEST_BODY,
message: 'Invalid or unauthorized attachment reference.',
} satisfies ServiceError;
}
idsToCommit.push(id);
}

if (idsToCommit.length > 0) {
await prisma.$transaction([
prisma.chatAttachment.createMany({
data: idsToCommit.map((attachmentId) => ({ chatId, attachmentId })),
skipDuplicates: true,
}),
prisma.attachment.updateMany({
where: { id: { in: idsToCommit } },
data: { status: AttachmentStatus.COMMITTED },
}),
]);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Claim pending attachments atomically.

Lines 159-198 validate attachment state before entering the transaction, and Lines 195-198 update by id without re-checking status = PENDING. Two concurrent sends can both attach the same upload to different chats. That breaks the single-commit invariant and can widen who can access the blob.

Suggested direction
-    const [attachments, existingLinks] = await Promise.all([
-        prisma.attachment.findMany({ where: { id: { in: ids }, orgId } }),
-        prisma.chatAttachment.findMany({ where: { chatId, attachmentId: { in: ids } } }),
-    ]);
+    await prisma.$transaction(async (tx) => {
+        const existingLinks = await tx.chatAttachment.findMany({
+            where: { chatId, attachmentId: { in: ids } },
+        });

-    const attachmentById = new Map(attachments.map((attachment) => [attachment.id, attachment]));
-    const alreadyLinkedIds = new Set(existingLinks.map((link) => link.attachmentId));
+        const alreadyLinkedIds = new Set(existingLinks.map((link) => link.attachmentId));
+        const idsToCommit = ids.filter((id) => !alreadyLinkedIds.has(id));

-    const idsToCommit: string[] = [];
-    for (const id of ids) {
-        if (alreadyLinkedIds.has(id)) {
-            continue;
-        }
-        const attachment = attachmentById.get(id);
-        if (
-            !attachment ||
-            attachment.uploadedById !== userId ||
-            attachment.status !== AttachmentStatus.PENDING
-        ) {
-            return {
-                statusCode: StatusCodes.BAD_REQUEST,
-                errorCode: ErrorCode.INVALID_REQUEST_BODY,
-                message: 'Invalid or unauthorized attachment reference.',
-            } satisfies ServiceError;
-        }
-        idsToCommit.push(id);
-    }
+        const claimed = await tx.attachment.updateMany({
+            where: {
+                id: { in: idsToCommit },
+                orgId,
+                uploadedById: userId,
+                status: AttachmentStatus.PENDING,
+            },
+            data: { status: AttachmentStatus.COMMITTED },
+        });
+
+        if (claimed.count !== idsToCommit.length) {
+            throw new Error('Attachment claim conflict');
+        }

-    if (idsToCommit.length > 0) {
-        await prisma.$transaction([
-            prisma.chatAttachment.createMany({
-                data: idsToCommit.map((attachmentId) => ({ chatId, attachmentId })),
-                skipDuplicates: true,
-            }),
-            prisma.attachment.updateMany({
-                where: { id: { in: idsToCommit } },
-                data: { status: AttachmentStatus.COMMITTED },
-            }),
-        ]);
-    }
+        await tx.chatAttachment.createMany({
+            data: idsToCommit.map((attachmentId) => ({ chatId, attachmentId })),
+            skipDuplicates: true,
+        });
+    });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const [attachments, existingLinks] = await Promise.all([
prisma.attachment.findMany({ where: { id: { in: ids }, orgId } }),
prisma.chatAttachment.findMany({ where: { chatId, attachmentId: { in: ids } } }),
]);
const attachmentById = new Map(attachments.map((attachment) => [attachment.id, attachment]));
const alreadyLinkedIds = new Set(existingLinks.map((link) => link.attachmentId));
const idsToCommit: string[] = [];
for (const id of ids) {
// Already linked to this chat (idempotent re-send): nothing to do.
if (alreadyLinkedIds.has(id)) {
continue;
}
const attachment = attachmentById.get(id);
if (
!attachment ||
attachment.uploadedById !== userId ||
attachment.status !== AttachmentStatus.PENDING
) {
return {
statusCode: StatusCodes.BAD_REQUEST,
errorCode: ErrorCode.INVALID_REQUEST_BODY,
message: 'Invalid or unauthorized attachment reference.',
} satisfies ServiceError;
}
idsToCommit.push(id);
}
if (idsToCommit.length > 0) {
await prisma.$transaction([
prisma.chatAttachment.createMany({
data: idsToCommit.map((attachmentId) => ({ chatId, attachmentId })),
skipDuplicates: true,
}),
prisma.attachment.updateMany({
where: { id: { in: idsToCommit } },
data: { status: AttachmentStatus.COMMITTED },
}),
]);
await prisma.$transaction(async (tx) => {
const existingLinks = await tx.chatAttachment.findMany({
where: { chatId, attachmentId: { in: ids } },
});
const alreadyLinkedIds = new Set(existingLinks.map((link) => link.attachmentId));
const idsToCommit = ids.filter((id) => !alreadyLinkedIds.has(id));
const claimed = await tx.attachment.updateMany({
where: {
id: { in: idsToCommit },
orgId,
uploadedById: userId,
status: AttachmentStatus.PENDING,
},
data: { status: AttachmentStatus.COMMITTED },
});
if (claimed.count !== idsToCommit.length) {
throw new Error('Attachment claim conflict');
}
await tx.chatAttachment.createMany({
data: idsToCommit.map((attachmentId) => ({ chatId, attachmentId })),
skipDuplicates: true,
});
});
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/web/src/features/chat/utils.server.ts` around lines 159 - 199, The
attachment claim flow in the chat utility is validating and then committing
outside a single atomic check, so two concurrent sends can both attach the same
upload. Move the PENDING/ownership check into the commit path in the
`createMany`/`updateMany` transaction inside the chat attachment helper in
`utils.server.ts`, and ensure the `attachment` update only succeeds when the row
is still `AttachmentStatus.PENDING` and belongs to the expected
`userId`/`orgId`. If the conditional update affects fewer rows than
`idsToCommit`, treat it as an invalid request and do not create any
`chatAttachment` links.

Comment on lines +222 to +242
const remainingLinks = await prisma.chatAttachment.findMany({
where: { attachmentId: { in: attachmentIds } },
select: { attachmentId: true },
});
const stillLinked = new Set(remainingLinks.map((link) => link.attachmentId));
const orphanedIds = attachmentIds.filter((id) => !stillLinked.has(id));

if (orphanedIds.length === 0) {
return;
}

const orphans = await prisma.attachment.findMany({
where: { id: { in: orphanedIds } },
select: { id: true, storageKey: true },
});

const storage = getStorageBackend();
await Promise.all(orphans.map((orphan) =>
storage.delete(orphan.storageKey).catch(() => { /* best effort */ })));

await prisma.attachment.deleteMany({ where: { id: { in: orphanedIds } } });

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Re-check orphan status in the delete itself.

Lines 222-242 compute orphanedIds first and then delete by bare id. A concurrent relink can create a fresh ChatAttachment after the findMany, and Line 242 will still delete the attachment row, cascading away that new link.

Suggested direction
-    const orphans = await prisma.attachment.findMany({
-        where: { id: { in: orphanedIds } },
-        select: { id: true, storageKey: true },
-    });
-
-    const storage = getStorageBackend();
-    await Promise.all(orphans.map((orphan) =>
-        storage.delete(orphan.storageKey).catch(() => { /* best effort */ })));
-
-    await prisma.attachment.deleteMany({ where: { id: { in: orphanedIds } } });
+    const orphans = await prisma.$transaction(async (tx) => {
+        const rows = await tx.attachment.findMany({
+            where: {
+                id: { in: orphanedIds },
+                chats: { none: {} },
+            },
+            select: { id: true, storageKey: true },
+        });
+
+        await tx.attachment.deleteMany({
+            where: {
+                id: { in: rows.map((row) => row.id) },
+                chats: { none: {} },
+            },
+        });
+
+        return rows;
+    });
+
+    const storage = getStorageBackend();
+    await Promise.all(
+        orphans.map((orphan) =>
+            storage.delete(orphan.storageKey).catch(() => { /* best effort */ }))
+    );
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const remainingLinks = await prisma.chatAttachment.findMany({
where: { attachmentId: { in: attachmentIds } },
select: { attachmentId: true },
});
const stillLinked = new Set(remainingLinks.map((link) => link.attachmentId));
const orphanedIds = attachmentIds.filter((id) => !stillLinked.has(id));
if (orphanedIds.length === 0) {
return;
}
const orphans = await prisma.attachment.findMany({
where: { id: { in: orphanedIds } },
select: { id: true, storageKey: true },
});
const storage = getStorageBackend();
await Promise.all(orphans.map((orphan) =>
storage.delete(orphan.storageKey).catch(() => { /* best effort */ })));
await prisma.attachment.deleteMany({ where: { id: { in: orphanedIds } } });
const remainingLinks = await prisma.chatAttachment.findMany({
where: { attachmentId: { in: attachmentIds } },
select: { attachmentId: true },
});
const stillLinked = new Set(remainingLinks.map((link) => link.attachmentId));
const orphanedIds = attachmentIds.filter((id) => !stillLinked.has(id));
if (orphanedIds.length === 0) {
return;
}
const orphans = await prisma.$transaction(async (tx) => {
const rows = await tx.attachment.findMany({
where: {
id: { in: orphanedIds },
chats: { none: {} },
},
select: { id: true, storageKey: true },
});
await tx.attachment.deleteMany({
where: {
id: { in: rows.map((row) => row.id) },
chats: { none: {} },
},
});
return rows;
});
const storage = getStorageBackend();
await Promise.all(
orphans.map((orphan) =>
storage.delete(orphan.storageKey).catch(() => { /* best effort */ }))
);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/web/src/features/chat/utils.server.ts` around lines 222 - 242, The
orphan cleanup in the attachment deletion flow needs a final safety check before
removing rows, because a concurrent relink can happen after the initial lookup.
Update the logic in the utility that computes orphanedIds and calls
prisma.attachment.deleteMany so the delete is conditional on the attachment
still having no chatAttachment references at delete time, rather than deleting
by bare id from the earlier snapshot.

Comment on lines +207 to +218
chat_attachment_uploaded: {
source?: string,
mediaType: string,
sizeBytes: number,
},
chat_attachment_degraded: {
chatId: string,
source?: string,
droppedImageCount: number,
modelProvider: string,
model: string,
},

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win

Make source required here or rename these to wa_*.

These events intentionally avoid the wa_ prefix, but the schema still makes source optional. That leaves call sites free to emit indistinguishable cross-surface events, and the new upload route already does so when the header is missing. As per coding guidelines, "The wa_ prefix is reserved for events that can ONLY be fired from the web app. Events fired from multiple sources ... must NOT use the wa_ prefix. Multi-source PostHog events should include a source property to identify the origin."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/web/src/lib/posthogEvents.ts` around lines 207 - 218, The PostHog
event schema for chat attachment events currently leaves source optional, which
allows indistinguishable cross-surface emissions from the upload flow. Update
the type definitions in posthogEvents.ts for chat_attachment_uploaded and
chat_attachment_degraded so source is required, or alternatively rename these
events to use the wa_ prefix if they are truly web-only; make sure the emitting
call sites match the chosen contract, especially the new upload route, and keep
the schema aligned with the intended event origin.

Source: Coding guidelines

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant