Interactive seg and stereo docs #1726
Merged
Brings in the SAM2/SAM3-based interactive segmentation feature, the SAM3 text-query workflow, and the desktop interactive stereo mode. Web-girder paths are intentionally untouched for now; web support will come in a follow-up.
- New segmentation point-click recipe + EditorMenu wiring; SAM2/SAM3 models loaded via VIAME install configs.
- Desktop backend: viame_segmentation_service-backed IPC handlers and matching frontend API for segmentationInitialize/Predict/SetImage/ClearImage/Shutdown/IsReady, textQuery/refineDetections/runTextQueryPipeline, and stereoEnable/Disable/SetFrame/GetStatus/TransferLine/TransferPoints/SetCalibration/IsEnabled, plus disparity ready/error event hooks.
- EditAnnotationLayer: track shift-key state and right-click for Point mode, propagate background flag for negative SAM points.
- Sidebar / ViewerLoader / Viewer: stereo annotation mode UI, error dialog when seg or text-query model fails to load, dot-only-on-source-frame fix.
- useModeManager / EditAnnotationLayer / recipes: keep existing geometry type when current editing mode already matches; right-click in Point creation finalises and deselects.
A track-frame's polygon now expands to a list of polygons, each with its own key, and each polygon supports holes.
- Server CSV (de)serializer: emit polygon-key column per polygon, support holes in the geoJSON FeatureCollection; auto_key path to append a new polygon to an existing track frame.
- Client recipes / useModeManager: handleAddHole / handleAddPolygon / handleCancelCreation; PolygonLayer emits polygon-clicked.
- Hole drawing reuses the polygon edit pipeline (left-click places a hole vertex without exiting creation mode).
- Test fixtures cover multi-polygon and polygons-with-holes round-trip.
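As a rough illustration of the data shape described above, the sketch below builds a GeoJSON FeatureCollection in which one track frame carries two keyed polygons, one of them with a hole. The type names, the `makePolygon` / `addHole` helpers, and the `'body'` / `'fin'` keys are hypothetical and are not taken from the PR; the only standard fact relied on is that a GeoJSON Polygon lists its exterior ring first and any holes as the following rings.

```typescript
// Hypothetical shapes for illustration; not the project's actual types.
type Position = [number, number];
type Ring = Position[];

interface PolygonFeature {
  type: 'Feature';
  properties: { key: string };
  geometry: { type: 'Polygon'; coordinates: Ring[] };
}

interface PolygonCollection {
  type: 'FeatureCollection';
  features: PolygonFeature[];
}

// GeoJSON linear rings must end on their starting position.
function closeRing(ring: Ring): Ring {
  const first = ring[0];
  const last = ring[ring.length - 1];
  if (!first || !last) return ring;
  if (first[0] === last[0] && first[1] === last[1]) return ring;
  const closing: Position = [first[0], first[1]];
  return [...ring, closing];
}

// First ring is the exterior; every later ring is a hole.
function makePolygon(key: string, exterior: Ring, holes: Ring[] = []): PolygonFeature {
  return {
    type: 'Feature',
    properties: { key },
    geometry: { type: 'Polygon', coordinates: [exterior, ...holes].map(closeRing) },
  };
}

// Returns a new feature with one more hole; the input is left unmodified.
function addHole(feature: PolygonFeature, hole: Ring): PolygonFeature {
  return {
    ...feature,
    geometry: {
      type: 'Polygon',
      coordinates: [...feature.geometry.coordinates, closeRing(hole)],
    },
  };
}

const body = makePolygon('body', [[0, 0], [10, 0], [10, 10], [0, 10]]);
const frame: PolygonCollection = {
  type: 'FeatureCollection',
  features: [
    addHole(body, [[4, 4], [6, 4], [6, 6], [4, 6]]),
    makePolygon('fin', [[12, 0], [15, 0], [15, 3]]),
  ],
};
```

Keeping the key in `properties` lets each polygon round-trip independently, which is the property the per-polygon key column on the server side appears to rely on.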
(cherry picked from commit c2f3cd0)
(cherry picked from commit 3db1995)
(cherry picked from commit b7d4fa3)
(cherry picked from commit f5d015a)
…tton) Strip the SAM3 text-query button, dialog, API, and IPC handlers from the interactive editor, keeping segmentation and stereo intact. The full text-query feature lives on the follow-up branch dev/text-query-annot-button.
Strip the no-transcode NativeVideoAnnotator path that should not be on the segmentation/stereo branch: removes the residual Viewer.vue async-component, nativeVideoPath plumbing, and template branch, plus the stale settings field.
The rebase onto main dropped the opening <template> tag, so vite parsed the root <div> as a custom block and the electron build failed. Restore it.
In continuous Detection mode, each interactive-segmentation point click now finalizes its own detection and immediately starts a fresh one, instead of refining a single detection. Non-continuous mode is unchanged: clicks still accumulate to refine one detection until confirmed. Frame-navigation preview restores are excluded so they don't spuriously create detections.
Capture the completed track ID before newTrackSettingsAfterLogic, which in continuous detection mode spawns a new track and changes selectedTrackId, so stereo annotation-complete events attach to the correct detection. Also re-activate the segmentation recipe when re-editing a finalized Point detection so clicks resume predicting. Ported from viame/master (2cad9aa, 4008a1f).
Add a _removeIfEmpty helper and call it from selectTrack when leaving edit mode, so a detection created but never drawn (e.g. clicked away or right-click deselect) doesn't linger as an empty track. handleEscapeMode now reuses the same helper. Ported from viame/master (4139f78).
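A minimal sketch of the empty-track cleanup idea follows. The `Track` and `Detection` shapes and the `Map`-based store are assumptions made for illustration; the real `_removeIfEmpty` helper operates on the application's own track store and may check emptiness differently.

```typescript
// Hypothetical shapes; the real track store is not shown in this PR text.
interface Detection {
  hasGeometry: boolean;
}

interface Track {
  id: number;
  detections: Detection[];
}

// Deletes the track when none of its detections carries geometry.
// Returns true if the track was removed.
function removeIfEmpty(tracks: Map<number, Track>, trackId: number | null): boolean {
  if (trackId === null) return false;
  const track = tracks.get(trackId);
  if (!track) return false;
  const empty = track.detections.every((d) => !d.hasGeometry);
  if (empty) tracks.delete(trackId);
  return empty;
}

const store = new Map<number, Track>([
  [1, { id: 1, detections: [{ hasGeometry: true }] }],
  [2, { id: 2, detections: [{ hasGeometry: false }] }], // created but never drawn
]);

// Called when leaving edit mode or on escape: only track 2 is dropped.
const removedDrawn = removeIfEmpty(store, 1);
const removedEmpty = removeIfEmpty(store, 2);
```

Routing both the deselect path and the escape path through one helper keeps the definition of "empty" in a single place.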
handleConfirmRecipe now returns early unless an active segmentation recipe has a pending prediction or was explicitly reset, so the contextmenu event from a right-click that enters Point edit mode no longer immediately deselects/deactivates before any points are placed. Adds a wasReset flag on the recipe to allow finalizing after a reset. Ported from viame/master (c4d149c, 69bc0f8).
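The early-return condition can be sketched as a small predicate. Only `wasReset` is named in the commit message; the `active` and `hasPendingPrediction` fields and the `shouldConfirm` function are hypothetical names for this illustration.

```typescript
// Hypothetical recipe state; only wasReset is named in the commit message.
interface SegmentationRecipeState {
  active: boolean;
  hasPendingPrediction: boolean;
  wasReset: boolean;
}

// Confirm only when an active recipe has something to finalize.
// A stray contextmenu event with no points placed falls through as false.
function shouldConfirm(recipe: SegmentationRecipeState): boolean {
  if (!recipe.active) return false;
  return recipe.hasPendingPrediction || recipe.wasReset;
}

const justEntered: SegmentationRecipeState = {
  active: true,
  hasPendingPrediction: false,
  wasReset: false,
};
const afterReset: SegmentationRecipeState = { ...justEntered, wasReset: true };
```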
Wire the edit layer's finalizeInProgress through a handler callback that handleAddTrackOrDetection invokes, so pressing 'n' or starting a new detection commits a valid in-progress polygon (or discards it) instead of leaving it dangling. Also commit a pending segmentation prediction before the track switch so a reset-on-deselect doesn't leave an empty detection. Ported from viame/master (842a20c, dea9653).
In continuous mode a background (negative) click is a refinement of the current mask, not a new object, so it should no longer commit and start a fresh detection.
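Taken together with the earlier continuous-mode change, the rule for when a point click commits the current detection and starts a new one can be sketched as below. The `PointClick` shape and the `commitsAfterClick` name are hypothetical; the three conditions are the ones stated in the commit messages.

```typescript
// Hypothetical description of a single interactive-segmentation click.
interface PointClick {
  background: boolean;     // negative point: refines the current mask
  previewRestore: boolean; // replayed by frame navigation, not by the user
}

// True when the click should finalize the detection and start a fresh one.
function commitsAfterClick(continuous: boolean, click: PointClick): boolean {
  if (!continuous) return false;          // clicks accumulate until confirmed
  if (click.previewRestore) return false; // never create detections on restore
  if (click.background) return false;     // refinement, not a new object
  return true;
}

const foreground: PointClick = { background: false, previewRestore: false };
const negative: PointClick = { background: true, previewRestore: false };
```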
The reset button only restored a default-key ('') polygon, so resetting a
detection whose existing polygon was segmentation-keyed removed it
entirely. Capture the pre-existing polygon keys in the snapshot and
restore all original polygon geometry, removing only segmentation-added
polygons.
Ported from viame/master.
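One way to picture the snapshot-and-restore behavior is the sketch below. Everything here is an assumption for illustration: polygons are modeled as a `Map` from key to a flat ring, the recipe is assumed to record which keys it added, and the `'seg'` / `'seg-2'` keys are invented.

```typescript
// Hypothetical model: polygon key -> ring of [x, y] points.
type Polygon = number[][];
type PolygonsByKey = Map<string, Polygon>;

function clonePolygons(src: PolygonsByKey): PolygonsByKey {
  const out: PolygonsByKey = new Map();
  src.forEach((poly, key) => out.set(key, poly.map((pt) => [...pt])));
  return out;
}

// Capture every pre-existing polygon, whatever its key.
function takeSnapshot(current: PolygonsByKey): PolygonsByKey {
  return clonePolygons(current);
}

// Restore all original geometry; drop only polygons the recipe added.
function resetToSnapshot(
  current: PolygonsByKey,
  snapshot: PolygonsByKey,
  addedBySegmentation: Set<string>,
): PolygonsByKey {
  const out = clonePolygons(snapshot);
  current.forEach((poly, key) => {
    if (!out.has(key) && !addedBySegmentation.has(key)) {
      out.set(key, poly.map((pt) => [...pt]));
    }
  });
  return out;
}

// An existing polygon under a non-default key survives the reset.
const original: PolygonsByKey = new Map([['seg', [[0, 0], [4, 0], [4, 4]]]]);
const snapshot = takeSnapshot(original);
const edited: PolygonsByKey = new Map([
  ['seg', [[0, 0], [9, 0], [9, 9]]],   // overwritten by a new prediction
  ['seg-2', [[1, 1], [2, 1], [2, 2]]], // added by the recipe
]);
const restored = resetToSnapshot(edited, snapshot, new Set(['seg-2']));
```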
ViewerLoader already builds a getFrameTime (frame/fps) and the backend already forwards frame_time to the service, but the recipe ignored it and never set frameTime on the predict request, so interactive segmentation on video datasets couldn't seek to the current frame. Accept getFrameTime in the recipe and include frameTime in the request. Ported from viame/master (23ccb25, c363531).
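The fix amounts to threading an optional time lookup into the request builder. In the sketch below, `getFrameTime` and `frameTime` are the names used in the commit message; the `PredictRequest` shape, its other fields, and the two helper functions are hypothetical.

```typescript
// Hypothetical request shape; only frameTime is named in the commit message.
interface PredictRequest {
  points: [number, number][];
  pointLabels: number[];
  frameTime?: number; // seconds into the video; absent for image sequences
}

type GetFrameTime = (frame: number) => number;

// Mirrors the frame/fps conversion described for ViewerLoader.
function makeGetFrameTime(fps: number): GetFrameTime {
  return (frame) => frame / fps;
}

function buildPredictRequest(
  points: [number, number][],
  pointLabels: number[],
  frame: number,
  getFrameTime?: GetFrameTime,
): PredictRequest {
  const request: PredictRequest = { points, pointLabels };
  if (getFrameTime) request.frameTime = getFrameTime(frame);
  return request;
}

const videoRequest = buildPredictRequest([[120, 80]], [1], 45, makeGetFrameTime(30));
const imageRequest = buildPredictRequest([[120, 80]], [1], 45);
```

Leaving `frameTime` unset when no lookup is supplied keeps image-sequence datasets on their existing path.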
- Await set_frame (ensureStereoFrame) before transfer in the draw handler,
so drawing on the frame stereo was enabled on no longer stalls in the
backend's 120s deferred-disparity wait. Factor the duplicated set-frame
logic (enable kickoff + frame watcher) into the one helper.
- Use renderer-safe path helpers instead of npath.* (node 'path' is
externalized under contextIsolation -> "npath is not defined").
- Declare the missing stereoCameraFps ref ("stereoCameraFps is not defined").
Port the warped-line fixes from viame/master so the line transferred to the second camera is a normal line-mode-editable annotation:
- Preserve the source line's key through the transfer (key: params.key instead of '') and thread it through StereoAnnotationCompleteParams.
- Emit head/tail Point markers alongside the LineString so endpoint handles render and can be dragged.
- Expand the warped bounds by 10% to match the source side (headtail.ts).
- Preserve editing mode when left-clicking onto a camera that already has the selected track (the warped annotation), so it can be adjusted immediately.
In interactive stereo, only the user may modify a line a human authored:
- Mark a camera's line human-authored when the user draws/edits it (the stereo warp writes geometry directly and never fires this event, so the event firing always means a human edit).
- Warp source -> other only when the other side is absent or still machine-generated. Once the other side has been hand-edited it is frozen; further edits on the first camera no longer overwrite it (and vice versa).
- Length keeps tracking the shifting geometry, except when length_method is 'user_set' (a new detection attribute the user can set to lock a length); the stereo update then leaves that length alone while still refreshing range/midpoint. Auto-computed lengths record length_method = 'stereo'.
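The two rules above (freeze hand-edited lines, respect a user-locked length) can be sketched as follows. The `length_method` attribute and its `'user_set'` / `'stereo'` values come from the commit message; the `Authorship` type, the `range` attribute name, and both function names are hypothetical.

```typescript
// Hypothetical state of the line on the other camera.
type Authorship = 'absent' | 'machine' | 'human';

// Warp onto the other camera unless a human has edited the line there.
function shouldWarpToOther(other: Authorship): boolean {
  return other !== 'human';
}

// Hypothetical attribute bag; length_method is the attribute named in the PR.
interface MeasurementAttrs {
  length?: number;
  length_method?: string;
  range?: number;
}

// Refresh stereo measurements, leaving a user-locked length untouched.
function applyStereoMeasurement(
  attrs: MeasurementAttrs,
  measured: { length: number; range: number },
): MeasurementAttrs {
  if (attrs.length_method === 'user_set') {
    return { ...attrs, range: measured.range };
  }
  return { ...attrs, length: measured.length, range: measured.range, length_method: 'stereo' };
}

const locked = applyStereoMeasurement(
  { length: 42, length_method: 'user_set', range: 3 },
  { length: 40.5, range: 3.2 },
);
const auto = applyStereoMeasurement({}, { length: 40.5, range: 3.2 });
```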
A near-horizontal/vertical line otherwise produced a razor-thin box. After the usual 10% expansion, grow the shorter side about its center until the longer:shorter ratio is at most 6:1. Applied to both the drawn box (headtail.ts tightBoundsExpanded) and the stereo-warped box (ViewerLoader).
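A minimal sketch of the box computation, under stated assumptions: bounds are `[x1, y1, x2, y2]`, and the 10% expansion is taken to mean padding each side by 10% of the box's width or height. The real `tightBoundsExpanded` in headtail.ts may define the expansion differently, and the function names here are hypothetical.

```typescript
type Bounds = [number, number, number, number]; // [x1, y1, x2, y2]

// Assumed meaning of "10% expansion": pad each side by 10% of that dimension.
function expandBounds(b: Bounds, fraction = 0.1): Bounds {
  const [x1, y1, x2, y2] = b;
  const padX = (x2 - x1) * fraction;
  const padY = (y2 - y1) * fraction;
  return [x1 - padX, y1 - padY, x2 + padX, y2 + padY];
}

// Grow the shorter side about its center until longer:shorter <= maxRatio.
function capAspectRatio(b: Bounds, maxRatio = 6): Bounds {
  const [x1, y1, x2, y2] = b;
  const width = x2 - x1;
  const height = y2 - y1;
  const minShort = Math.max(width, height) / maxRatio;
  if (width < minShort) {
    const cx = (x1 + x2) / 2;
    return [cx - minShort / 2, y1, cx + minShort / 2, y2];
  }
  if (height < minShort) {
    const cy = (y1 + y2) / 2;
    return [x1, cy - minShort / 2, x2, cy + minShort / 2];
  }
  return b;
}

// A perfectly horizontal line would otherwise give a zero-height box.
const lineBox: Bounds = [0, 5, 60, 5];
const drawnBox = capAspectRatio(expandBounds(lineBox));
```

Applying the cap after the expansion means the minimum thickness scales with the already-padded length, so the same helper can serve both the drawn box and the stereo-warped box.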
Clicking a detection in a camera that isn't selected used to be ignored (LayerManager Clicked early-returned), so it took one click to switch cameras and another to act on the detection. Now that click switches to the clicked camera and acts on the detection in the same click: left-click selects it, right-click edits it. Select-then-edit keeps the result deterministic, and it bails if a mode (e.g. linking) blocked the switch.
When creating a new detection, the creation cursor is now live on every camera (not just the selected one), and a draw is routed to whichever camera it lands on (switch + materialize the new track there). Works for all creation types.
- LayerManager: enable the edit layer in creation mode on non-selected cameras (isCreatingNewDetection); route the drawn shape to the drawn-on camera in the update:geojson handler; suppress the select that rides the click which finalizes a shape, and the first-corner click in the cross-camera branch, so an overlapping detection isn't grabbed instead.
- Viewer: don't intercept camera-view mousedown mid-creation (it would preventDefault the rectangle drag); let the draw land and route.

Known limitation: a line's 2nd vertex landing inside an existing detection still selects that detection (event-ordering quirk); accepted for now.
applyStereoLine stored the stereo/segmentation-created LineString under an
empty key, so it wasn't recognized by the HeadTail recipe and edit-layer like
hand-drawn lines. Store it under HeadTailLineKey ('HeadTails') instead, matching
hand-drawn head/tail lines.
(The controller-init guard from the source commit is already covered here by the
getViewerFrame() helper, so only the keying fix is ported.)
Replace the single 'Interactive Mode' stereo toggle with two independent Stereo Settings controls: 'Update lengths when modified' (on by default) recomputes the stereo measurement when a linked line is modified, and 'Auto-compute location on other camera' (off by default) warps an annotation to the other camera when it has no detection there yet. The backend service starts whenever either feature is on, and the load-time auto-enable degrades silently on failure.
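The relationship between the two settings and the backend service can be sketched as below. The defaults follow the commit message; the `StereoSettings` field names, `serviceShouldRun`, and `autoEnableOnLoad` are hypothetical, and `enable` stands in for whatever call actually starts the stereo service.

```typescript
// Hypothetical setting names for the two controls described above.
interface StereoSettings {
  updateLengthsWhenModified: boolean;
  autoComputeOtherCamera: boolean;
}

const defaultStereoSettings: StereoSettings = {
  updateLengthsWhenModified: true, // on by default
  autoComputeOtherCamera: false,   // off by default
};

// The backend service is needed whenever either feature is on.
function serviceShouldRun(settings: StereoSettings): boolean {
  return settings.updateLengthsWhenModified || settings.autoComputeOtherCamera;
}

// Load-time auto-enable: a failure is swallowed instead of surfacing a dialog.
async function autoEnableOnLoad(
  settings: StereoSettings,
  enable: () => Promise<void>,
): Promise<boolean> {
  if (!serviceShouldRun(settings)) return false;
  try {
    await enable();
    return true;
  } catch {
    return false; // degrade silently
  }
}

const bothOff: StereoSettings = {
  updateLengthsWhenModified: false,
  autoComputeOtherCamera: false,
};
```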
…tion functionality
Docs update I forgot to add to #1582