Skip to content

feat(native): offline caching#1585

Open
jpnurmi wants to merge 13 commits intomasterfrom
jpnurmi/feat/native-offline-caching
Open

feat(native): offline caching#1585
jpnurmi wants to merge 13 commits intomasterfrom
jpnurmi/feat/native-offline-caching

Conversation

@jpnurmi
Copy link
Collaborator

@jpnurmi jpnurmi commented Mar 19, 2026

Moves caching from startup to the HTTP transport to make it offline caching as it was intended to be (extracted from #1520). The native daemon uses the same HTTP transport, so it gets offline caching for free as long as the cache_keep option is propagated via shared memory.

As for the cache integration tests, it's no longer sufficient to re-run the app to trigger caching, but the tests must now generate failed send attempts.

See also:

@github-actions
Copy link

github-actions bot commented Mar 19, 2026

Messages
📖 Do not forget to update Sentry-docs with your feature once the pull request gets approved.

Generated by 🚫 dangerJS against 2a037d0

@jpnurmi jpnurmi force-pushed the jpnurmi/feat/native-offline-caching branch from 01bb465 to ed3bf6d Compare March 20, 2026 14:05
@jpnurmi jpnurmi changed the base branch from jpnurmi/feat/http-retry to master March 20, 2026 14:08
jpnurmi and others added 2 commits March 20, 2026 15:09
Previously, cache_keep unconditionally copied envelopes to the cache
directory on restart (during process_old_runs), regardless of whether
sending succeeded or failed. Now envelopes are cached only when the
HTTP send actually fails, fixing the behavior to match the intended
"offline caching" use case.

- Add cache_path and refcount to sentry_run_t
- Extract http_send_envelope() helper to get status_code
- Cache on failed send in http_send_task when cache_keep is enabled
- Add cleanup_func on transport for async cache pruning on bgworker
- Remove unconditional caching from process_old_runs
- Update integration tests to use HTTP transport with unreachable DSN

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The native daemon uses the same HTTP transport, so it gets offline
caching for free as long as the cache_keep option is propagated via
shared memory. Also, disable http_retry in the daemon since it is a
single-shot process where retry backoff makes no sense.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jpnurmi jpnurmi force-pushed the jpnurmi/feat/native-offline-caching branch from ed3bf6d to 81ab489 Compare March 20, 2026 14:10
@jpnurmi jpnurmi marked this pull request as ready for review March 20, 2026 14:39
Copy link
Collaborator

@supervacuus supervacuus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this to the transport sounds sensible (although I initially thought the cache proposal was just a general development cache, where all sent envelopes should accumulate for later local inspection).

The only slightly uncomfortable aspect, if I read this correctly, is that sentry__process_old_runs() now just removes old envelopes after enqueuing them for capture, whereas previously it sync-renamed them into the cache. That guarantee is now gone, since a crash between the removal in sentry_init() and the actual transfer into the cache in http_send_task() would drop the envelope. I mean, I know that we dump enqueued items, but the transport dequeue window is still real.

This can be a fair trade-off, but I think this is a noteworthy aspect of this change.

When shutdown times out, the on_timeout callback cancels the in-flight
HTTP request (via curl progress callback abort or WinHTTP handle
closure), unblocking the worker thread so it can finish remaining
tasks like cache cleanup. Fixes test_cache_max_items failing on
Windows where WinHTTP blocks for seconds on unreachable hosts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the boolean `running` flag with a tri-state `status` field
(STOPPED, RUNNING, STOPPING) so the worker stops after its current
task when a timeout occurs, leaving remaining tasks in the queue for
dump_queue to save to disk.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jpnurmi jpnurmi force-pushed the jpnurmi/feat/native-offline-caching branch from e4bf6ba to b8cdf5d Compare March 24, 2026 21:04
Revert the noisy STOPPED/RUNNING/STOPPING enum in favor of keeping
the original running flag, and adding a separate simple draining
flag. When on_timeout fires, draining is set so the worker stops
after its current task, leaving remaining tasks in the queue for
dump_queue.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Allow NULL callback in sentry__bgworker_foreach_matching to drop tasks
without invoking a callback. Use this to drop the cache cleanup task
when shutdown times out, preventing a refcount underflow caused by
sentry_options_free being called on an already-freeing options object.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

When the worker is stopped early by the draining flag, the cleanup
task stays in the queue unexecuted. Execute it synchronously via
foreach_matching before returning from http_transport_shutdown.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Drain the bgworker queue before triggering a crash to avoid malloc
deadlock between the bgworker and breakpad's minidump writer.

Call graph:
    872 Thread_6504096   DispatchQueue_1: com.apple.main-thread  (serial)
    + 872 start  (in dyld) + 7184  [0x18ca19d54]
    +   872 main  (in sentry_example) + 5652  [0x1002464a8]  example.c:961
    +     872 trigger_crash  (in sentry_example) + 32  [0x10024762c]  example.c:373
    +       872 _platform_memset  (in libsystem_platform.dylib) + 108  [0x18cded11c]
    872 Thread_6504097
    + 872 start_wqthread  (in libsystem_pthread.dylib) + 8  [0x18cddeb9c]
    +   872 _pthread_wqthread  (in libsystem_pthread.dylib) + 368  [0x18cddfe98]
    +     872 __workq_kernreturn  (in libsystem_kernel.dylib) + 8  [0x18cda29dc]
    872 Thread_6504098
    + 872 start_wqthread  (in libsystem_pthread.dylib) + 0  [0x18cddeb94]
    872 Thread_6504099: sentry-http
    + 872 thread_start  (in libsystem_pthread.dylib) + 8  [0x18cddeba8]
    +   872 _pthread_start  (in libsystem_pthread.dylib) + 136  [0x18cde3c08]
    +     872 worker_thread  (in libsentry.dylib) + 756  [0x1002e863c]  sentry_sync.c:295
    +       872 sentry__cleanup_cache  (in libsentry.dylib) + 128  [0x1002deb18]  sentry_database.c:392
    +         872 sentry__pathiter_next  (in libsentry.dylib) + 112  [0x1002f3f6c]  sentry_path_unix.c:403
    +           872 sentry__path_join_str  (in libsentry.dylib) + 420  [0x1002f3cbc]  sentry_path_unix.c:282
    +             872 sentry__path_from_str_owned  (in libsentry.dylib) + 24  [0x1002f26b8]  sentry_path.c:24
    +               872 mfm_alloc  (in libsystem_malloc.dylib) + 416  [0x18cbf7f90]
    872 Thread_6504100
      872 thread_start  (in libsystem_pthread.dylib) + 8  [0x18cddeba8]
        872 _pthread_start  (in libsystem_pthread.dylib) + 136  [0x18cde3c08]
          872 google_breakpad::ExceptionHandler::WaitForMessage(void*)  (in libsentry.dylib) + 264  [0x100301304]  exception_handler.cc:606
            872 google_breakpad::ExceptionHandler::WriteMinidumpWithException(int, int, int, __darwin_ucontext*, unsigned int, bool, bool)  (in libsentry.dylib) + 452  [0x100301190]  exception_handler.cc:434
              872 google_breakpad::MinidumpGenerator::Write(char const*)  (in libsentry.dylib) + 332  [0x1002fdad0]  minidump_generator.cc:326
                872 google_breakpad::MinidumpGenerator::WriteModuleListStream(MDRawDirectory*)  (in libsentry.dylib) + 280  [0x1002fe474]  minidump_generator.cc:1574
                  872 google_breakpad::MinidumpGenerator::WriteModuleStream(unsigned int, MDRawModule*)  (in libsentry.dylib) + 360  [0x1002ff580]  minidump_generator.cc:1451
                    872 google_breakpad::MinidumpGenerator::WriteCVRecord(MDRawModule*, int, char const*, bool)  (in libsentry.dylib) + 360  [0x1002ff8dc]  minidump_generator.cc:1528
                      872 google_breakpad::mach_o::FileID::MachoIdentifier(int, int, unsigned char*)  (in libsentry.dylib) + 88  [0x1002f81ec]  file_id.cc:69
                        872 operator new(unsigned long)  (in libc++abi.dylib) + 52  [0x18cd9ba78]
                          872 mfm_alloc  (in libsystem_malloc.dylib) + 108  [0x18cbf7e5c]
                            872 _os_unfair_lock_lock_slow  (in libsystem_platform.dylib) + 176  [0x18cdeb3e4]
                              872 __ulock_wait2  (in libsystem_kernel.dylib) + 8  [0x18cdaec74]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants