fix(logging): Fix deadlock in log batcher#5684
Conversation
Semver Impact of This PR🟢 Patch (bug fixes) 📋 Changelog PreviewThis is how your changes will appear in the changelog. New Features ✨Anthropic
Pydantic Ai
Other
Bug Fixes 🐛
Documentation 📚
Internal Changes 🔧Anthropic
Docs
Openai Agents
Other
🤖 This preview updates automatically when you update the PR. |
Codecov Results 📊✅ 134 passed | Total: 134 | Pass Rate: 100% | Execution Time: 20.60s All tests are passing successfully. ❌ Patch coverage is 2.56%. Project has 13572 uncovered lines. Files with missing lines (2)
Generated by Codecov Action |
Codecov Results 📊✅ 52 passed | Total: 52 | Pass Rate: 100% | Execution Time: 6.78s All tests are passing successfully. ❌ Patch coverage is 5.56%. Project has 15300 uncovered lines. Files with missing lines (1)
Generated by Codecov Action |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: flush() unconditionally clears re-entry guard flag
- Batcher.flush() now preserves and restores the thread-local re-entry flag instead of unconditionally clearing it, and a regression test covers this case.
Or push these changes by commenting:
@cursor push 4fe16a2160
Preview (4fe16a2160)
diff --git a/sentry_sdk/_batcher.py b/sentry_sdk/_batcher.py
--- a/sentry_sdk/_batcher.py
+++ b/sentry_sdk/_batcher.py
@@ -115,11 +115,12 @@
self._flusher = None
def flush(self) -> None:
+ old_flag = getattr(self._active, "flag", False)
self._active.flag = True
try:
self._flush()
finally:
- self._active.flag = False
+ self._active.flag = old_flag
def _add_to_envelope(self, envelope: "Envelope") -> None:
envelope.add_item(
diff --git a/tests/test_batcher.py b/tests/test_batcher.py
new file mode 100644
--- /dev/null
+++ b/tests/test_batcher.py
@@ -1,0 +1,28 @@
+from sentry_sdk._batcher import Batcher
+
+
+class DummyBatcher(Batcher[int]):
+ def __init__(self) -> None:
+ super().__init__(
+ capture_func=lambda envelope: None,
+ record_lost_func=lambda *args, **kwargs: None,
+ )
+ self.flush_calls = 0
+
+ def _flush(self):
+ self.flush_calls += 1
+ return None
+
+ @staticmethod
+ def _to_transport_format(item):
+ return item
+
+
+def test_flush_restores_existing_reentry_guard():
+ batcher = DummyBatcher()
+ batcher._active.flag = True
+
+ batcher.flush()
+
+ assert batcher._active.flag is True
+ assert batcher.flush_calls == 1This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.
|
|
||
|
|
||
| @minimum_python_37 | ||
| @pytest.mark.timeout(5) |
There was a problem hiding this comment.
This will actually make the test fail after 5s if it gets deadlocked. Tested on master.


Description
In certain scenarios, the SDK's log batcher might cause a deadlock. This happens if it's currently flushing, and during the flush, something emits a log that we try to capture and add to the (locked) batcher.
With this PR, we're adding a re-entry guard to the batcher, preventing it from recursively handling log items during locked code paths like
flush().Issues
Closes #5681
Reminders
tox -e linters.feat:,fix:,ref:,meta:)