Fix path-traversal vulnerability in emergency P2P checkpoint service by YuvalElbar6 · Pull Request #3105 · google/orbax

YuvalElbar6 · 2026-04-17T05:47:53Z

A malicious or compromised peer on the P2P network could supply a manifest whose rel_path contained '..' segments or an absolute path, causing P2PNode.fetch_shard_from_peer() to write attacker-controlled bytes outside the staging directory (e.g. a .pth file in site-packages, yielding persistent RCE on the training host).

Add _safe_path_join() which joins a peer-supplied relative path onto a base directory only if the resolved result stays inside that base. Resolution goes through os.path.realpath so symlink-escape attempts are caught as well.
Apply the helper on both sides of the wire:
- Client: fetch_shard_from_peer() validates every manifest entry against stage_dir and aborts the whole fetch on any unsafe entry.
- Server: handle_download() replaces the substring '..' check with the same resolve-based containment check against self.directory.
Log every rejection with peer and request context.
Add regression tests for the helper and both call sites.
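
For readers without the diff at hand, the containment check described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the code from this PR: the names `safe_path_join` and `validate_manifest`, the `ValueError` on rejection, and the manifest shape (a list of dicts with a `rel_path` key) are all hypothetical.

```python
import os


def safe_path_join(base_dir: str, rel_path: str) -> str:
    """Join rel_path onto base_dir; raise ValueError if the result escapes it.

    Hypothetical stand-in for the _safe_path_join() helper described above.
    """
    if os.path.isabs(rel_path):
        raise ValueError(f"absolute path rejected: {rel_path!r}")
    # realpath collapses '..' segments and follows symlinks, so both kinds
    # of escape are visible in the resolved path.
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, rel_path))
    # commonpath compares whole path components, so '/stage_evil' is not
    # mistaken for a child of '/stage' the way a string-prefix test would be.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"path escapes {base!r}: {rel_path!r}")
    return candidate


def validate_manifest(stage_dir: str, manifest: list) -> list:
    """Resolve every entry up front; one unsafe entry rejects the whole manifest.

    Assumes each entry is a dict with a 'rel_path' key (hypothetical shape).
    """
    return [safe_path_join(stage_dir, entry["rel_path"]) for entry in manifest]
```

Resolving all entries before any download starts is what lets a single unsafe entry abort the whole fetch. A substring check for '..' would miss an absolute path such as `/etc/passwd` and would wrongly reject a harmless name such as `a..b`, which is the usual argument for a resolve-based check.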

Reported via the Google OSS VRP.

linxiulei · 2026-05-06T18:09:12Z

Hi, thanks for your contribution!

Since Orbax runs on top of JAX, and JAX's coordination service doesn't enforce strong authentication, the Orbax runtime generally assumes it's operating in a secure network environment. In practice, if there were a "compromised peer," we would already be facing much larger systemic risks.

That said, we always welcome security enhancements. Could you please clean up the empty commits in the PR so we can get it merged?

YuvalElbar6 · 2026-05-06T18:17:54Z

Thanks for the context on the threat model — agreed it's a defense-in-depth hardening rather than a critical fix. I've cleaned up the merge commits; the PR is now a single commit on top of main.
@linxiulei

YuvalElbar6 · 2026-05-07T03:23:22Z

Hi @orbax-dev, @linxiulei, this PR is now a single commit. Can we merge it?

orbax-dev · 2026-05-11T19:40:28Z

Hey @YuvalElbar6, we don't have external contributions properly set up, so your commit would just get overwritten next time we sync from internal. Waiting for @linxiulei to send me an internal version of this change, at which point we can merge this PR.

linxiulei

These review comments are informational. If you have no objections, the suggested changes will be included in the final merge.

linxiulei · 2026-05-12T15:30:10Z

+    self.enter_context(
+        mock.patch.object(service, '_ThreadingTCPServer', autospec=True)
+    )
+    server = service._ThreadingTCPServer.return_value


    mock_server_cls = self.enter_context(
        mock.patch.object(service, '_ThreadingTCPServer', autospec=True)
    )
    server = mock_server_cls.return_value

linxiulei · 2026-05-12T15:30:55Z

+
+    A malicious peer returns a manifest whose ``rel_path`` tries to escape the
+    staging directory (e.g., writing a ``.pth`` file into site-packages).
+    ``fetch_shard_from_peer`` must abort before any ``download`` call.


    Args:
      mock_download: Mock for download.
      mock_request: Mock for request.
      unused_mock_time: Unused mock for time.
      unused_mock_move: Unused mock for move.
      unused_mock_rmtree: Unused mock for rmtree.

to be consistent with code style.

linxiulei · 2026-05-12T15:31:08Z

+      mock_request,
+      unused_mock_time,
+      unused_mock_move,
+      unused_mock_rmtree,


YuvalElbar6 · 2026-05-12T16:21:09Z

Yes, I have no objections.
I will fix it when I get home.

YuvalElbar6 · 2026-05-12T17:47:44Z

Hi @linxiulei, I fixed the comments.

YuvalElbar6 mentioned this pull request Apr 17, 2026

Path traversal in emergency P2P checkpoint service allows arbitrary file write on training host #3106

Open

YuvalElbar6 force-pushed the main branch from ac4c521 to 734254a Compare May 6, 2026 18:19

YuvalElbar6 force-pushed the main branch 2 times, most recently from 862ab75 to 3a54a83 Compare May 11, 2026 19:22

linxiulei reviewed May 12, 2026

YuvalElbar6 force-pushed the main branch from 3a54a83 to c42d917 Compare May 12, 2026 17:43

YuvalElbar6 requested a review from linxiulei May 12, 2026 17:49

copybara-service Bot closed this in 0a8619c May 12, 2026

YuvalElbar6 wants to merge 1 commit into google:main from YuvalElbar6:main
