fix(perf): TCP connection pool sized for global chunk-worker count #66
Merged
Conversation
The puller can run up to fileParallelism × chunksPerFile chunk workers concurrently: each file has its own chunk semaphore, but all files share the BlobPuller's connection pool. Pre-v0.4.1 the pool was sized to just chunksPerFile (default 8), so with the default fileParallelism=4 we had 8 × 4 = 32 chunk workers competing for 8 sockets.

Symptom: the Performance modal's "Pool acquire wait" stat sat at p50 ~280 ms / p95 1.6 s, meaning a quarter of every chunk's wall clock was spent waiting for a free socket from a starved pool, not doing useful work. Reported by VirusAlex on a v0.4.0 live run.

Fix: poolSize = chunksPerFile × fileParallelism. This gives a 1:1 socket-per-worker ratio, so acquire() never blocks under steady-state load. The TCP server's MAX_CONCURRENT_CONNECTIONS=1024 cap is well above any sensible product (default 32; even pathological configs like 16 × 16 = 256 stay under it).

The HTTP path is unaffected: java.net.http.HttpClient manages its own connection pool internally on virtual threads, so it never had this contention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
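The sizing arithmetic above can be sketched as a small helper. This is an illustrative sketch, not the project's actual code: the method and parameter names (`poolSize`, `chunksPerFile`, `fileParallelism`) are taken from the description, but the class itself is hypothetical.

```java
// Illustrative sketch of the pool-sizing rule described in this PR.
// Names mirror the PR text; the class itself is not part of the project.
final class ConnectionPoolSizing {
    // The server-side cap mentioned in the PR description.
    static final int MAX_CONCURRENT_CONNECTIONS = 1024;

    // One socket per chunk worker: workers = chunksPerFile × fileParallelism.
    static int poolSize(int chunksPerFile, int fileParallelism) {
        return chunksPerFile * fileParallelism;
    }

    public static void main(String[] args) {
        int defaults = poolSize(8, 4);    // default config -> 32
        int worstCase = poolSize(16, 16); // pathological config -> 256
        System.out.println(defaults);     // prints 32
        System.out.println(worstCase);    // prints 256, still under the 1024 cap
    }
}
```

With the defaults this yields 32 sockets, matching the 32 chunk workers, so no worker ever waits on a starved pool.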
Test plan

- mvn compile: clean.
- mvn test -Dtest=ArchitectureTest: 8/8 pass.
- Pool acquire wait was p50 281 ms on v0.4.0; expect that stat to drop to <10 ms after this change.

🤖 Generated with Claude Code
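For context on what the "Pool acquire wait" stat measures, here is a minimal sketch of how such a wait could be sampled around a permit-based pool. The real puller's pool API is not shown in this PR, so `java.util.concurrent.Semaphore` stands in for it and the class name is hypothetical.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch of sampling an "acquire wait" metric like the one
// in the Performance modal; Semaphore is a stand-in for the real pool.
final class PoolAcquireTimer {
    private final Semaphore permits;

    PoolAcquireTimer(int poolSize) {
        this.permits = new Semaphore(poolSize);
    }

    // Blocks until a socket permit is free; returns the wait in nanoseconds.
    long acquireAndMeasureNanos() {
        long start = System.nanoTime();
        permits.acquireUninterruptibly();
        return System.nanoTime() - start;
    }

    void release() {
        permits.release();
    }
}
```

When poolSize equals the worker count, steady-state acquires find a free permit immediately, which is what the <10 ms expectation in the test plan corresponds to.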