
Bee node stuck unhealthy due to storage radius mismatch (10 vs neighborhood 9), redistribution never resumes #5428

@ablehobo

Description

Context

  • Bee version: 2.7.1-rc2
  • Running in Docker
  • Full node (full-node: true)
  • Mainnet enabled
  • Swap + storage incentives enabled
  • Stable RPC backend (HAProxy in front of multiple Gnosis RPC providers)
  • Node has been running continuously for several days

Summary

Node enters a persistent state where:

  • isFullySynced = true
  • hasSufficientFunds = true
  • isFrozen = false

but:

  • isHealthy = false
  • node does not participate in redistribution (lastPlayedRound remains unchanged)

Logs indicate a storage radius discrepancy:

node is unhealthy due to storage radius discrepancy
self_radius=10 network_radius=9

The condition persists indefinitely and does not self-correct over time.
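The health gate described above can be sketched as a predicate. This is a hypothetical model of the behavior reported here, not Bee's actual source: the node reports isHealthy only while its own storage radius matches the radius it observes in the network, even when every other precondition holds.

```python
# Hypothetical sketch of the health condition as described in this report
# (not Bee's actual implementation).
from dataclasses import dataclass

@dataclass
class NodeState:
    is_fully_synced: bool
    has_sufficient_funds: bool
    is_frozen: bool
    self_radius: int
    network_radius: int

def is_healthy(state: NodeState) -> bool:
    # All other preconditions can be satisfied while the radius check
    # still fails -- which is exactly the stuck state reported here.
    return (
        state.is_fully_synced
        and state.has_sufficient_funds
        and not state.is_frozen
        and state.self_radius == state.network_radius
    )

# The reported state: everything true except the radius match.
stuck = NodeState(True, True, False, self_radius=10, network_radius=9)
print(is_healthy(stuck))  # → False
```

Under this model, no amount of waiting flips the outcome unless self_radius converges to network_radius, which matches the observation that the state never self-corrects.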


Expected behavior

Once the node is:

  • fully synced
  • well connected
  • sufficiently funded
  • not frozen

…it should eventually:

  • align its storage radius with the neighborhood
  • transition to isHealthy = true
  • resume participation in redistribution rounds

Actual behavior

The node remains stuck in the following state:

  • isFullySynced = true
  • hasSufficientFunds = true
  • isFrozen = false
  • isHealthy = false

Redistribution does not resume:

{
  "lastPlayedRound": 283212,
  "round": 2995xx
}

State remains unchanged for multiple days.


Steps to reproduce

Not fully deterministic, but observed sequence:

  1. Node operates normally and participates in redistribution
  2. Node experiences RPC instability (temporary backend connectivity issues)
  3. After RPC is stabilized, node resumes syncing and connectivity
  4. Node reaches:
    • isFullySynced = true
    • healthy peer count (~120+)
  5. Node enters persistent state:
    • storageRadius = 10
    • neighborhood peers predominantly at storageRadius = 9
    • isHealthy = false
    • redistribution no longer progresses

Restarting the node:

  • temporarily sets isHealthy = true while isFullySynced = false
  • once fully synced, state reverts to unhealthy

Possible solution

Unknown.

From observation, the node does not automatically converge its storageRadius to the neighborhood consensus once the mismatch occurs.

Possible areas to investigate:

  • radius convergence logic after sync recovery
  • reserve state recalculation after RPC disruption
  • conditions under which storageRadius is allowed to decrease

Additional diagnostics

/status

{
  "storageRadius": 10,
  "connectedPeers": 135,
  "isReachable": true
}

/reservestate

{
  "radius": 11,
  "storageRadius": 10
}

/redistributionstate

{
  "isHealthy": false,
  "isFullySynced": true,
  "lastPlayedRound": 283212
}

Peer observation

Nearby peers (via /status/peers) are predominantly:

storageRadius: 9
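The peer check above can be automated. A hypothetical diagnostic (field names taken from the /status and /status/peers excerpts in this report; not an official Bee tool) tallies the storageRadius reported by nearby peers and compares the most common value against the node's own radius:

```python
# Hypothetical diagnostic helper -- field names assumed from the
# /status and /status/peers excerpts above, not an official Bee tool.
from collections import Counter

def neighborhood_radius(peers: list[dict]) -> int:
    """Return the storageRadius value most peers report (the mode)."""
    counts = Counter(p["storageRadius"] for p in peers)
    return counts.most_common(1)[0][0]

# Sample shaped like the observation in this report: self at 10,
# neighborhood predominantly at 9.
self_status = {"storageRadius": 10}
peers = [{"storageRadius": 9}] * 117 + [{"storageRadius": 10}] * 18

mode = neighborhood_radius(peers)
print(f"self={self_status['storageRadius']} "
      f"neighborhood={mode} "
      f"match={self_status['storageRadius'] == mode}")
# → self=10 neighborhood=9 match=False
```

Run periodically, this would surface the divergence immediately after sync recovery rather than only once redistribution rounds are visibly missed.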


AI Disclosure

  • This issue contains suggestions and text generated by an LLM.
  • I have reviewed the AI generated content thoroughly.
  • I possess the technical expertise to responsibly review the AI generated content mentioned in this issue.
