Skip to content

Litep2p clean#574

Open
illuzen wants to merge 23 commits into
mainfrom
illuzen/pq-litep2p-clean
Open

Litep2p clean#574
illuzen wants to merge 23 commits into
mainfrom
illuzen/pq-litep2p-clean

Conversation

@illuzen
Copy link
Copy Markdown
Contributor

@illuzen illuzen commented May 30, 2026

  • vendors the litep2p code
  • switches the networking stack over to litep2p exclusively
  • adds post-quantum support to litep2p
  • turns off libp2p

The litep2p code was first copied from the official source, then modified to add dilithium and kyber support, then elliptic curve support was removed.

The idea behind litep2p and our adoption of it is to reduce the attack surface in the networking stack.

The node connects to testnet and syncs without issues because the wire protocol is identical to the libp2p stack, but we should probably run a few more tests to be sure.


Note

High Risk
Replaces the entire Substrate networking stack and changes peer authentication/encryption (Dilithium, pqXX Noise, PQ TLS); interoperability and security regressions need thorough testnet and adversarial validation.

Overview
This PR replaces libp2p with a vendored litep2p stack as the only P2P backend: workspace and lockfile drop libp2p / patched identity-noise crates, add a new client/litep2p crate (patched via [patch.crates-io]), and wire sc-network through Litep2pNetworkBackend while removing the old libp2p swarm/transport modules.

Node identity and handshakes move to post-quantum crypto: Dilithium ML-DSA-87 for local keys and PeerId, pqXX Noise with ML-KEM 768 (clatter), and QUIC/TLS via rustls-post-quantum / aws-lc-rs with libp2p-style certificate extensions. CLI defaults and generate-node-key / inspect-node-key now use litep2p types; --network-backend defaults to Litep2p (libp2p option removed).

Smaller lockfile/pallet dependency trims are secondary to the networking swap.

Reviewed by Cursor Bugbot for commit bc0f44a. Bugbot is set up for automated code reviews on this repo. Configure here.

illuzen added 15 commits May 30, 2026 22:59
Removed:
- qp-wormhole-verifier, sp-keyring from runtime
- codec from sp-consensus-qpow
- qp-poseidon, sp-io from qp-dilithium-crypto
- qp-poseidon from qp-wormhole
- log from pallet-multisig, pallet-zk-tree
- sp-weights from pallet-scheduler
- sp-metadata-ir from pallet-wormhole
- num-traits, sp-arithmetic from pallet-qpow
- qp-poseidon from pallet-mining-rewards
- qp-high-security from pallet-reversible-transfers
- qp-poseidon, qp-rusty-crystals-dilithium from node
- sc-service from sc-consensus-qpow

Note: codec and scale-info are required by frame macros even if not
directly imported - cargo-machete reports false positives for pallets.
@illuzen illuzen requested a review from n13 May 30, 2026 14:35
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit bc0f44a. Configure here.

Comment thread client/litep2p/src/crypto/dilithium.rs Outdated
Comment thread client/litep2p/src/peer_id.rs
Comment thread client/network-types/src/dilithium.rs Outdated
@n13
Copy link
Copy Markdown
Collaborator

n13 commented Jun 1, 2026

v12 pls audit

@v12-auditor
Copy link
Copy Markdown

v12-auditor Bot commented Jun 1, 2026

Warning

GitHub user @n13 is not connected to an V12 account. Please sign in at the dashboard and connect your GitHub account.

@n13
Copy link
Copy Markdown
Collaborator

n13 commented Jun 1, 2026

Security & correctness review

Focused on the novel post-quantum code (Dilithium identity, pqXX/ML-KEM Noise, QUIC TLS) and the identity/key plumbing — not the bulk-vendored litep2p internals. Overall the Noise identity-binding design is sound (each side signs its static ML-KEM key, verified against the session's authenticated remote static before any transport socket is returned). One critical issue stands out, plus several lower-severity items.


🔴 CRITICAL — generate-node-key derives the node identity seed from the current timestamp

client/cli/src/commands/generate_node_key.rs

fn hash_current_time_to_hex() -> [u8; 32] {
    let timestamp = SystemTime::now().duration_since(UNIX_EPOCH)
        .expect("Time went backwards").as_millis() as u64;
    blake2_256(&timestamp.to_le_bytes())
}
// ...
let mut hashed_timestamp = hash_current_time_to_hex();
let entropy = SensitiveBytes32::from(&mut hashed_timestamp);
let keypair = Keypair::generate(entropy);

The Dilithium keypair is generated deterministically from a 32-byte seed that is just blake2_256(millisecond_timestamp). ML-DSA-87 keygen is deterministic in the seed, so the entire private identity key is determined by the generation instant.

Impact: the effective entropy is log2(plausible generation window in ms) — roughly 2^35 for a one-year window, and far less if the window is known (deployment date, testnet launch, onboarding time). Given a node's public PeerId, an attacker can enumerate candidate timestamps, run blake2_256 + ML-DSA-87 keygen, and recover the private identity key. That allows full impersonation, forged Noise/TLS identity proofs, and MITM — completely defeating the post-quantum identity guarantees regardless of how strong Dilithium/ML-KEM are.

Worse: node_key_params.rs makes authorities/validators provide a pre-generated key file (it errors with NetworkKeyNotFound otherwise), and generate-node-key is the documented way to create it — so the highest-value nodes are funneled straight into the weak generator.

Note the secure path already exists: dilithium::SecretKey::generate() and Keypair::generate() use OsRng (and the node's auto-generation via Secret::File correctly uses it). generate-node-key should do the same:

let keypair = ml_dsa_87::Keypair::generate(SensitiveBytes32::from(&mut OsRng.gen::<[u8; 32]>()));

Any keys already created with the current generate-node-key must be considered compromised and regenerated.


🟠 MEDIUM — "Post-quantum QUIC" authentication is still classical (ECDSA P-256)

client/litep2p/src/crypto/tls/certificate.rs, tls/verifier.rs, tls/mod.rs

  • The X.509 cert keypair is PKCS_ECDSA_P256_SHA256 and verification_schemes() only offers ECDSA/RSA/Ed25519. The TLS 1.3 CertificateVerify (the signature that authenticates the live connection) is therefore classical. Only the key exchange (rustls_post_quantum::provider(), ML-KEM hybrid) and the identity-binding extension (Dilithium over the cert SPKI) are PQ.
  • A quantum adversary that can break P-256 can forge the handshake signature and impersonate/MITM a peer over QUIC despite the Dilithium binding.

Mitigating fact (please confirm intent): QUIC is enabled as a Cargo feature (litep2p default ["quic","websocket"], and sc-network depends with features = ["quic","websocket"]), but the transport is never wired upLitep2pNetworkBackend::configure_transport only calls .with_tcp(...) / .with_websocket(...) and the address parser explicitly rejects non-tcp protocols. So the TLS/ECDSA code is currently compiled-but-dead. Given the PR's stated goal ("reduce the attack surface"), consider either dropping the quic feature entirely, or don't ship it until cert-level auth is PQ. As-is it's attack surface (and a misleading "post-quantum QUIC") with no functional benefit.


🟡 LOW — Dilithium try_from_bytes trusts the stored public key (also flagged by Bugbot)

client/litep2p/src/crypto/dilithium.rs and client/network-types/src/dilithium.rs (both copies)

In the seed + public branch, the public key is parsed from the file and stored without checking it matches the key derived from the seed. sign() derives from the seed; public() returns the stored bytes. A corrupted/mismatched file yields a node that advertises one PeerId but signs with another key → its own handshake signatures fail to verify (self-DoS / confusing failures). Since the seed fully determines the keypair, the stored public key is redundant — simplest fix is to always derive the public key from the seed (and accept seed-only), or verify equality and error on mismatch.


🟡 LOW — Duplicated Dilithium key module (also flagged by Bugbot)

client/network-types/src/dilithium.rs reimplements the same Keypair/PublicKey/SecretKey logic as client/litep2p/src/crypto/dilithium.rs. They are already diverging (network-types added Hash/Ord/PartialOrd and a different DecodingError; litep2p uses Error/ParseError), and both carry the unchecked-public-key issue above — exactly the divergent-fix risk duplication creates. sc-network-types should wrap litep2p's dilithium types (it already depends on the crate and defines From conversions) instead of maintaining a parallel implementation.


🟡 LOW — PeerId::try_from_multiaddr is inconsistent between layers (also flagged by Bugbot)

  • client/litep2p/src/peer_id.rs: address.iter().last() — only matches /p2p/ as the last component.
  • client/network-types/src/peer_id.rs: address.iter().find(...) — matches /p2p/ anywhere.

The same multiaddr (e.g. /p2p/<id>/p2p-circuit, or /p2p/<id> followed by anything) resolves to a PeerId in one layer and None in the other, which can break address validation / reserved-peer / public-address handling. Align both on the same rule.


🔵 INFO / nits

  • unsafe self-referential RNG in noise/protocol.rs (ClatterSession::ensure_handshake_initialized casts Box<StdRng> to &'static mut). It works in practice (stable heap address, field unused afterward) but is a soundness smell — two live &mut paths to the RNG exist. Prefer restructuring (e.g. owned-RNG handshake API or Pin/self-referential helper).
  • Handshake security depends on clatter noise_pqxx() (v1.1.0) providing mutual static-key authentication. The identity binding is only as sound as that pattern. Worth pinning + noting the audit status of clatter/ml-kem for a consensus-critical path. (The client/litep2p crate is not under the workspace deny(expect_used/panic) lints, so the many .expect() calls on handshake/identity paths won't be caught — worth auditing those that touch remote-controlled input.)
  • Silent Ed25519 stubs: PeerId::into_ed25519/from_ed25519 return None unconditionally (for sc-mixnet compile compat). If mixnet is ever enabled this silently misbehaves rather than failing loudly — fine while mixnet is off, but a latent "silent failure".

I'm happy to open targeted fix PRs — the critical generate-node-key fix is a one-liner and should land regardless of the rest.

Manual review by Cursor agent (Claude). Not a substitute for the formal audit.

@n13
Copy link
Copy Markdown
Collaborator

n13 commented Jun 1, 2026

Review complete and posted to the PR: #574 (comment)

Summary

I reviewed the 186-file PR by fetching the branch into a throwaway worktree (your working tree was untouched) and focusing on the novel security-critical code — Dilithium identity, pqXX/ML-KEM Noise, QUIC TLS, and the key/identity plumbing — rather than the bulk-vendored litep2p internals.

The headline finding is a critical key-generation flaw that the Bugbot pass missed:

  • 🔴 generate-node-key seeds the Dilithium identity from blake2_256(millisecond_timestamp) (client/cli/src/commands/generate_node_key.rs). ML-DSA-87 keygen is deterministic in the seed, so the private identity key has only ~timestamp-worth of entropy (~2^35 for a 1-year window, far less if the time is known). Given the public PeerId, an attacker can brute-force the timestamp and recover the private key → full impersonation/MITM, defeating the PQ guarantees. Worse, validators are funneled into this exact tool (authorities must pre-generate a key file). The secure OsRng path already exists and is used by the node's auto-generation — the CLI just needs to use it. Any keys made with it should be regenerated.

Plus, with severity calibrated against actual reachability:

  • 🟠 Medium — "Post-quantum QUIC" is only PQ for key-exchange + identity binding; the TLS cert/CertificateVerify is classical ECDSA P-256. It's currently dead code (QUIC is compiled in but never wired into the transport — only TCP/WS are), so I framed it as latent and recommended dropping the quic feature to actually reduce attack surface as the PR intends.
  • 🟡 Low ×3 — validated and contextualized Bugbot's three findings: unchecked stored public key in try_from_bytes, the duplicated Dilithium module (already diverging), and the try_from_multiaddr last() vs find() inconsistency.
  • 🔵 Infounsafe self-referential RNG in the Noise wrapper, the dependence on clatter's pqXX for mutual auth (and that client/litep2p is exempt from the workspace deny(expect_used/panic) lints), and the silent Ed25519 None stubs.

I confirmed the good parts too: the Noise handshake verifies the remote's Dilithium signature over its authenticated static ML-KEM key before returning a transport socket, secret files are written 0o600, and the node's automatic key generation correctly uses OsRng.

One note in line with your "don't comment on GitHub automatically" rule — I only posted because you explicitly asked. I did not push any code or open fix PRs; happy to do the one-line generate-node-key fix (and the others) if you'd like.

illuzen added 8 commits June 1, 2026 11:28
Addresses audit finding: 'Stored public key unchecked'

Previously, Keypair stored both seed and public key bytes. When loading
from file, a corrupted public key could cause the advertised PeerId to
differ from the key actually used for signatures and Noise handshakes.

Now Keypair stores only the 32-byte seed. The public key is derived
deterministically on demand via derive_internal(). This eliminates the
possibility of inconsistency between stored and derived public keys.

For backwards compatibility, try_from_bytes() still accepts the old
format (seed + public key) but ignores the public key bytes and
regenerates from seed.

to_bytes() now returns only the 32-byte seed, reducing stored key size
from 2624 bytes to 32 bytes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants