VDR v1 / DIDMan v1: guard against silent subject desync when did:web is enabled #4298

@reinkrul

Context

Follow-up from review of #4265 (v5→v6 migration guide). Three review threads from @stevenvegt on the "Mixing VDR v1 and v2 APIs" section converge on one product question: the v1 footgun is currently handled with documentation only, and the docs themselves are ambiguous.

  • r3272036014 — wording is ambiguous: reads as "v1 is fine, just don't mix v1+v2", while the deprecation note implies "don't use v1 at all".
  • r3272042569 — "If this is a footgun, shouldn't we make either one available by a config param?"
  • r3272058944 — "should we disable the didman api's or error/warn when did:web is enabled?"

Problem

When a subject owns more than one DID document (i.e. did:web is enabled alongside did:nuts), VDR v1 / DIDMan v1 writes only touch the did:nuts document. They never propagate to the did:web document in the same subject. The two documents silently desync, and the startup migrations do not repair this on later restarts. Reads compound the problem: v1 and v2 resolve from different views, so a v1 caller and a v2 caller see different state for the same subject.

Nothing in the node prevents this today. Critically, didmethods is not a guard: it selects which DID methods the node creates and manages per subject — it does not enable or disable the v1/v2 API surface or any specific endpoint. The v1 and DIDMan v1 APIs remain fully callable no matter what didmethods is set to. So an operator on the v6 default (["web", "nuts"]) who keeps scripting against v1 silently desyncs every subject, with no error and no config knob to stop it.

| Subject composition | v1 / DIDMan v1 write | Result |
|---|---|---|
| did:nuts only (didmethods=["nuts"]) | touches the only doc | consistent; v1 use is safe |
| did:nuts + did:web (default) | touches did:nuts only | did:web silently desyncs from subject; not repaired on restart |
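
The two rows above can be reproduced with a toy model. This is only an illustration of the failure mode, not the node's data model: the `subject` type, `writeV1`, `writeV2` and `inSync` are hypothetical names, and the assumption that a v2 write updates every document of the subject is a simplification.

```go
package main

import "fmt"

// subject is a toy stand-in for a subject that owns one service list per
// DID method. It is NOT the node's actual data model.
type subject struct {
	services map[string][]string // key: DID method, e.g. "nuts" or "web"
}

func newSubject(methods ...string) *subject {
	s := &subject{services: map[string][]string{}}
	for _, m := range methods {
		s.services[m] = nil
	}
	return s
}

// writeV1 models a v1 / DIDMan v1 write: only the did:nuts document changes.
func (s *subject) writeV1(service string) {
	if _, ok := s.services["nuts"]; ok {
		s.services["nuts"] = append(s.services["nuts"], service)
	}
}

// writeV2 models a subject-level write (assumed): every document changes.
func (s *subject) writeV2(service string) {
	for m := range s.services {
		s.services[m] = append(s.services[m], service)
	}
}

// inSync reports whether all documents of the subject hold the same services.
func (s *subject) inSync() bool {
	var ref []string
	first := true
	for _, svcs := range s.services {
		if first {
			ref, first = svcs, false
			continue
		}
		if len(svcs) != len(ref) {
			return false
		}
		for i := range svcs {
			if svcs[i] != ref[i] {
				return false
			}
		}
	}
	return true
}

func main() {
	single := newSubject("nuts")
	single.writeV1("example-service")
	fmt.Println("nuts only, v1 write, in sync:", single.inSync())

	both := newSubject("web", "nuts")
	both.writeV1("example-service")
	fmt.Println("web+nuts, v1 write, in sync:", both.inSync())
}
```

In this model the single-method subject stays consistent after a v1 write, while the two-method subject does not, and nothing in the write path reports the difference.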

Proposal

Make the node enforce the constraint instead of documenting it. When more than one DID method is active for subjects (web enabled), refuse or warn on v1/DIDMan v1 write operations rather than letting them silently desync the subject.

Options to decide between:

  1. Hard-disable the v1 VDR + DIDMan v1 write routes when web is enabled: return 404/501 or don't register them. Strongest guarantee, but breaks any operator still scripting against v1 during a transition.
  2. Fail writes loud: keep routes registered, but have v1/DIDMan v1 write endpoints return an error (e.g. 409/412) when web is enabled, while reads stay available. Lets /internal/vdr/v1/did/conflicted and other read paths keep working during migration.
  3. Warn only: log a warning on each v1/DIDMan v1 write when web is enabled. Lowest friction, weakest protection.

Position: option 2 (fail writes loud, keep reads). Confirmed safe for the migration flow — conflict resolution only reads via v1 (did:nuts state written through v2 lands on the gRPC network, which the v1 read endpoints then surface), so blocking v1 writes doesn't break it. Open for discussion.
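
A minimal sketch of what option 2 could look like, written against the standard library's net/http rather than the node's actual router so that it stays self-contained. `guardV1Writes`, `isWrite` and `statusFor` are hypothetical names, the 409 status and error text are placeholders for whatever is decided, and classifying writes by HTTP method is an assumption that would need checking against the real v1 route table.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// isWrite treats every method other than GET, HEAD and OPTIONS as a write.
func isWrite(method string) bool {
	switch method {
	case http.MethodGet, http.MethodHead, http.MethodOptions:
		return false
	}
	return true
}

// guardV1Writes wraps a v1 handler. When webEnabled returns true, write
// requests are rejected with 409 Conflict; reads pass through unchanged.
// The status code and message are illustrative, not a decided contract.
func guardV1Writes(webEnabled func() bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if webEnabled() && isWrite(r.Method) {
			http.Error(w, "v1 writes are disabled while did:web is enabled; use the v2 API", http.StatusConflict)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one request through the guard and returns the status code.
func statusFor(method, path string, web bool) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	h := guardV1Writes(func() bool { return web }, ok)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code
}

func main() {
	// Read with web enabled: passes through.
	fmt.Println(statusFor(http.MethodGet, "/internal/vdr/v1/did/conflicted", true))
	// Write with web enabled: rejected.
	fmt.Println(statusFor(http.MethodPost, "/internal/vdr/v1/did", true))
	// Write with web disabled: passes through.
	fmt.Println(statusFor(http.MethodPost, "/internal/vdr/v1/did", false))
}
```

Passing `webEnabled` as a function keeps the guard decoupled from how the node determines which methods are active, so the same wrapper could serve both the VDR v1 and DIDMan v1 routes.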

Whatever is chosen, also fix the migration-guide wording (r3272036014) so it is unambiguous: v1/DIDMan v1 is deprecated; it is only safe in a single-method (did:nuts-only) deployment; with did:web enabled it must not be used for writes.

Scope

  • vdr/api/v1/api.go: Wrapper.Routes / write handlers (CreateDID, AddNewVerificationMethod, DeleteVerificationMethod, etc.); gate on whether more than one method is active.
  • didman/api/v1/api.go: Wrapper.Routes / write handlers; same gating.
  • vdr/vdr.go:66,139: supportedDIDMethods is the source of truth for which methods are active (slices.Contains(r.supportedDIDMethods, "web"), cf. vdr/vdr.go:453); expose a check the API wrappers can consume.
  • core/server_config.go:62: DIDMethods (koanf:"didmethods"). Note this currently does not gate any API; a dedicated opt-out (e.g. vdr.enablev1api) could be added if keying off method count is too implicit.
  • docs/pages/deployment/migration.rst — fix the v1 deprecation/footgun wording.
  • Tests + release notes.
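
One possible shape for the "check the API wrappers can consume" mentioned in the scope above. Only the `supportedDIDMethods` field name comes from this issue; the `MethodChecker` interface, the `module` type and the `MultipleMethodsActive` method are hypothetical and do not reflect the current code in vdr/vdr.go.

```go
package main

import (
	"fmt"
	"slices"
)

// MethodChecker is a hypothetical narrow interface the v1 API wrappers could
// depend on instead of reaching into the VDR module directly.
type MethodChecker interface {
	MultipleMethodsActive() bool
}

// module is a stand-in for the VDR module; it only carries the field
// relevant to this sketch.
type module struct {
	supportedDIDMethods []string
}

// MultipleMethodsActive reports whether more than one distinct DID method is
// configured, which is the condition under which v1 writes desync a subject.
func (m *module) MultipleMethodsActive() bool {
	methods := slices.Clone(m.supportedDIDMethods)
	slices.Sort(methods)
	return len(slices.Compact(methods)) > 1
}

func main() {
	var checker MethodChecker = &module{supportedDIDMethods: []string{"web", "nuts"}}
	fmt.Println(checker.MultipleMethodsActive())
}
```

Counting distinct methods rather than testing for "web" specifically keeps the check valid if further methods are added later; a direct `slices.Contains(..., "web")` test would be the simpler alternative if that generality is not wanted.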

Considerations

  • v1 reads stay valid even with web enabled: a v2 write to the did:nuts document propagates over the gRPC network and is then served by the v1 read endpoints. So only v1 writes desync subjects — reads (GET /internal/vdr/v1/did/conflicted, GET /status/diagnostics) are safe to keep. This is what makes option 2 viable.
  • Gate implicitly on "is web active" vs. an explicit vdr.enablev1api toggle — implicit needs no new config but couples two concerns; explicit is clearer but adds surface.
  • v1 / DIDMan v1 are already slated for removal in a future major — is a guard worth building, or just sharpen the docs and accept the footgun until removal?
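
The implicit-versus-explicit trade-off can be made concrete with a small decision function. This is a sketch under assumptions: `vdr.enablev1api` does not exist today, the `config` struct is a simplified stand-in for the server configuration, and modelling "not set" as a nil pointer is just one way to tell an absent toggle from an explicit false.

```go
package main

import (
	"fmt"
	"slices"
)

// config is a simplified stand-in for the relevant server configuration.
type config struct {
	DIDMethods  []string // mirrors the existing didmethods setting
	EnableV1API *bool    // hypothetical vdr.enablev1api; nil means "not set"
}

// v1WritesAllowed prefers the explicit toggle when it is set and otherwise
// falls back to the implicit rule: v1 writes are allowed only without did:web.
func v1WritesAllowed(c config) bool {
	if c.EnableV1API != nil {
		return *c.EnableV1API
	}
	return !slices.Contains(c.DIDMethods, "web")
}

func main() {
	fmt.Println(v1WritesAllowed(config{DIDMethods: []string{"nuts"}}))
	fmt.Println(v1WritesAllowed(config{DIDMethods: []string{"web", "nuts"}}))
}
```

Note that in this variant an explicit true combined with web enabled would reintroduce the desync, so an explicit toggle may still need a startup check or warning for that combination.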
