refactor(uffs-mft): introduce Lcn(i64) newtype (Phase 4 sub-phase 5d, refs #191) #258

Merged
githubrobbi merged 2 commits into main from refactor/phase-4-5d-lcn-newtype
May 16, 2026

Summary

Phase 4 sub-phase 5d — introduces a typed uffs_mft::platform::Lcn
newtype wrapping the raw i64 cluster identifiers that NTFS hands
back from FSCTL_GET_RETRIEVAL_POINTERS and $DATA mapping-pair
decode. All MFT-extent and data-run consumers now go through the
newtype, so the two distinct sparse conventions stop being
open-coded as raw integer comparisons
at every call site.

What changes

New Lcn newtype (crates/uffs-mft/src/platform/lcn.rs)

  • Copy + Eq + Ord + Hash + Display + Debug so it slots into
    HashMap / BTreeMap keys, comparisons, and tracing fields.
  • Sentinels: Lcn::ZERO (data-run sparse: "no LCN offset emitted")
    and Lcn::HOLE = -1 (retrieval-pointer sparse: LCN_HOLE).
  • Helpers: new, raw, raw_unsigned (documented bit-pattern
    reinterpret), is_hole (any negative — matches the historic
    lcn < 0 guard), is_zero (data-run sparse predicate).
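
As a rough illustration of the shape described above, a minimal sketch of such a newtype might look like the following. This is not the code from crates/uffs-mft/src/platform/lcn.rs: the method bodies are assumptions inferred from the bullet list, and the sign reinterpretation uses an `as` cast so the sketch also builds on older toolchains.

```rust
use std::fmt;

/// Hypothetical sketch of an `Lcn` newtype; the real definition may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Lcn(i64);

impl Lcn {
    /// Data-run sparse sentinel: no LCN offset was emitted for the run.
    pub const ZERO: Lcn = Lcn(0);
    /// Retrieval-pointer sparse sentinel (LCN_HOLE).
    pub const HOLE: Lcn = Lcn(-1);

    /// Wraps a raw signed cluster number without changing its bits.
    pub const fn new(raw: i64) -> Self {
        Lcn(raw)
    }

    /// Returns the raw signed value exactly as it was wrapped.
    pub const fn raw(self) -> i64 {
        self.0
    }

    /// Bit-pattern reinterpret as unsigned (assumed equivalent to
    /// `i64::cast_unsigned`); `-1` becomes `u64::MAX`.
    pub const fn raw_unsigned(self) -> u64 {
        self.0 as u64
    }

    /// Any negative value counts as a hole, matching the old `lcn < 0` guard.
    pub const fn is_hole(self) -> bool {
        self.0 < 0
    }

    /// True only for the data-run sparse sentinel.
    pub const fn is_zero(self) -> bool {
        self.0 == 0
    }
}

impl fmt::Display for Lcn {
    /// Displays the bare number so logs do not show the wrapper name.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Display::fmt(&self.0, f)
    }
}
```

Keeping is_hole and is_zero as separate predicates is what lets the two sparse conventions stay distinct: Lcn::ZERO is not a hole, and Lcn::HOLE is not zero.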

Field migrations

  • MftExtent.lcn: i64 → Lcn; byte_offset now branches on
    is_hole() and multiplies via raw_unsigned().
  • DataRun.lcn: i64 → Lcn; is_sparse() checks is_zero(),
    byte_offset clamps any hole via is_hole() (defensive against
    corrupt buffers) and multiplies via raw_unsigned().
  • parse_data_runs emits Lcn::ZERO for runs without an encoded
    offset and Lcn::new(current_lcn) otherwise — wire bytes
    unchanged.
  • parse_retrieval_pointers wraps each decoded i64 LCN with
    Lcn::new at the FFI boundary — RETRIEVAL_POINTERS_BUFFER
    byte layout untouched.
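
To make the migrated accessors concrete, here is a hedged sketch of how byte_offset and is_sparse could be written on top of the newtype. The cluster_count field, the method signatures, and the use of saturating_mul are assumptions for illustration; the PR only states that holes yield zero and that other values are multiplied via raw_unsigned().

```rust
/// Minimal stand-in for the PR's `Lcn` newtype (assumed shape).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Lcn(i64);

impl Lcn {
    pub const ZERO: Lcn = Lcn(0);
    pub const HOLE: Lcn = Lcn(-1);
    pub const fn new(raw: i64) -> Self { Lcn(raw) }
    pub const fn raw_unsigned(self) -> u64 { self.0 as u64 }
    pub const fn is_hole(self) -> bool { self.0 < 0 }
    pub const fn is_zero(self) -> bool { self.0 == 0 }
}

/// Hypothetical extent as returned by retrieval-pointer decode.
pub struct MftExtent {
    pub lcn: Lcn,
    pub cluster_count: u64, // assumed field name
}

impl MftExtent {
    /// Holes have no physical location, so they map to offset 0.
    pub fn byte_offset(&self, bytes_per_cluster: u64) -> u64 {
        if self.lcn.is_hole() {
            0
        } else {
            // saturating_mul is a choice made for this sketch only.
            self.lcn.raw_unsigned().saturating_mul(bytes_per_cluster)
        }
    }
}

/// Hypothetical run as produced by `$DATA` mapping-pair decode.
pub struct DataRun {
    pub lcn: Lcn,
    pub cluster_count: u64, // assumed field name
}

impl DataRun {
    /// Data runs mark sparseness with the zero sentinel, not a negative LCN.
    pub fn is_sparse(&self) -> bool {
        self.lcn.is_zero()
    }

    /// Negative LCNs should not occur here; clamp them defensively.
    pub fn byte_offset(&self, bytes_per_cluster: u64) -> u64 {
        if self.lcn.is_hole() {
            0
        } else {
            self.lcn.raw_unsigned().saturating_mul(bytes_per_cluster)
        }
    }
}
```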

Call-site migrations

io/extent_map.rs, io/chunking.rs, platform/upcase.rs,
platform/volume.rs, reader/{dataframe_read, dataframe_timing, index_read, index_timing, persistence, persistence_capture}.rs,
commands/windows/{benchmark_index, benchmark_mft, info}.rs,
ntfs/tests.rs.

Each call site was migrated from one of these patterns:

  • lcn < 0 / lcn == 0 → lcn.is_hole() / lcn.is_zero()
  • lcn.cast_unsigned() * bytes_per_cluster → lcn.raw_unsigned() * bytes_per_cluster
  • lcn * bpc.cast_signed() → lcn.raw() * bpc.cast_signed()
  • lcn: kernel_value.cast_signed() → lcn: Lcn::new(kernel_value.cast_signed())
  • tracing lcn = ext.lcn → lcn = %ext.lcn so logs render 42
    not Lcn(42).
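
The patterns above can be sketched as a before/after pair. The function names below are hypothetical and exist only for this comparison. The log_line helper uses plain format! to stand in for the tracing field sigils: `%` records a field through Display, whereas `?` records it through Debug.

```rust
use std::fmt;

/// Minimal stand-in for the PR's `Lcn` newtype (assumed shape).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Lcn(i64);

impl Lcn {
    pub const fn new(raw: i64) -> Self { Lcn(raw) }
    pub const fn raw_unsigned(self) -> u64 { self.0 as u64 }
    pub const fn is_hole(self) -> bool { self.0 < 0 }
}

impl fmt::Display for Lcn {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Display::fmt(&self.0, f)
    }
}

/// Before: the sparse check and the sign cast are open-coded on a raw i64.
pub fn offset_before(lcn: i64, bytes_per_cluster: u64) -> Option<u64> {
    if lcn < 0 {
        None
    } else {
        Some((lcn as u64) * bytes_per_cluster)
    }
}

/// After: the same logic, expressed through the newtype's helpers.
pub fn offset_after(lcn: Lcn, bytes_per_cluster: u64) -> Option<u64> {
    if lcn.is_hole() {
        None
    } else {
        Some(lcn.raw_unsigned() * bytes_per_cluster)
    }
}

/// Returns (Display rendering, Debug rendering) of a log field.
pub fn log_line(lcn: Lcn) -> (String, String) {
    (format!("lcn={}", lcn), format!("lcn={:?}", lcn))
}
```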

Wire format unchanged

Pinned by:

  • platform::lcn::tests::raw_roundtrip_preserves_i64_exactly — any
    i64 survives Lcn::new(x).raw() byte-identically.
  • platform::lcn::tests::raw_unsigned_reinterprets_bit_pattern:
    raw_unsigned() matches i64::cast_unsigned() bit-for-bit,
    including the Lcn::new(-1).raw_unsigned() == u64::MAX corner.
  • platform::extents::tests::byte_offset_returns_zero_for_sparse_extents
    / byte_offset_matches_lcn_times_bytes_per_cluster /
    byte_size_independent_of_lcn.
  • ntfs::data_runs::tests::is_sparse_matches_only_zero_sentinel
    / byte_offset_clamps_holes_and_yields_lcn_times_bpc_otherwise /
    parse_data_runs_marks_sparse_runs_with_zero_lcn.

#[repr(C)] Win32 ABI mirrors and on-disk byte layouts are not
touched.

Untouched on purpose

VOLUME_DATA.mft_start_lcn: u64 (from
FSCTL_GET_NTFS_VOLUME_DATA) stays a raw u64 — it's the
unsigned-by-spec starting cluster of $MFT, not an LCN that can
carry a sparse marker. Consumers wrap it into Lcn at the
MftExtent construction boundary
(Lcn::new(mft_start_lcn.cast_signed())), so the type discipline
lives where extents are actually walked.
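
A small sketch of that construction boundary, under stated assumptions: VolumeData here is a one-field stand-in rather than the real #[repr(C)] mirror, first_mft_extent is a hypothetical helper, and the `as i64` cast stands in for cast_signed().

```rust
/// Minimal stand-in for the PR's `Lcn` newtype (assumed shape).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Lcn(i64);

impl Lcn {
    pub const fn new(raw: i64) -> Self { Lcn(raw) }
    pub const fn raw(self) -> i64 { self.0 }
    pub const fn is_hole(self) -> bool { self.0 < 0 }
}

/// Hypothetical one-field stand-in for the volume data mirror.
pub struct VolumeData {
    pub mft_start_lcn: u64, // stays a raw u64 on purpose
}

/// Hypothetical extent type; the field names are assumptions.
pub struct MftExtent {
    pub lcn: Lcn,
    pub cluster_count: u64,
}

/// Wraps the unsigned start cluster into `Lcn` only where an extent is built.
pub fn first_mft_extent(volume: &VolumeData, cluster_count: u64) -> MftExtent {
    MftExtent {
        lcn: Lcn::new(volume.mft_start_lcn as i64),
        cluster_count,
    }
}
```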

Verification

  • just lint-pre-push (full local gate including
    lint-ci-windows) — 124s green.
  • cargo xwin check --target x86_64-pc-windows-msvc --workspace --all-targets.
  • cargo test --workspace --lib --bins: 163 lib tests pass
    (157 existing plus 6 new Lcn / MftExtent / DataRun regression tests).

Refs #191

…mbers

Replaces ad-hoc `i64` LCN values across MFT extents, NTFS data runs,
and their downstream consumers with a typed `uffs_mft::platform::Lcn`
newtype so the two distinct sparse conventions stop being open-coded
as raw integer comparisons that any caller could mis-interpret.

The newtype is `Copy + Eq + Ord + Hash + Display` and exposes
`Lcn::ZERO`, `Lcn::HOLE`, `Lcn::new`, `Lcn::raw`, `Lcn::raw_unsigned`,
`Lcn::is_hole`, and `Lcn::is_zero` so:

* sparse-extent detection (`extent.lcn < 0`) goes through
  `Lcn::is_hole` everywhere — extent map, chunking, bitmap reader,
  benchmark CLI;
* data-run sparse encoding ("no LCN offset emitted" → running total
  unchanged) keeps the distinct `Lcn::ZERO` / `Lcn::is_zero`
  predicate, so the two NTFS conventions remain surgically separated;
* the unsigned byte-offset arithmetic (`raw_unsigned() *
  bytes_per_cluster`) is a single documented bit-pattern reinterpret
  helper instead of an open-coded `cast_unsigned()` at every site;
* tracing fields use `%`-Display so logs render `42` not `Lcn(42)`.

## Wire format unchanged

Pinned by `platform::extents::tests::byte_offset_*` /
`ntfs::data_runs::tests::parse_data_runs_marks_sparse_runs_with_zero_lcn`
/ `platform::lcn::tests::raw_roundtrip_preserves_i64_exactly`:

* `FSCTL_GET_RETRIEVAL_POINTERS` decode still reads a signed 64-bit
  `LcnPosition` and wraps it via `Lcn::new(read_i64)` — the
  `RETRIEVAL_POINTERS_BUFFER` byte layout is untouched;
* on-disk `$DATA` mapping-pair decode keeps emitting `Lcn::ZERO`
  when `offset_size == 0` (sparse run) and a positive running total
  otherwise — `parse_data_runs` round-trips bit-for-bit;
* `MftExtent::byte_offset` / `DataRun::byte_offset` produce the same
  unsigned offsets as the prior `lcn.cast_unsigned() *
  bytes_per_cluster` arithmetic, including the historic
  `nonneg_to_u64` clamp for hole sentinels;
* `#[repr(C)]` Win32 ABI mirrors are not touched.
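
For readers unfamiliar with the mapping-pair encoding, the sketch below shows one way the `offset_size == 0` rule could be decoded. It is an illustrative reimplementation, not the crate's `parse_data_runs`: the signature, the `cluster_count` field, and the choice to stop silently on truncated input are all assumptions. The encoding itself is the standard NTFS one, where the header's low nibble gives the length-field size, the high nibble gives the offset-field size, and the offset is a signed little-endian delta from the previous LCN.

```rust
/// Minimal stand-in for the PR's `Lcn` newtype (assumed shape).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Lcn(i64);

impl Lcn {
    pub const ZERO: Lcn = Lcn(0);
    pub const fn new(raw: i64) -> Self { Lcn(raw) }
    pub const fn raw(self) -> i64 { self.0 }
    pub const fn is_zero(self) -> bool { self.0 == 0 }
}

/// Hypothetical decoded run; the field names are assumptions.
#[derive(Debug)]
pub struct DataRun {
    pub lcn: Lcn,
    pub cluster_count: u64,
}

/// Decodes NTFS mapping pairs until the 0x00 terminator or malformed input.
pub fn parse_data_runs(bytes: &[u8]) -> Vec<DataRun> {
    let mut runs = Vec::new();
    let mut pos = 0usize;
    let mut current_lcn: i64 = 0; // running total of signed deltas
    while pos < bytes.len() {
        let header = bytes[pos];
        if header == 0 {
            break; // end-of-runs marker
        }
        pos += 1;
        let len_size = (header & 0x0F) as usize;
        let off_size = (header >> 4) as usize;
        if len_size == 0 || len_size > 8 || off_size > 8
            || pos + len_size + off_size > bytes.len()
        {
            break; // malformed or truncated; this sketch just stops
        }
        // Run length: unsigned little-endian.
        let mut count: u64 = 0;
        for (i, b) in bytes[pos..pos + len_size].iter().enumerate() {
            count |= u64::from(*b) << (8 * i);
        }
        pos += len_size;
        let lcn = if off_size == 0 {
            Lcn::ZERO // sparse run: no offset emitted, running total unchanged
        } else {
            // Offset: signed little-endian delta, sign-extended to 64 bits.
            let mut delta: i64 = 0;
            for (i, b) in bytes[pos..pos + off_size].iter().enumerate() {
                delta |= i64::from(*b) << (8 * i);
            }
            let shift = 64 - 8 * off_size as u32;
            delta = (delta << shift) >> shift;
            current_lcn = current_lcn.wrapping_add(delta);
            Lcn::new(current_lcn)
        };
        pos += off_size;
        runs.push(DataRun { lcn, cluster_count: count });
    }
    runs
}
```

In this sketch a sparse run leaves `current_lcn` untouched, so the next non-sparse run's delta is still applied relative to the last real cluster.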

## Untouched on purpose

`VOLUME_DATA.mft_start_lcn: u64` (from `FSCTL_GET_NTFS_VOLUME_DATA`)
stays a raw `u64` — it is the unsigned-by-spec starting cluster, not
an LCN that can carry a sparse marker; consumers wrap it into `Lcn`
at the construction boundary (`Lcn::new(mft_start_lcn.cast_signed())`)
so the type discipline lives where MftExtent values are actually
manipulated.

Refs #191
@githubrobbi githubrobbi enabled auto-merge (squash) May 16, 2026 07:37
@githubrobbi githubrobbi merged commit 7f03d17 into main May 16, 2026
26 checks passed
@githubrobbi githubrobbi deleted the refactor/phase-4-5d-lcn-newtype branch May 16, 2026 07:51