[cdac] Cache LayoutSet resolution in generated IData constructors#129675
[cdac] Cache LayoutSet resolution in generated IData constructors#129675max-charlamb wants to merge 3 commits into
Conversation
…ntract The cdac source generator emits an `IData<T>` constructor for every data class that calls `LayoutSet.Resolve(target, _typeNames)` to discover the field layout. `Resolve` allocates a `List<LazyLayout>` plus a backing array on every invocation, even though the result is a function of (Target, _typeNames) — both stable for the lifetime of a target snapshot. On a heap walk that constructs 1M `Data.Array` instances this cost is ~340 MB of throwaway allocations. This change caches the resolved `LayoutSet` per (Target, _typeNames): - Adds a fallback registration version `""` (empty string) to `ContractRegistry.Register`. The registry consults the target's contract descriptor first; if no version is advertised (or no implementation is registered for the advertised version), it falls back to the empty-string default. This lets host-side helpers — caches, adapters, cross-contract aggregators — register implementations the target doesn't need to advertise, while still permitting a future target to take over a contract via descriptor advertisement. - Introduces a new `IGeneratedType` contract emitted by the source generator (alongside `LayoutSet` and `TypeNameResolver`) whose single implementation `GeneratedType_1` caches `LayoutSet.Resolve` results keyed by reference identity on the static `_typeNames` arrays the generator already emits per data class. - Adds a `TargetExtensions.GetCachedLayoutSet` extension method (also generator-emitted) that performs lazy first-use registration of `IGeneratedType` and dispatches the cached lookup. - Switches `Emitter` to emit `_target.GetCachedLayoutSet(_typeNames)` instead of `LayoutSet.Resolve(_target, _typeNames)`. - Changes `LayoutSet` from `readonly struct` to `sealed class` so the cached entry is shared by reference rather than copied on every Dictionary lookup and method dispatch — eliminates JIT defensive copies and saves 8 bytes per fetch. - Updates the two test `ContractRegistry` stubs to honor the same empty-version fallback rule. Benchmark on a 1M small-string heap walk (medians of 10 iterations, standalone HeapWalkBench tool not included in this commit): Variant | ms before | ms after | KB before | KB after CachedIData | 724 | 461 | 364,571 | 43,545 UncachedIData | 338 | 281 | 364,571 | 43,532 RawRead | 174 | 175 | 4,536 | 4,389 CachedIData -36% time, -88% allocations UncachedIData -17% time, -88% allocations RawRead unchanged (control) Combined with the LinearReadCache scope API in dotnet#129345, the UncachedIData variant drops further to 226 ms / 4,389 KB — matching RawRead's allocation profile entirely, as JIT escape analysis is then able to stack-allocate the per-object `Data.Array` instances. All 2,605 cdac unit + DataGenerator tests pass on a clean rebuild. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag |
There was a problem hiding this comment.
Pull request overview
This PR updates the cdac source generator and contract registry behavior to enable caching of LayoutSet.Resolve(target, _typeNames) results per target, reducing repeated allocations during high-volume IData<T> construction. It does so by introducing a generator-emitted host-internal contract + extension helper and by teaching registries (and test stubs) to fall back to an empty-string “default” registration when a contract version isn’t advertised.
Changes:
- Add a generator-emitted
IGeneratedTypecontract +TargetExtensions.GetCachedLayoutSet(...)helper to cache resolvedLayoutSetinstances keyed by the generated_typeNamesarrays. - Change generated code emission to use
GetCachedLayoutSetinstead of callingLayoutSet.Resolvedirectly; makeLayoutSetasealed classto support reference sharing. - Update
CachingContractRegistryand test registries to support empty-string fallback contract resolution behavior.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/native/managed/cdac/tests/TestInfrastructure/TestPlaceholderTarget.cs | Updates test contract registry to support empty-string fallback resolution. |
| src/native/managed/cdac/tests/DataGenerator/TestTarget.cs | Enhances generator test target’s contract registry to support registrations, caching, and fallback behavior. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader/CachingContractRegistry.cs | Implements empty-string fallback behavior in the production registry’s contract resolution. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/ContractRegistry.cs | Documents the empty-string “default” registration and fallback semantics. |
| src/native/managed/cdac/gen/LayoutSetSource.cs | Changes emitted LayoutSet from readonly struct to sealed class. |
| src/native/managed/cdac/gen/GeneratedTypeContractSource.cs | Adds generator source for IGeneratedType, GeneratedType_1, and TargetExtensions.GetCachedLayoutSet. |
| src/native/managed/cdac/gen/Emitter.cs | Switches generated IData<T> codegen to use cached layout sets. |
| src/native/managed/cdac/gen/CdacGenerator.cs | Gates and emits the new helper contract source when not already accessible. |
The name "IGeneratedType" was ambiguous (a contract about generated types vs a contract holding generator-emitted state). "IGeneratedTypeCache" makes the intent — caching the type-lookup work the source generator performs — explicit. Renamed: IGeneratedType -> IGeneratedTypeCache GeneratedType (marker)-> GeneratedTypeCache GeneratedType_1 -> GeneratedTypeCache_1 GeneratedTypeContractSource.cs -> GeneratedTypeCacheContractSource.cs HintName "GeneratedTypeContract.g.cs" -> "GeneratedTypeCacheContract.g.cs" No behavior change. All 2,605 cdac unit + DataGenerator tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Caching layouts upon LayoutSet.Resolve is fine as an implementation detail, but I do not see why this needs to be a contract. |
Contracts are the way the target maintains state. Without making a contract here, we'd need a special hook on the Target interface to manage the lifecycle of the cache. The contract is not exposed publicly and is really just an implementation detail of the datagenerator. |
- Restrict empty-string fallback to "no target-declared version" case only. If the target advertises a version but no implementation is registered for it, fail-fast as before — preserves version-skew detection. Applied to both CachingContractRegistry and the TestPlaceholderTarget stub. (Copilot review comment 1.) - Trim verbose doc comments on ContractRegistry.Register, GeneratedTypeCacheContractSource, and supporting types. Drops the unintended thread-safety implication noted by Copilot review and reduces overall comment volume. (Copilot review comment 3.) (Copilot review comment 2 about forwarding Flush in TestPlaceholderTarget was not addressed — the test stub does not exercise Flush regardless.) All 2,605 cdac unit + DataGenerator tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
We have a DataCache on the target. Is that not state? |
/// <summary>
/// Holds a snapshot of the target's structured data
/// </summary>
public interface IDataCache
{
/// <summary>
/// Read a value from the target and cache it, or return the currently cached value
/// </summary>
/// <typeparam name="T">The type of data to be read</typeparam>
/// <param name="address">The address in the target where the data resides</param>
/// <returns>The value that has been read from the target</returns>
T GetOrAdd<T>(TargetPointer address) where T : Data.IData<T>;
/// <summary>
/// Return the cached value for the given address, or null if no value is cached
/// </summary>
/// <typeparam name="T">The type of the data to be read</typeparam>
/// <param name="address">The address in the target where the data resides</param>
/// <param name="data">On return, set to the cached data value, or null if the data hasn't been cached yet.</param>
/// <returns>True if a copy of the data is cached, or false otherwise</returns>
bool TryGet<T>(ulong address, [NotNullWhen(true)] out T? data);
/// <summary>
/// Clear all cached data
/// </summary>
void Clear();
}The |
I am very much aware of this. I see no reason why you couldn’t add another cache specifically for layouts. It would not claim to be a contract, a general-purpose caching system, or anything else other than a cache for layouts. We have a IDataCache for IData types, and plan to have a ReadCache for memory. Neither of them needs to be part of a general-purpose caching system, because we know they have different functions and different needs. This is no different. |
| // Target declares a version — require an implementation for it. | ||
| // Do NOT fall back to the default registration in this case: a | ||
| // missing version-specific impl is a real version-skew failure | ||
| // and silently using a default would mask it. | ||
| if (!_creators.TryGetValue((typeof(TContract), version), out creator)) |
| IncrementalValueProvider<(bool EmitLayoutSet, bool EmitTypeNameResolver, bool EmitGeneratedTypeCacheContract)> shouldEmitHelpers = context.CompilationProvider | ||
| .Select(static (compilation, _) => ( | ||
| EmitLayoutSet: !IsTypeAccessible(compilation, LayoutSetSource.FullyQualifiedName), | ||
| EmitTypeNameResolver: !IsTypeAccessible(compilation, TypeNameResolverSource.FullyQualifiedName))); | ||
| EmitTypeNameResolver: !IsTypeAccessible(compilation, TypeNameResolverSource.FullyQualifiedName), | ||
| EmitGeneratedTypeCacheContract: !IsTypeAccessible(compilation, GeneratedTypeCacheContractSource.FullyQualifiedName))); |
| return false; | ||
| } | ||
|
|
||
| resolved = creator(_target); |
| else if (_creators.TryGetValue((typeof(TContract), string.Empty), out var fallback)) | ||
| { | ||
| // No target-declared version — fall back to the empty-string default. | ||
| resolved = fallback(_target); |
I don't want to change the Is there a reason you don't like using an internal contract for this purpose? |
Could this be an implementation detail of LayoutSet instead? |
Contracts are supposed to give us information about a specific area of the runtime. This “contract” is just an implementation detail of the cDAC itself. Yes it is internal, but it is still stretching the intended use. |
Note
This PR description was drafted with assistance from GitHub Copilot CLI; the changes themselves were co-authored interactively with the author.
Summary
Caches the result of
LayoutSet.Resolve(target, _typeNames)so each cdac-generatedIData<T>constructor reuses its per-(target, typeNames) layout rather than rebuilding it. On a 1M-object heap walk this dropsData.Arrayconstruction allocations by ~88% and wall time by 17–36% depending on call site.Why
The cdac source generator emits, for every
IData<T>class, a constructor body like:LayoutSet.Resolveallocates aList<LazyLayout>and a backing array on every call, but the result is a pure function of(Target, _typeNames). Both inputs are stable for the lifetime of a target snapshot —_typeNamesis a static array the generator emits once per data class. So the result can (and should) be cached.How
The cache is stored on a target-agnostic contract —
IGeneratedTypeCache— emitted by the cdac source generator alongsideLayoutSetandTypeNameResolver:GeneratedTypeCache_1is the implementation: aDictionary<string[], LayoutSet>withReferenceEqualityComparer(the generator emits string[] literals once per data class, so reference equality is the correct comparer and gives 100% hit rate after the first instantiation).A
TargetExtensions.GetCachedLayoutSetextension method (also generator-emitted) lazily registersIGeneratedTypeCacheon first use, so generated code can calltarget.GetCachedLayoutSet(_typeNames)without any explicit registration step in the host.The
Emitteris updated to emit_target.GetCachedLayoutSet(_typeNames)in place ofLayoutSet.Resolve(_target, _typeNames).LayoutSetis also changed fromreadonly structtosealed classso cached entries are shared by reference rather than copied on everyDictionarylookup and method dispatch.Empty-version registration special-case
IGeneratedTypeCacheis a host-side contract — it has no target-side data and is not advertised by the target's contract descriptor. To wire it into the existingContractRegistry.Register(version, creator)API without adding a new method, this PR establishes a convention:Register("", creator)— registers an implementation under the empty-string version.TryGetContract, when the target does not advertise a version for the contract, the registry falls back to the empty-string registration if any is present.This lets the source generator hook the cache contract into the existing registry without a new abstract API.
Benchmark
1M small-string heap walk, three implementations of
Object.GetSize'sm_NumComponentsread, medians of 10 iterations (standalone HeapWalkBench harness, not part of this PR):CachedIData(ProcessedData.GetOrAdd<Data.Array>)UncachedIData(new Data.Array(...))RawRead(target.Read<uint>(addr + offset))The
RawReadcontrol variant doesn't go through any IData ctor so it shouldn't move — and doesn't.Stacked with #129345's
LinearReadCachescope API, theUncachedIDatavariant drops further to 226 ms / 4,389 KB — matchingRawRead's allocation profile entirely, because JIT escape analysis is then able to stack-allocate the per-objectData.Arrayinstances (theLinearReadCachemakes the read path a direct call instead of an opaque delegate).Tests
All 2,605 cdac unit + DataGenerator tests pass on a clean rebuild. Both
ContractRegistrytest stubs (TestPlaceholderTarget,DataGenerator/TestTarget) are updated to honor the empty-version fallback rule.Out of scope / follow-ups
IGeneratedTypeCachereference per-Target in a fast-access slot would eliminate the remaining ~14%TryGetContractcost per IData ctor (per profile data). Deferred — broader Target-level implications.LayoutSetmeasured at ~3% additional wall-time win. Deferred — small enough to live in its own PR with its own measurement story.HeapWalkBench+GcStressHeapdebuggee) was developed for this work but is not included; happy to submit separately if useful.