Replace Delegate MethodInfo cache with the MethodDesc #99200
MichalPetryka wants to merge 65 commits into
I'm not sure what's up with the failures here; the tests that fail on CI seem to pass on my machine.
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
/azp run runtime-coreclr gcstress0x3-gcstress0xc
Azure Pipelines successfully started running 1 pipeline(s).
Could you please collect some perf numbers to give us an idea about the improvements and regressions in the affected areas? We may want to do some optimizations to mitigate the regressions.
The existing code tries to compare MethodInfos as a cheap fast path. Most delegates do not have cached MethodInfo, so this fast path is hit rarely - but it is very cheap, so it is still worth it. This cheap fast path is not cheap anymore with this change. It may be best to delete the fast path that is trying to compare the MethodInfos and potentially optimize |
I think the things that need benchmarking here are equality checks and maybe the GC impact of collectible delegates being stored in the CWT; the rest shouldn't be performance sensitive enough to matter, I think? I'm not fully sure what the proper way to benchmark the latter would be.
I am going to benchmark the impact of the equality change tomorrow. If the impact isn't big, I feel potential optimizations to it can be done later.
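For readers who want a feel for what benchmarking the equality checks could look like, here is a minimal Stopwatch-based sketch. It is not the benchmark used in this PR: the `DelegateEqualityBench` class, its `Target` type, and the method names are hypothetical, and a real measurement would more likely use BenchmarkDotNet. Instance methods are used so that each delegate is a distinct object, and the unequal pair is included because, per the PR description, comparisons of different delegates are the ones expected to reach the method comparison.

```csharp
using System;
using System.Diagnostics;

public static class DelegateEqualityBench
{
    // Hypothetical target type; instance methods are used so that each
    // method-group conversion below allocates a fresh delegate object.
    public sealed class Target
    {
        public int AddOne(int x) => x + 1;
        public int AddTwo(int x) => x + 2;
    }

    // Calls Delegate.Equals in a loop and reports the elapsed time plus how
    // many comparisons returned true (the counter keeps the result observable).
    public static TimeSpan Time(Delegate left, Delegate right, int iterations, out int equalCount)
    {
        equalCount = 0;
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            if (left.Equals(right)) equalCount++;
        }
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Run(int iterations)
    {
        Target t = new Target();
        Func<int, int> a = t.AddOne;
        Func<int, int> b = t.AddOne; // distinct instance, same target and method
        Func<int, int> c = t.AddTwo; // same target, different method

        TimeSpan equal = Time(a, b, iterations, out int hits);
        TimeSpan unequal = Time(a, c, iterations, out int misses);
        Console.WriteLine($"equal:   {equal.TotalMilliseconds} ms ({hits} matches)");
        Console.WriteLine($"unequal: {unequal.TotalMilliseconds} ms ({misses} matches)");
    }
}
```

Calling `DelegateEqualityBench.Run(10_000_000)` from a `Main` method on builds with and without the change gives a rough comparison; the absolute numbers depend on the machine and runtime.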
Write a small program that loads an assembly as collectible and calls a method in it. The method in the collectible assembly can create delegates in a loop. (If you would like to do it under BenchmarkDotNet, that works too, but it is probably more work.)
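A single-file variation on that suggestion can be sketched as follows. Note the swap: instead of loading a separate assembly into a collectible `AssemblyLoadContext`, this sketch emits a tiny method into a `RunAndCollect` dynamic assembly, and the host (not the collectible code) creates the delegates through `MethodInfo.CreateDelegate`. That goes through the reflection binding path, so it is not equivalent to delegates created by compiled code inside a collectible assembly. The class and method names are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;
using System.Reflection.Emit;

public static class CollectibleDelegateBench
{
    // Emits `static int AddOne(int x) => x + 1` into a collectible
    // (RunAndCollect) dynamic assembly and returns its MethodInfo.
    public static MethodInfo EmitCollectibleTarget()
    {
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("CollectibleBenchTargets"), AssemblyBuilderAccess.RunAndCollect);
        ModuleBuilder module = assembly.DefineDynamicModule("Main");
        TypeBuilder type = module.DefineType("Targets", TypeAttributes.Public);
        MethodBuilder method = type.DefineMethod(
            "AddOne", MethodAttributes.Public | MethodAttributes.Static,
            typeof(int), new[] { typeof(int) });

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);  // load x
        il.Emit(OpCodes.Ldc_I4_1); // load 1
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Ret);

        return type.CreateType().GetMethod("AddOne");
    }

    // Creates `count` delegates to the collectible method, invoking each once.
    // The checksum keeps the work observable.
    public static long CreateDelegates(MethodInfo target, int count)
    {
        long checksum = 0;
        for (int i = 0; i < count; i++)
        {
            var d = (Func<int, int>)target.CreateDelegate(typeof(Func<int, int>));
            checksum += d(i);
        }
        return checksum;
    }

    public static void Run(int count)
    {
        MethodInfo target = EmitCollectibleTarget();
        Stopwatch sw = Stopwatch.StartNew();
        long checksum = CreateDelegates(target, count);
        sw.Stop();
        Console.WriteLine($"{count} delegates in {sw.Elapsed.TotalMilliseconds} ms (checksum {checksum})");
        Console.WriteLine($"Gen0 collections so far: {GC.CollectionCount(0)}");
    }
}
```

Running `CollectibleDelegateBench.Run(1_000_000)` from a `Main` method before and after a change gives a coarse view of allocation and GC cost for delegates that point at collectible code.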
The optimization has several degrees: never calling the API is the best - delegate targets are not considered reflection targets. Calling the API on something typed as Delegate or MulticastDelegate disables the optimization completely - all delegate targets are reflection targets. But calling the API on something typed as a concrete delegate type (e.g. |
Results after latest cleanup:
@jkotas Seems like now it improves basically everything and has no CWT. Can you review again then? Sidenote: I have no idea why everything on main seems to have regressed from .NET 10. Did something cause BDN to measure differently now?
{
[ClassInterface(ClassInterfaceType.None)]
[ComVisible(true)]
[NonVersionable]
What do you expect this attribute to do?
The runtime has tight coupling with a number of core types. We do not use the NonVersionable attribute for that.
Since R2R already hardcodes parts of the layout, my thought was that it wouldn't hurt to do it fully and make it more apparent here.
return invocationList;
}

internal ReadOnlySpan<MulticastDelegate> GetInvocationsUnchecked()
The mix of object, Delegate and MulticastDelegate does not look good. Can we make it look more like NativeAOT?
I think there is an opportunity to get rid of one of the fields by making the implementation to be more like NAOT we have discussed some time back.
> The mix of object, Delegate and MulticastDelegate does not look good. Can we make it look more like NativeAOT?
I'd prefer to do that in a followup but I agree.
> I think there is an opportunity to get rid of one of the fields by making the implementation to be more like NAOT we have discussed some time back.
From what I remember, NativeAOT relied on function pointer equality a bit here which made it problematic, I'd leave such investigation for followup too.
Looking at the current version of change from a distance:
- It folds MethodInfo cache and invocation list fields. It makes the delegate instance smaller. I think smaller delegates are goodness.
- But then it uses the saved space to cache MethodDesc*. I get that having MethodDesc* cached in the delegate makes some operations faster. I am not sure whether burning the space for it is worth it.
- Plus there is some refactoring sprinkled through that is not strictly related.
> Looking at the current version of change from a distance:
> - It folds MethodInfo cache and invocation list fields. It makes the delegate instance smaller. I think smaller delegates are goodness.
> - But then it uses the saved space to cache MethodDesc*. I get that having MethodDesc* cached in the delegate makes some operations faster. I am not sure whether burning the space for it is worth it.
> - Plus there is some refactoring sprinkled through that is not strictly related.
Yeah I guess the timeline of the changes was:
- I started with moving the MethodInfo for FOH
- I had to introduce the MethodDesc to mitigate Equals regressions
- I had to move the fields around to keep the R2R layout, which I combined with general cleanup.
For the MethodDesc part, I've realised that it's going to be necessary to be like this if we want to use the invocation count to store a GC handle/frozen indicator since otherwise we'll conflict with open virtual methods there.
If you want I can split the PR in 2 or 3 parts, either methodinfo + methoddesc and general cleanup or do the methoddesc separately too.
> we'll conflict with open virtual methods
The open virtual case is rare (and generally a lot of pain to deal with). If we need to keep more information around for these rare cases, I think it is fine to lazily allocate an object with the extra information and stash it in _helperObject or _target. We can use the type of the object with the extra information to disambiguate the different rare cases if there is more than one.
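The idea of stashing a lazily allocated object and using its type to tell rare cases apart can be illustrated in plain managed code. This is only a sketch of the pattern, not the runtime's delegate layout: `DelegateLike`, `OpenVirtualInfo`, and `KeepAliveInfo` are hypothetical names, and only the `_helperObject` field name is taken from the comment above.

```csharp
using System;

public static class HelperObjectSketch
{
    // Hypothetical extra state for the rare open virtual case.
    public sealed class OpenVirtualInfo
    {
        public OpenVirtualInfo(string methodName) { MethodName = methodName; }
        public string MethodName { get; }
    }

    // Hypothetical extra state for another rare case.
    public sealed class KeepAliveInfo
    {
        public KeepAliveInfo(object keepAlive) { KeepAlive = keepAlive; }
        public object KeepAlive { get; }
    }

    // Stand-in for a delegate: one object-typed slot, null in the common case.
    public sealed class DelegateLike
    {
        private object _helperObject;

        // Only rare cases call this, so the common case allocates nothing extra.
        public void AttachRareCaseInfo(object info) => _helperObject = info;

        // The runtime type of the stashed object identifies which rare case applies.
        public string Describe()
        {
            switch (_helperObject)
            {
                case null: return "common case";
                case OpenVirtualInfo open: return "open virtual: " + open.MethodName;
                case KeepAliveInfo _: return "keep-alive";
                default: return "unknown";
            }
        }
    }
}
```

The trade-off in this pattern is an extra allocation and a type check in the rare cases, in exchange for no dedicated field on every instance.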
> I can split the PR in 2 or 3 parts
If you can break things down into smaller changes that do just one coherent thing that is an obvious improvement, they will move fast. For example, it is fine to submit a PR with the cleanup of wrapper delegate cruft that I have missed.
I've realised that we can remove methodDesc by reusing invocationList, so I did it here. We can thus make instances smaller while keeping most of the benefits.
While doing this, I also worked out how we can still add FOH support later.
Doing this requires touching most of the cleanup places so that complicates the split up here a bit. I'll still try to split off some parts though.
First attempt at making delegate GC fields immutable in CoreCLR so that they can be allocated on the NonGC heap.
I've checked it locally with a simple app and corerun, using a delegate from an unloadable ALC, and it did not seem to crash, assert, or unload the ALC from under the delegate. However, I couldn't find any runtime tests that verify delegates from unloadable ALCs work, so CI coverage might be missing.
One small point of concern is that this might make delegate equality checks slower since they rely on checking the methods in the last "slow path" check, which is however always hit for different delegates AFAIR.
Contributes to #85014.
cc @jkotas