Run the TrackQuality ANN directly from the onnx file with ONNX Runtime#4
AndrewEdmonds11 wants to merge 7 commits into
|
Took a look through this draft. Validation numbers match SOFIE to 4 decimal places, so behavior looks correct. A few structural items for when you come back to clean it up:
Blockers
Bugs / risks
Nits
|
Interface contract for the Mu2e Offline art::EDProducer that will consume the exported CaloClusterNet .onnx model, following the pattern in Andy Edmonds's Mu2e/ArtAnalysis#4.

Covers: model artifact metadata (opset 17, regenerate/validate commands); exact input/output tensor names, shapes, dtypes, dynamic axes; the six node and eight edge z-score stats as literal mean/std values from the train split so the C++ caller doesn't need to parse a PyTorch .pt blob for six floats; upstream graph construction (one graph per disk, r_max=210mm, dt_max=25ns, kNN fallback, degree cap); and the full CCN+BFS10 cluster-assembly recipe as pseudocode with every hyperparameter frozen (tau_edge=0.20, bfs_expand_cut=10 MeV, min_hits=2, min_energy=10 MeV).

Also embeds the 15c parity proof (max logit diff 9.06e-06, zero threshold flips on 166K val edges, 12.5x CPU speedup vs PyTorch) so the argument for trusting the deployment is in one place, and a list of open items Sophie and Andy need to decide at the integration meeting: central onnxruntime muse install status, module boundary between graph construction and inference, whether to reuse Offline's Calorimeter::neighbors/nextNeighbors vs porting the cKDTree builder, normalisation-stats sidecar format for C++, and a model versioning policy to catch silent tensor-layout drift.

Completes 15d in docs/plan.md. Milestone J is now 4/5; the remaining gate is externally blocked on the integration meeting itself.
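As an illustration of the "literal mean/std values" idea above, here is a minimal C++ sketch of applying z-score normalisation to the six node features from constants held in code. The names `ZScoreStats`, `kNodeFeatures` and `normalise` are hypothetical and not taken from the contract, and the real statistics would come from the train split; no real values are reproduced here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Six node features, as stated in the contract description.
constexpr std::size_t kNodeFeatures = 6;

// Hypothetical container for per-feature statistics. In a real module these
// would be filled with the literal train-split values from the contract.
struct ZScoreStats {
  std::array<float, kNodeFeatures> mean;
  std::array<float, kNodeFeatures> stddev;
};

// Apply z = (x - mean) / stddev to each feature independently.
std::array<float, kNodeFeatures> normalise(
    const std::array<float, kNodeFeatures>& x, const ZScoreStats& stats) {
  std::array<float, kNodeFeatures> z{};
  for (std::size_t i = 0; i < kNodeFeatures; ++i) {
    // A zero or negative standard deviation would mean the stats are corrupt.
    assert(stats.stddev[i] > 0.0f);
    z[i] = (x[i] - stats.mean[i]) / stats.stddev[i];
  }
  return z;
}
```

Keeping the statistics as plain arrays means the C++ side has no dependency on PyTorch serialisation, at the cost of having to update the constants by hand whenever the model is retrained. The eight edge features could presumably be handled the same way with a second array size.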
|
This PR is now ready for merging. Responses to the Copilot review:
Blockers:
Done in 95fe557
Bugs / risks:
Function removed in 0b918c0
Comment added in 8028fb4
Did not do
Did not do. I disagree that the latter reads more naturally than an explicit if statement
Nits:
Did not do
Did not do
Added the TODO in ac29a48, but I'm not sure we can guarantee a set number of tracks in each event
|
This PR contains a draft of the code needed to run the TrkQual ANN directly from the onnx file. This will remove TMVA::SOFIE from the workflow. This is a draft while I work out final details and remove the old code. However, it is validated:
where output is the original output and ORT is the value from the new ONNX Runtime code
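A comparison of this kind could be sketched as below. This is not the validation code from the PR, only a minimal illustration under assumptions: the function name `compareScores`, the struct `ComparisonSummary` and the choice of tolerance are hypothetical, and the two vectors stand for the per-track original output and ORT values.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary of how closely two sets of scores agree.
struct ComparisonSummary {
  double maxAbsDiff = 0.0;            // largest |output - ort| seen
  std::size_t nOutsideTolerance = 0;  // entries differing by more than tolerance
};

// Compare the original scores with the ONNX Runtime scores entry by entry.
ComparisonSummary compareScores(const std::vector<float>& output,
                                const std::vector<float>& ort,
                                double tolerance) {
  // Both vectors must describe the same tracks in the same order.
  assert(output.size() == ort.size());
  ComparisonSummary summary;
  for (std::size_t i = 0; i < output.size(); ++i) {
    const double diff = std::fabs(static_cast<double>(output[i]) -
                                  static_cast<double>(ort[i]));
    summary.maxAbsDiff = std::max(summary.maxAbsDiff, diff);
    if (diff > tolerance) {
      ++summary.nOutsideTolerance;
    }
  }
  return summary;
}
```

A tolerance of 1e-4 would be one way to express agreement to 4 decimal places; reporting the maximum difference alongside the count makes it easier to tell a systematic offset from a handful of outliers.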