diff --git a/TrkQual/README.md b/TrkQual/README.md
index 17c227e..f08005b 100644
--- a/TrkQual/README.md
+++ b/TrkQual/README.md
@@ -19,12 +19,7 @@ The [jupyter notebook](TrkQualTrain.ipynb) contains lots of information close to
 For those who are interested in either (a) retraining the current algorithm (e.g. we have updated reconstruction), or (b) investigating or updating an old model
 
 ### General Overview
-There are two steps to releasing an updated TrkQual algorithm:
-
-1. Train the algorithm and save the model as an ONNX file, and
-1. Convert the trained model into C++ inference code to copy into Offline
-
-Each of these will be done in a different environment.
+The [jupyter notebook](TrkQualTrain.ipynb) will create an ONNX file. This can be read by the TrackQuality ArtAnalysis module.
 
 ### General Setup
 You will need to create your own fork of the repository:
@@ -51,14 +46,14 @@ For training, you need to ssh into a mu2egpvm machine with a port forwarded, and
 ssh -L XXXX:localhost:XXXX username@mu2egpvmYY.fnal.gov # XXXX is any port number, and YY is the gpvm number
 cd /path/to/your/work/area/
 mu2einit
-pyenv rootana 2.0.0
+pyenv ana
 ```
 
 You can start a jupyter notebook like so:
 
 ```
 cd MLTrain/TrkQual
-jupyter-notebook --no-browser --port=XXXX # XXXX is the same port that you forwarded when you ssh'd in
+jupyter lab --no-browser --port=XXXX # XXXX is the same port that you forwarded when you ssh'd in
 ```
 
 and copy and paste the URL to your browser to open it.
@@ -72,54 +67,17 @@ Make any changes that you want to make:
 * if you want to modify the ANN1 model (e.g. change structure, or activation functioon), then I would copy it into this new cell and call it ANN2
 * if you want to try a brandh new model (e.g. a BDT), then you may need to write a new ```save_func``` etc.
 
-Once ready, click "Kernel->Restart & Run All". You will see a bunch of plots, including some comparisons to previous models. Your model will be saved in the model/ directory along with a ```*plots.root``` file containing histograms.
-
-
-### Converting a Model for Use in Offline
-There are two things we need to get the model running in Offline:
-* a ```.hxx``` file containing code, and
-* a ```.dat``` file containing parameters
-
-For creating inference code, we use a different environment than for training:
-
-```
-ssh username@mu2egpvmYY.fnal.gov
-cd /path/to/your/work/area/
-mu2einit
-muse setup EventNtuple
-cd MLTrain/TrkQual/
-```
+Once ready, click "Kernel->Restart & Run All". You will see a bunch of plots, including some comparisons to previous models. Your model will be saved as a ```.onnx``` file in the model/ directory along with a ```*plots.root``` file containing histograms.
 
-You can then generate the inference code using TMVA::SOFIE like so:
+The ```.onnx``` file can be copied into ArtAnalysis like so:
 
 ```
-root -l -b scripts/CreateInference.C\(\"TrkQual_ANN1_v2\"\)
+cp model/TrkQual_ANN1_v2.onnx ../ArtAnalysis/TrkDiag/data/
 ```
 
-If you did not change the model, then you should just need to copy the .dat file to Offline. However, we have found that TMVA::SOFIE sometimes changes the node names and so a new .hxx file is made with a new .dat file. If the structure of the ANN truly hasn't changed then, instead of copying the new .hxx file, you can convert the .dat file from the new format to the old format like so:
-
-```
-python3 scripts/sortdat.py code/TrkQual_ANN1_v2.dat code/TrkQual_ANN1_v2.dat_conv
-```
-
-(Note: you may need to change the new node names in the ```name_dict``` dictionary. The left-hand strings are the new names, and the right-hand strings are the names we want to convert to)
-
-You can then copy the converted file to Offline like so:
-
-```
-cp code/TrkQual_ANN1_v2.dat_conv ../Offline/TrkDiag/data/TrkQual_ANN1_v2.dat
-```
-
-and make sure that the new .dat file is used in the TrackQuality module. (For example, change EventNtuple/fcl/prolog.fcl)
-
-If you modified the ANN model, then you need to copy both the .hxx and .dat file
-
-```
-cp code/TrkQual_ANN2_v1.hxx ../Offline/TrkDiag/inc/
-cp code/TrkQual_ANN2_v1.dat ../Offline/TrkDiag/data/
-```
+and make sure that the new .onnx file is used in the TrackQuality module. (For example, change EventNtuple/fcl/prolog.fcl)
 
-and make sure that the new model is implemented correctly the TrackQuality module.
+If you modified the ANN model (e.g. added new variables), then you will need to make sure the new model is implemented correctly the TrackQuality module.
 
 If you trained a different model, then you are entering new territory and should discuss with experts how best to implement. Either:
 * we make the ```TrackQuality``` module model agnostic, or
diff --git a/TrkQual/scripts/CreateInference.C b/TrkQual/scripts/CreateInference.C
deleted file mode 100644
index 2f5a87f..0000000
--- a/TrkQual/scripts/CreateInference.C
+++ /dev/null
@@ -1,30 +0,0 @@
-/// \file
-/// \ingroup tutorial_tmva
-/// \notebook -nodraw
-/// This macro parses a .onnx file
-/// into RModel object and further generating the .hxx header files for inference.
-///
-/// \macro_code
-/// \macro_output
-/// \author Sanjiban Sengupta
-/// modified by A. Edmonds (2025)
-
-#include "root/TMVA/RModel.hxx"
-#include "root/TMVA/RModelParser_ONNX.hxx"
-#include
-
-using namespace TMVA::Experimental;
-
-//bname is the base name the model saved as
-void CreateInference(const char* bname,const char* suffix=""){
-  std::string modelname = std::string("model/") + std::string(bname) + std::string(".onnx");
-  std::string infername = std::string("code/") + std::string(bname) +std::string(suffix) + std::string(".hxx");
-
-  SOFIE::RModelParser_ONNX parser;
-  SOFIE::RModel model = parser.Parse(modelname, true);
-
-  //Generating inference code
-  model.Generate();
-  // write the code in a file (by default Linear_16.hxx and Linear_16.dat
-  model.OutputGenerated(infername);
-}
diff --git a/TrkQual/scripts/sortdat.py b/TrkQual/scripts/sortdat.py
deleted file mode 100644
index bf2463b..0000000
--- a/TrkQual/scripts/sortdat.py
+++ /dev/null
@@ -1,74 +0,0 @@
-#
-# This script renames the nodes in the .dat file
-# so that they are compatible with the original inference code
-#
-# Original Author: Jason Guo (LBL/UCB)
-# Modified for TrkQual: A. Edmonds
-#
-import sys
-
-name_dict = { "tensor_sequentialdenseBiasAddReadVariableOp0 7\n" : "tensor_densebias0 7\n",
-              "tensor_sequentialdense1MatMulReadVariableOp0 49\n" : "tensor_dense1kernel0 49\n",
-              "tensor_sequentialdense1BiasAddReadVariableOp0 7\n" : "tensor_dense1bias0 7\n",
-              "tensor_sequentialdense2BiasAddReadVariableOp0 6\n" : "tensor_dense2bias0 6\n",
-              "tensor_sequentialdense2MatMulReadVariableOp0 42\n" : "tensor_dense2kernel0 42\n",
-              "tensor_sequentialdense3BiasAddReadVariableOp0 1\n" : "tensor_dense3bias0 1\n",
-              "tensor_sequentialdenseMatMulReadVariableOp0 49\n" : "tensor_densekernel0 49\n",
-              "tensor_sequentialdense3MatMulReadVariableOp0 6\n" : "tensor_dense3kernel0 6\n"
-            }
-
-order = [
-    "tensor_dense3bias0 1\n",
-    "tensor_dense3kernel0 6\n",
-    "tensor_dense2bias0 6\n",
-    "tensor_dense2kernel0 42\n",
-    "tensor_dense1bias0 7\n",
-    "tensor_dense1kernel0 49\n",
-    "tensor_densebias0 7\n",
-    "tensor_densekernel0 49\n"
-]
-
-
-def sort_tensors_with_values(lines):
-
-    tensors = {}
-    #print(lines)
-    for line in lines:
-        if line.startswith('tensor'):
-            current_label = line
-            tensors[current_label] = 0
-        elif current_label:
-            tensors[current_label] = line
-    sorted_lines = []
-    for i in range(len(order)):
-        sorted_lines.append(order[i])
-        if i == len(order) - 1:
-            sorted_lines.append(tensors[order[i]])
-        else:
-            sorted_lines.append(tensors[order[i]])
-    #print(sorted_lines)
-    with open(sys.argv[2], 'w') as file:
-        file.writelines(sorted_lines)
-    return sorted_lines
-
-def rename_nodes(lines):
-    renamed_lines = []
-    for line in lines:
-        if line in name_dict.keys():
-            renamed_lines.append(name_dict[line])
-        else:
-            renamed_lines.append(line)
-    return renamed_lines
-
-# Example usage:
-
-if __name__ == "__main__":
-
-
-    with open(sys.argv[1], 'r') as file:
-        lines = file.readlines()
-
-    renamed_lines = rename_nodes(lines)
-    print(renamed_lines)
-    result = sort_tensors_with_values(renamed_lines)
-    print(result)