Code is should be fairly self explanatory.
This code processes ntuples that are produced from CMSSW and makes a bunch of histograms.
Actually takes two different inputs: one for HLT and one for Offline
Matches events between the two using Run and Event numbers
Outputs from CMSSW reco: HLT and Offline are inputs to this code.
Almost all "analysis" that we do uses these histograms.
eg: ratios of histograms in different folders can give efficiencies or fake rates as a function of any variable plotted
ROOT file has a bunch of directories which contain histograms
cutflow- event-level cut flow (helpful for debugging)cutflowJets- offline Jet-level cut flow (helpful for debugging)Events- Event level plots*Muons- plots of muons with various selection applied*Elecs- plots of electrons with various selection appliedoffJets*- plots of offline jets with various selection appliedpfJets*- plots of online PF jets with various selection appliedcaloJets*- plots of online Calo jets with various selection appliedoffTracks*- plots of offline tracks with various selection applied*Vtx- plots of the primary vertex quantities
offJets- plots of all reconstructed offline jetsoffJets_matched- plots of the subset of offline jets matched to HLT (PF) jetsoffJets_matchedJet- plots of the HLT (PF) jets that matched the offline jetsoffJets_matched_L-offline_matched jetstruth-matched to light-flavor (_L)offJets_matched_B-offline_matched jetstruth-matched to a b-quark (_B)offJets_matched_C-offline_matched jetstruth-matched to a c-quark (_C)offJets_matchedJet_L- offline_matchedJet (ie the PF jet that matched the offline jet) truth-matched to light-flavor (_L)offJets_mathcedPFcsvTag- offline jets that match a CSV btagged HLT PF jet (check code or offJets_mathcedPFcsvTagJet plots for PF csv cut value used)offJets_mathcedPFcsvTagJet- CSV-btagged PF jets which matches an offline jet (check code or plots for PF csv cut value used)offJets_mathcedPFDeepTag- offline jets that match a DeepCSV btagged HLT PF jet (check code or offJets_mathcedPFcsvTagJet plots for PF csv cut value used)offJetsTight_matchedPFJet- PF jets matching an offline jets btagged with tight DeepCSV working pointoffJetsMedium_matchedPFJet- PF jets matching an offline jets btagged with medium DeepCSV working pointoffJetsLoose_matchedPFJet- PF jets matching an offline jets btagged with loose DeepCSV working pointoffJetsMedDeepCSV_matchedPFJet- offline jets pass medium DeepCSV btag that are matched to a PF jetoffJetsMedDeepCSV_matchedPFDeepCSV- offline jets pass medium DeepCSV btag that are matched to a online DeepCSV btagged PF jetoffJetsMedDeepCSV_matchedPFCSV- offline jets pass medium DeepCSV btag that are matched to a online CSV btagged PF jetpfJets- plots of all reconstructed HLT PF jetspfJets_matched- plots of all reconstructed HLT PF jets matched to an offline jetpfJets_matchedJet- plots of the offline-jet that was matched to the HLT PF jets (Note: pfJets_matchedJet should be the same as offJets_matched same with pfJets_matched and offJets_matchedJets)
offTracks- all offline tracksoffTracks_matched- offline tracks matched to online tracks (in PF jets)offTracks_unmatched- offline tracks that are not matched to online tracks (in PF jets)pfTracks- all tracks in PF jetspfTracks_matched- tracks in PF jet that are matched to an offline trackpfTracks_unmatched- tracks in PF jet that are not matched to an offline trackoffBTags- plots of tracks used as inputs to the btagging (in offline jets)offBTags_matched- subset of tracks used as input to btagging matched to online trackpfBTags- plots of tracks used as inputs to the btagging (in PF jets)pfBTags_unmatched- subset of tracks used as inputs to the btagging (in PF jets) without an offline track match
- histograms directly with in the folder are plots of jet-level quantities.
- sub-folders are plots of objects associated to the jet eg: tracks / btagging inputs
- btagging inputs include: Secodrary Vertex info / tracks (that pass btagging track selection)
- some directories have more or less information depending on the detail string that is passed when the histograms are created (used to cut down on the output file size)
Two packages used
nTupleAnalysis - code not specific to the b-jet trigger studies.
Baseclass used by several different contexts.
TriggerStudies - code specific to the b-jet trigger studies.
This is the code that performs the trigger analysis.
- Has containers that hold object-level data (Data-objects)
- Has classes that define and fill object-level histograms (Histogram-classes)
- Almost all histograms produced come from object-level histograms in
nTupleAnalysiseg:
jetHists/trackHistsect - advantage of using
nTupleAnalysisis can share code in different contexts.
Idea: Build data structures at the object-level that are easier to work with.
jetData.cc , trackData.cc ect.
-
We read in "flat ntuples"
-
ntuples organized "by row" meaning have list of variables, entry for each object
eg: jet_Pt = [pt_jet1, pt_jet2 , ... ] and jet_Eta = [eta_jet1, eta_jet2, ...]
Input grouped by jet-variable not by jet
-
Much more useful and physically motivated to group by jet instead of by variable
ie: can group "by column", where we create "jet-object" that is filled with each jet-variable for given jet
eg: jet1.pt = jet_Pt[1] jet1.eta = jet_Eta[1] ect
-
Whole point of
*Data.ccclasses is to do the "by row" -> "by column" conversion -
Result is a list of objects (simple c++ container classes) that have all variables associated to the given object
-
Downstream physics analysis then uses these objects: make analysis logic and filling histograms cleaner/easier
-
Simple object-level data structures located in corresponding header files. eg: jet class defined in
jetData.h -
Point of
nTupleAnalysisis to hide the details of the input ntuple from youAll the thinking is happening at the object-level not the variable level
straight-forward
nTupleAnalysis take away: No high-level thought happening in the baseclass. Just reading things in, and filling histograms. No complicated selection no fancy physics calculation.
This is where more of the finesse is happening.
- does selection
- does object-matching
- defines and fills histograms
Here is Example run script (input files on lxplus)
BTagAnalyzer TriggerStudies/NtupleAna/scripts/BTagAnalyzer_cfg.py \
--inputAOD /eos/cms/store/group/phys_btag/HLTPhaseII/INPUTS/26_02_20_NewOfflineLikeSequence/Offline/ttbar_PU200.root \
--inputRAW /eos/cms/store/group/phys_btag/HLTPhaseII/INPUTS/26_02_20_NewOfflineLikeSequence/Online/ttbar_TrackingV0_PU200.root \
-o $PWD \
-y phase2 \
--histogramming 1 \
--histFile hists_Phase2_TrackingV0_PU200_NewOff.root \
--isMC \
--doTracks \
--nevents -1This command calls a c++ executable and configures it with a python file and various command line input options.
-
BTagAnalyzer- name of c++ executable (source in /bin ) -
TriggerStudies/NtupleAna/scripts/BTagAnalyzer_cfg.py- configuration file written in python that gets passed to the c++ executable -
other lines are options passed to the python configuration.
--input* paths to the input files
-o where to put the output histogram
--histFile name of output histogram
The executable lives in bin/BTagAnalyzer.cc
Whole point of this code is to read-in the configuration and pass everything to BTagAnlysis class (src/BTagAnalysis.cc) which does the work event loop.
Configuration file scripts/BTagAnalyzer_cfg.py
Reads in the command line options, passes them to the c++ executable BTagAnalyzer
Code with the analysis is src/BTagAnalysis.cc
Does four things
- Creates
eventData - Does the event loop
- Does object matching (and truth labeling)
- Defines and fills histograms
Class in TriggerStudies package that loads the various objects-level offJet / PFJets ect. data (deined
- simple structure that holds data for the whole event
- meta-version of the object-level data classes in the
nTupleAnalysis - configures which object-level data to hold
- most of work forwarded to the
nTupleAnalysisbaseclass eventData::updategets called every event,the calls the functions that do the "by-row" -> "by-column" conversion
- hold vectors of the column-sorted objects (
offJets/pfJetsetc.)
Histograms are defined in the constructor (called once per job) eg:
hOffJets_matched_B = new nTupleAnalysis::jetHists("offJets_matched_B", fs, "", jetDetailString );here:
offJets_matched_B- specifies directory name in the output rootfilejetDetailString- is a string that encodes the level of detail of plots produced
Also called once per job.
- loops over the events and actually does the analysis
All the action happens within this loop. (and subroutines within this loop) Best to look at the code to see whats done.