Plotting and Cutflow Software

Introduction

We have some standard programs to make plots and dump cutflows from the ttH ntuples. These exploit TTree::Draw() to let the user draw quantities or count events with appropriate cross sections and corrections applied. The cuts and plots are specified in files using "yaml" syntax (see a very simple introduction here).

Installing the Software

The software comes as part of the ttH multileptonic code, which is unfortunately a rather large code base. Follow the check out procedures given in the CERN wiki page. The scripts are all in the HWWtthCode/other/stack area. Make sure you set up an athena release and RootCore every time you start a new session working with these scripts.

The scripts we're going to use are called do_cutflow.py and dump_plots2.py. They're part of a broader suite of programs based on stack.py, which allows you to do interactive querying and plotting of the MC and data from the command line. Basically do_cutflow.py and dump_plots2.py automate calls to stack.py.

The Input Data

You can use the files in /data/tth/v4_processed/trigpdf. Any output of our ntupler will be compatible with this software (see 13 TeV tth Analysis with MultiLepAnalysisNtupler for instructions to build and run that ntupler). The scripts expect the MC files to be named "DSID.root" and the data files to be named "period*.root". All files should be in the same directory.
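As a quick sanity check before running anything, the naming convention described above can be tested against a directory listing. The following is a minimal sketch and not part of HWWtthCode: the function name classify_inputs is hypothetical, and it assumes that a DSID consists only of digits.

```python
import re

# Assumed naming patterns, based on the description above:
#   MC:   "<DSID>.root", with the DSID assumed to be all digits
#   data: "period*.root"
MC_PATTERN = re.compile(r"^(\d+)\.root$")
DATA_PATTERN = re.compile(r"^period.*\.root$")


def classify_inputs(filenames):
    """Split file names into MC (keyed by DSID), data, and unrecognized."""
    mc, data, other = {}, [], []
    for name in filenames:
        match = MC_PATTERN.match(name)
        if match:
            mc[int(match.group(1))] = name
        elif DATA_PATTERN.match(name):
            data.append(name)
        else:
            other.append(name)
    return mc, data, other


# The file names below are made up for illustration only.
mc, data, other = classify_inputs(["123456.root", "periodA.root", "notes.txt"])
print("MC:", mc)
print("data:", data)
print("unrecognized:", other)
```

In practice you would pass sorted(os.listdir(filedir)) for the directory you intend to give to --filedir, and look at the unrecognized list for files that need renaming.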

The Configuration and Cross Section Files

The cross sections for all processes are specified in HWWtthCode/XsectionInput/Xsection8TeV_tth_bkg_v2.yaml and Xsection8TeV_tth_sig_v2.yaml. These are autogenerated from the corresponding .txt files in the same directory, which are much easier to read. By default the "priority 1" samples are the ones that are used, although in practice we override these choices with some regularity. When you run either the cutflow dumper or the plotter, they will first print out all the MC files they have loaded, so you can keep track of what is being done.

You are unlikely to ever need to edit config_8TeV.yaml, but for completeness, this is the file that maps the process names given in the cross section files to categories of processes (e.g. "single top" or "diboson"). Effectively it groups the DSIDs and configures how they will be displayed in the cutflows and plots (and what colors the histograms will be). The color choices are the result of a carefully negotiated agreement, so don't touch them unless really necessary. This file also contains weights for various Alpgen Z samples, which you can mostly ignore.

Making Cutflows

The cuts are specified in yaml files, in lists like:

- name: Three leptons, Mll
  cut: trilep_type>0&&passEventCleaning&&(lep_Pt_0>10e3&&lep_Pt_1>20e3&&lep_Pt_2>20e3)&&(top_hfor_type!=4)&&Mll01>12e3&&Mll02>12e3&&(abs(lep_ID_1)==13||lep_isVeryTightLH_1)&&(abs(lep_ID_2)==13||lep_isVeryTightLH_2)
- name: Trigger
  cut: trilep_type>0&&passEventCleaning&&(lep_Pt_0>10e3&&lep_Pt_1>20e3&&lep_Pt_2>20e3)&&(top_hfor_type!=4)&&Mll01>12e3&&Mll02>12e3&&(abs(lep_ID_1)==13||lep_isVeryTightLH_1)&&(abs(lep_ID_2)==13||lep_isVeryTightLH_2)&&(lep_Match_EF_mu24i_tight_0||lep_Match_EF_mu36_tight_0||lep_Match_EF_e24vhi_medium1_0||lep_Match_EF_e60_medium1_0||lep_Match_EF_mu24i_tight_1||lep_Match_EF_mu36_tight_1||lep_Match_EF_e24vhi_medium1_1||lep_Match_EF_e60_medium1_1||lep_Match_EF_mu24i_tight_2||lep_Match_EF_mu36_tight_2||lep_Match_EF_e24vhi_medium1_2||lep_Match_EF_e60_medium1_2)

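Each cut is specified as a new list item ( - ) which contains a name (the description that appears in the cutflow tex table) and the cut definition. Each cut definition is a complete selection in its own right: in the example above, the "Trigger" cut repeats the whole of the previous cut and then adds the trigger matching requirement.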
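Because every entry has to spell out its full selection, cumulative cutflows involve a lot of repeated text. One way to avoid copy-and-paste mistakes is to generate the yaml from a list of incremental selections. This is only a sketch: the helpers cumulative_cuts and to_yaml are hypothetical and not part of the software, and the selections in the example are shortened versions of the ones shown above.

```python
def cumulative_cuts(steps):
    """Turn (name, extra_selection) pairs into cumulative cutflow entries.

    Each entry's cut is the '&&' of all selections up to and including it.
    Every selection is wrapped in parentheses so that any '||' inside it
    keeps its meaning when combined with the others.
    """
    entries = []
    selections = []
    for name, selection in steps:
        selections.append("(%s)" % selection)
        entries.append({"name": name, "cut": "&&".join(selections)})
    return entries


def to_yaml(entries):
    """Write entries in the '- name: / cut:' layout used above.

    Names and cuts are written unquoted, so they must not contain
    ': ' or ' #', which yaml would interpret specially.
    """
    lines = []
    for entry in entries:
        lines.append("- name: %s" % entry["name"])
        lines.append("  cut: %s" % entry["cut"])
    return "\n".join(lines) + "\n"


steps = [
    ("Three leptons", "trilep_type>0&&passEventCleaning"),
    ("Mll", "Mll01>12e3&&Mll02>12e3"),
]
print(to_yaml(cumulative_cuts(steps)))
```

The printed text can be saved to a file and passed to do_cutflow.py in place of a hand-written cut list.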

The simplest execution of the code is
DISPLAY="" python do_cutflow.py standard_3l_cutflow.yaml --filedir '/data/tth/v4_processed/trigpdf' --texout mycutflow.tex
This will dump the expected yields for the cuts specified in standard_3l_cutflow.yaml to the output file mycutflow.tex. Since these cuts do not need to have any particular relationship to each other, you can make cutflows, or scan alternative signal regions, or whatever you would like to do.
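If you want to run several cutflow files in a row, the command above can be wrapped in a small Python driver. The sketch below is an assumption about how you might do this, not an existing script: the function names are hypothetical, and it uses only the --filedir and --texout options shown above. It must be run from the directory containing do_cutflow.py.

```python
import os
import subprocess


def cutflow_command(config, filedir, texout):
    """Build the argument list for the do_cutflow.py call shown above."""
    return ["python", "do_cutflow.py", config,
            "--filedir", filedir, "--texout", texout]


def batch_environment():
    """Copy the current environment with DISPLAY set to the empty string,
    matching the DISPLAY="" prefix used on the command line."""
    env = dict(os.environ)
    env["DISPLAY"] = ""
    return env


def run_cutflow(config, filedir, texout):
    """Run do_cutflow.py; raises CalledProcessError if it exits non-zero."""
    subprocess.check_call(cutflow_command(config, filedir, texout),
                          env=batch_environment())


# Show the command that would be run, without executing it.
print(" ".join(cutflow_command("standard_3l_cutflow.yaml",
                               "/data/tth/v4_processed/trigpdf",
                               "mycutflow.tex")))
```

Calling run_cutflow() in a loop over several yaml files then produces one tex file per configuration.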

To get a list of options, run
python do_cutflow.py - --help
(the extra hyphen is needed to tell ROOT not to interpret the --help itself). These options let you change the MC that is being used, tweak the output of the cutflow script, or change the event weights that are being used (this is important for running systematic variations, but not for general running).

Some standard configuration files:

standard_3l_cutflow.yaml: shows the standard 3l cutflow.

Dumping Plots

Two different things need to be specified for plots:

which selections (cuts) you want to apply for the events to appear in the plots,
which plots you want to make.

These are specified in two sections of the yaml file:

cuts:
- name: standardSR_l3
  label: 3l
  cut: trilep_type>0&&passEventCleaning&&(lep_Pt_0>10e3&&lep_Pt_1>20e3&&lep_Pt_2>20e3)&&(top_hfor_type!=4)&&Mll01>12e3&&Mll02>12e3&&(abs(lep_ID_1)==13||lep_isVeryTightLH_1)&&(abs(lep_ID_2)==13||lep_isVeryTightLH_2)&&(lep_Match_EF_mu24i_tight_0||lep_Match_EF_mu36_tight_0||lep_Match_EF_e24vhi_medium1_0||lep_Match_EF_e60_medium1_0||lep_Match_EF_mu24i_tight_1||lep_Match_EF_mu36_tight_1||lep_Match_EF_e24vhi_medium1_1||lep_Match_EF_e60_medium1_1||lep_Match_EF_mu24i_tight_2||lep_Match_EF_mu36_tight_2||lep_Match_EF_e24vhi_medium1_2||lep_Match_EF_e60_medium1_2)&&abs(total_charge)==1&&((nJets_OR_MV1_70>=1&&nJets_OR>=4)||(nJets_OR_MV1_70>=2&&nJets_OR==3))&&(lep_ID_0!=-lep_ID_1||(Mll01<81e3||Mll01>101e3))&&(lep_ID_0!=-lep_ID_2||(Mll02<81e3||Mll02>101e3))
- name: standardSR_l4Zdepleted
  label: 4l Z depleted
  cut: passEventCleaning&&(top_hfor_type!=4)&&quadlep_type>0&&(lep_Pt_0>25e3&&lep_Pt_1>15e3)&&passTriggerMatch&&abs(total_charge)==0&&minOSSFMll==0&&passZVeto&&nJets_OR>=2&&nJets_OR_MV1_70>=1&&100e3<Mllll0123&&Mllll0123<500e3

for cuts, and

plots:
    - name: lep_Pt_0
      x: lep_Pt_0/1e3
      xlabel: 'p_{T}(lepton 0)'
      rng: [0,200]
      nbins: 20
      units: GeV
    - name: 'lep_Eta_0'
      x: 'lep_Eta_0'
      xlabel: '#eta(lepton 0)'
      rng: [-3,3]
      nbins: 20

for the plot specifications.
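Since the programs are built on TTree::Draw(), it can help to think of each plot entry as a histogram request of the form "expression>>name(nbins,xmin,xmax)", which is standard TTree::Draw() syntax. The sketch below shows that correspondence for the fields x, name, nbins and rng. It is an illustration only: the function draw_expression and the "h_" histogram name prefix are hypothetical, and the actual handling inside stack.py may differ.

```python
def draw_expression(plot):
    """Build a TTree::Draw-style 'expr>>hist(nbins,xmin,xmax)' string
    from a plot entry with the keys 'name', 'x', 'nbins' and 'rng'."""
    xmin, xmax = plot["rng"]
    return "%s>>h_%s(%d,%g,%g)" % (
        plot["x"], plot["name"], plot["nbins"], xmin, xmax)


# The two plot entries from the example above, written as Python dicts.
plots = [
    {"name": "lep_Pt_0", "x": "lep_Pt_0/1e3", "rng": [0, 200], "nbins": 20},
    {"name": "lep_Eta_0", "x": "lep_Eta_0", "rng": [-3, 3], "nbins": 20},
]
for plot in plots:
    print(draw_expression(plot))
```

Seen this way, the division by 1e3 in x is an ordinary TTree::Draw() expression, and rng must be given in the units of that expression rather than of the stored branch.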

The simplest running of the code is
DISPLAY="" python dump_plots2.py standardCR_fornote.yaml --filedir '/data/tth/v4_processed/trigpdf/' --outdir output_dir_for_plots/ --texout standardCR_fornote.tex
Many of the options to dump_plots2.py are the same as for do_cutflow.py (especially regarding the MC samples that are used).

Some useful configuration files:

standardCR_fornote.yaml: the control regions used in Note 3
standardSR_fornote.yaml: the signal regions