Code for Combined Single-top Analysis with Early-Data at 10 TeV

Overview

This page describes how to use the macros that process the files for the 10 TeV analysis. The macros are located in SingleTopRootAnalysis/macros/SPR_macros and are meant to be run on the tier3 (see RemoteDesktop) with an installation of StatPatternRecognition. They are set up to use the /home/jenny/SPR-3.3.2 directory and the /msu/data/t3work/single_top/SPR output directory. The output directory also contains the rootfiles and sprfiles directories, which hold files of those types. These directories are not created by the scripts, so make them before running anything.
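Since the directories have to exist before any script is run, a small helper can create them. This is a minimal sketch, not part of SPR_macros: the function name make_output_dirs is hypothetical, and it assumes rootfiles and sprfiles sit directly under the output directory named above.

```python
import os

# Path quoted from this page; adjust it if your installation differs.
SPR_OUTPUT_DIR = "/msu/data/t3work/single_top/SPR"
# ASSUMPTION: rootfiles and sprfiles are direct subdirectories of the output directory.
SUBDIRS = ("rootfiles", "sprfiles")


def make_output_dirs(base=SPR_OUTPUT_DIR, subdirs=SUBDIRS):
    """Create the output directory and its subdirectories if they are missing.

    Returns the list of directories that were actually created.
    """
    created = []
    for path in [base] + [os.path.join(base, name) for name in subdirs]:
        if not os.path.isdir(path):
            os.makedirs(path)
            created.append(path)
    return created
```

Calling make_output_dirs() with no arguments uses the tier3 path above; pass a different base to try it out somewhere else first.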

Each program has an example run command at the top of the file.

In general, the programs produce many files that are then run on the grid. These files are usually written to a directory specified in the Python script and are then submitted to the grid by running a master script that the programs also generate.
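The pattern described here (one file per job plus a master script that submits them all) can be sketched as follows. This is only an illustration of the idea, not the code in the make_tier3_runs_spr programs: the function write_jobs, the file name submit_all.sh, and the default "qsub" submission command are all assumptions.

```python
import os
import stat


def write_jobs(job_dir, commands, submit_cmd="qsub"):
    """Write one shell script per command plus a master script that submits them.

    job_dir    -- directory to write the files to (must already exist)
    commands   -- dict mapping a job name to the shell command it should run
    submit_cmd -- batch submission command; "qsub" is only a placeholder
    """
    master_path = os.path.join(job_dir, "submit_all.sh")  # hypothetical name
    with open(master_path, "w") as master:
        master.write("#!/bin/sh\n")
        for name in sorted(commands):
            job_path = os.path.join(job_dir, name + ".sh")
            with open(job_path, "w") as job:
                job.write("#!/bin/sh\n" + commands[name] + "\n")
            # Make the job script executable for its owner.
            os.chmod(job_path, os.stat(job_path).st_mode | stat.S_IXUSR)
            # One submission line per job in the master script.
            master.write("%s %s\n" % (submit_cmd, job_path))
    os.chmod(master_path, os.stat(master_path).st_mode | stat.S_IXUSR)
    return master_path
```

Running the returned master script then submits every job in one go, which is the step referred to above.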

Variable Importance

Programs:
  • make_tier3_runs_spr_varimp.py (trains classifiers using file with specifications for training classifiers)
  • make_tier3_sigonly_input.py (generates autoinput from specifications for training classifiers)
  • make_tier3_runs_spr_sigonly_all.py (runs autoinput: validation step, also uses a ROOT macro to calculate significances)
  • input_sigsep_var.py (an example file containing specifications for training classifiers. This one in particular varies the variables used for each classifier generated)
  • spr_sys_apply_final3_validation_test_all.C (prints information to the screen, including the maximum significance with and without an 18-count requirement for the background, for systematics determined after a cut on the classifier output)
  • spr_sys_apply_final3_validation_test_allave.C (same as above, but calculates this for the systematics determined before a cut on the classifier output, i.e. an average)
  • sigonly_grab.py (grabs the relevant significance information from the .stdout files generated when make_tier3_runs_spr_sigonly_all.py, and thus the ROOT programs, are run; i.e., it makes a file of all the significances so you can determine the best one)
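The collection step performed by sigonly_grab.py can be sketched roughly as below. This is not the actual script: the function grab_significances is hypothetical, and the line format matched by SIG_PATTERN ("max significance = <number>") is an assumption that has to be adapted to whatever the ROOT macros really print.

```python
import glob
import os
import re

# ASSUMED line format, e.g. "max significance = 2.31"; adapt to the real output.
SIG_PATTERN = re.compile(r"max significance\s*=\s*([-+]?\d+(?:\.\d+)?)")


def grab_significances(stdout_dir):
    """Return (file name, best significance) pairs, highest significance first.

    Files without any matching line are skipped.
    """
    results = []
    for path in glob.glob(os.path.join(stdout_dir, "*.stdout")):
        with open(path) as handle:
            values = [float(m.group(1)) for m in SIG_PATTERN.finditer(handle.read())]
        if values:
            results.append((os.path.basename(path), max(values)))
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

The first entry of the returned list is then the classifier run with the best significance under the assumed format.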

Background Normalization

Programs:
  • make_tier3_runs_spr_sig_bkgdnorm.py
  • input_ttbarnorm.py (input file for program above, specifies training parameters and a name to use in files generated from these classifier parameters)
  • spr_sys_apply_final3_validation_test_bkgdnorm.C (the ROOT program used; contains some code specifically for bkgd normalization settings and also a method to make a new tree with normalization cut events removed)
There are other bkgdnorm programs that are the same as this one, except that a short variable list is specified internally or the background and signal files specified are different (for wjets vs ttbar). I would take this program, change the internal variable list to the standard list of 18 variables, and then make two versions of it: one set up to isolate ttbar and one set up to isolate wjets.

NOTE: The make_tier3_runs_spr programs all have a similar format. Some only train, some only validate and some do both. Some allow bkgd and signal files to be specified internally with the variables used specified in the input file, others specify almost everything internally. Feel free to look at different versions and modify as needed.

NOTE: the current bkgd normalization doesn't have systematics included. To see how the code for systematics works, see make_tier3_runs_spr_varimp.py from the Variable Importance section. When including systematics, it may be easier to just modify the variable importance programs and generate a separate training and validation run. To do that, specify a single variable list (possibly internally, leaving this option blank in the input file) and modify the signal and background files internally as mentioned earlier.

NOTE: the background normalization code is a bit old, so check the settings for the classifier source numbers (0, 1, 2, etc.) in the loops at the beginning of the code.

Comments

The background normalization has a different set of merged split files for training because of the different signals and backgrounds. I also made a different set of validation files, but this is not strictly necessary. Note, though, that in other code we use the wt and tchan signals separately in validation, whereas in the background normalization code as it stands they are combined. If this changes, you need to add more code. Don't forget to change the variable classifierclassval to include another class.

-- JennyHolzbauer - 07 Jan 2010
Topic revision: r2 - 07 Jan 2010, JennyHolzbauer
 
