How to perform an analysis using the MSU Atlas Single Top Framework

0. Some words before I start

I've been learning about the framework since February 2008, and as I am not a programming freak, the explanations will start at a pretty simple level, which might be useful for other newcomers like me. I hope, though, that with my growing knowledge and with the help of people with a lot more experience, this topic will evolve and become more and more useful.

Speaking of which, if you discover any error, be it wrong English, the wrong use of a term or something you simply know better or know a better explanation for, please do not hesitate to change it!!!

This twiki has been updated to explain the format used with the newest single-top group ntuples as of October 2010.

1. Introduction

The MSU analysis package is based on the D0 Single Top analysis package. At the moment it uses D3PDs (which are ROOT ntuples) as input and produces histograms of various distributions. There is also an option to produce a skimfile, i.e. a ROOT file containing the original tree but only with events that passed the preselection cut (this option is less tested/used than the histogramming option). The code is relatively stable. Most of the code that is added and updated at this point is histogramming classes and analysis macros. The exception is that the code is updated whenever new MC is used. This is because new MC versions always seem to come with different tree variables, which mandate at least a change to the tree class header files. Sometimes major changes are made because the way we define, say, a kind of truth particle is no longer possible with the information provided by the tree.

Why do we have a framework?
We have a framework because it simplifies basic steps of analyses and helps coordinate the work and the results of a group.

How does the coordination work?
For the coordination a version control system is used. You check out the package. If you have modifications that are useful for everyone, you can commit them back into the repository. You can also get the modifications from other users.
  • Note that this package has now been moved from CVS (Concurrent Versions System) to SVN; the CVS instructions later on this page are kept for reference only.

2. How to get started in SVN

1. Check out the package:

You will need to type the following:
  • svn co svn+ssh://username@svn.cern.ch/reps/atlasgrp/Institutes/MSU/MSUSingleTop/SingleTopRootAnalysis/trunk
Where username is your CERN username. If your username is the same as your account name on the machine you are copying the package to (like the tier three), you should not need the username@ part for this to work (I think). This package will download under the name SingleTopRootAnalysis. All of the work, SVN updates, etc. will take place in the directory SingleTopRootAnalysis/trunk.

To check out a particular tag of the package: svn co svn+ssh://username@svn.cern.ch/reps/atlasgrp/Institutes/MSU/MSUSingleTop/SingleTopRootAnalysis/tags/SingleTopRootAnalysis-00-00-01 SingleTopRootAnalysisTag

2. Update latest modifications and see changes available

  • simply type "svn status" while in the directory "SingleTopRootAnalysis" as specified in step one.
This will show the state of each file in your working copy:
  • modified files (M): files that you have changed. Any modified files will have their changes added to the package if you do a commit.
  • updated files (U): files other people have changed; these show up when you run "svn update" (or "svn status -u") and are copied into your local directory by an update.
  • conflicting files (C): files other people have updated and you have also changed, and so are in conflict. Conflicts are more complicated: you might want to consider moving the conflicting file in your local directory to some other location, doing an update to get the current SVN version, and then adding your changes back to this file (which should then have an M tag).
  • new files (?): files not in the main package. If there are new files, you will need to add them.
There are multiple users with commit rights, so you need to be careful not to make changes that will badly impact what others are doing.

It is recommended to do a "svn status" before doing a commit.
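The pre-commit check can be made mechanical. A minimal sketch, assuming only standard shell tools; the status output and file names below are made up for illustration:

```shell
# Hypothetical saved output of "svn status" (file names are invented).
cat > status_demo.txt <<'EOF'
M       src/Base/EventContainer.cpp
U       SingleTopRootAnalysis/Tree/RecoTree.hpp
C       config/cuts/ExampleCuts.config
?       macros/myplot.C
EOF
# Before committing, flag any conflicted (C) files.
conflicts=$(grep -c '^C' status_demo.txt)
echo "conflicted files: $conflicts"   # prints: conflicted files: 1
```

If the count is non-zero, resolve the conflicts (e.g. by moving the file aside and updating) before running "svn commit".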

3. Other useful commands

Basic commands are described in more detail here: http://www.linuxfromscratch.org/blfs/edguide/chapter03.html, and generally follow the CVS forms. To add a file, type "svn add" followed by the file name. To delete a file both locally and in the SVN repository, type "svn delete" followed by the file name. "svn update" pulls any file changes from the SVN repository into your local directory (this doesn't change the main package in the repository, only your local copy). "svn commit" sends any modifications you made to files in your local directory to the main package repository. Always do a status check before committing to avoid conflicts!!

4. Web location

SVN packages are available for web browsing as well. Our package is available here: https://svnweb.cern.ch/trac/atlasgrp/browser/Institutes/MSU/MSUSingleTop/SingleTopRootAnalysis/trunk

5. Making a branch/tag (advanced)

Sometimes you want to make many modifications in the main code classes for your own project that would conflict with the main package, or the package needs to be updated for new D3PD files and you want to save a copy of the old package. Tags are snapshots of the package, branches are separate development versions. ATLAS has the following instructions:

svn cp svn+ssh://username@svn.cern.ch/reps/atlasgrp/Institutes/MSU/MSUSingleTop/SingleTopRootAnalysis/trunk -r 66558 svn+ssh://username@svn.cern.ch/reps/atlasgrp/Institutes/MSU/MSUSingleTop/SingleTopRootAnalysis/tags/SingleTopRootAnalysis-00-00-01 -m "initial package tag, rel 15"

The number associated with -r should be the revision number that showed up with the last commit you did. The tag number should be new, preferably the next one not already in the tag directory.

To make a branch, you must have made a tag first. Then you can make a branch as follows:

svn cp svn+ssh://username@svn.cern.ch/reps/atlasgrp/Institutes/MSU/MSUSingleTop/SingleTopRootAnalysis/tags/SingleTopRootAnalysis-00-00-01 svn+ssh://username@svn.cern.ch/reps/atlasgrp/Institutes/MSU/MSUSingleTop/SingleTopRootAnalysis/branches/SingleTopRootAnalysis-00-00-01-branch

The last part gives the location in the branches directory. The name of the branch must follow this format (tag name plus a -branch). You can also add a -m option to include a log message. If the form of the branch name is wrong it won't commit.

If you make changes on a branch, you will need to do an "svn switch" to the branch before committing them, following the useful commands twiki mentioned earlier in this section.

2.5. How to get started in CVS (deprecated)

CVS is no longer supported and the package has been ported over to SVN. The package branches are not available in SVN, and commits can no longer be made to CVS. If you are starting out with the package, please see the SVN instructions above.

This section is basically taken from the MSU Atlas Group webpage [1].

1. Check out the package

2. Update latest modifications

  • simply type "cvs update" while being in the directory "SingleTopRootAnalysis".

3. Put your modifications into CVS

--> Note: you will need permissions to add files or changes to files to CVS
  • go to directory SingleTopRootAnalysis
  • type "cvs -n update" to see what will be updated without actually updating it (in order to avoid conflicts)
  • to add new files to CVS type "cvs add "filename""
  • to remove files from CVS type "cvs remove "filename""
  • to upload files type "cvs commit -m "Comment for committed version""

4. Branch

The basic MSU Analysis package can be seen as a "trunk". It is therefore possible to add branches.
  • to create a new branch, type: "cvs rtag -b yourbranchname groups/SingleTopRootAnalysis"
  • to use this branch, you need to check out the package with the following command: "cvs co -r yourbranchname"
  • change something, then type "cvs commit", the changes will go back into the branch

3. Structure of the package

Now you have the SingleTopRootAnalysis package somewhere in your work area and it should look like this:

bin
build
cmt
config
CVS
dep
doc
lib
lists
macros
Makefile
obj
scripts
SingleTopRootAnalysis
src
tmp

Contents of the folders:

bin - executables (C++ programs that tell the computer what we want to do)
lists - lists of ROOT files; these contain the local locations of the ROOT files (Monte Carlo data files) we use
macros - macros to perform analysis with the produced histograms
Makefile - not a folder, but the Makefile itself; this is used when compiling code
scripts - scripts which run the desired executable with the desired command line options
SingleTopRootAnalysis - header files of classes
src - source files of classes

4. How to do an analysis, using the framework

You may wish to read section 5, How to Run the Example, before making new classes with this section. Some of that section is repeated in the instructions below.

A. Producing histograms

  1. Write a class (cpp) and its header file (hpp), store in the proper folders
  2. Write an executable or modify one, for example bin/example/test_analysis.cpp, so that it includes the header files of your Histogramming classes listed at the top of the executable.
  3. Compile using make; for our example, type "make Example". (You have to be in the top-level SingleTopRootAnalysis folder, otherwise the computer cannot find the Makefile.) NOTE: when compiling, ALL programs in src and SingleTopRootAnalysis are compiled, including ones that you have added. However, new executables must be added by hand to the Makefile to be compiled. If you use an existing directory, just add NewProg.x to the line with the others for that directory. If you add a new directory, you have to make changes in more places in the Makefile; follow the format that is already there. For a list of other compiling options, type "make help". If it won't compile, you may have syntax errors in your new classes or an incorrect version of ROOT linked, or you may need to recompile from scratch by typing "make clean" (which removes all libraries, objects, etc.) and then "make Example".
  4. Write a run-script (.sh) or modify scripts/ExampleScript.sh so that it calls your executable (which after compilation ends in .x instead of .cpp) and specifies the name and location of the input and output files as well as the number of events you want to look at.
  5. Run your script, in our example we would type: "./scripts/ExampleScript.sh" from the main SingleTopRootAnalysis folder

B. Analyzing histograms

These classes are often very analysis specific. To get an idea of what things have been written in the past, feel free to explore the macros directory. It also contains some files used for multivariate analyses with TMVA and SPR, as well as some python scripts and pyroot applications.
  1. Write a ROOT macro (.C) and store it in the proper folder; of course, you have to tell your macro the location and names of your new histograms. You are welcome to write functions and fancy stuff to include in your macro.
  2. Open root and run the macro by using ".x macros/yourmacroname.C". You can also run this by typing ".L macros/yourmacroname.C" to load it and then "yourmacroname()" to run it

5. How to Run the Example

The example is a set of files meant to show you how the analysis package works.
  • Currently updated for use with the single-top group files, version 15 (MC09) (Single-top D3PDs)
  • To run the example, you will need to get the files from SVN as mentioned earlier
  • Uses the test_analysis.cpp file in bin/example as well as several configuration files located in the config directory with the Example keyword in the names
  • To run the package, you will need to modify a script file (technically, you can just run the package from the command line, but in practice it is more convenient to use a shell script file).
    • The modification is needed because you will need to let the program know where your D3PD files are located on your computer
    • In general, we use list files to specify D3PDs that will be chained together during processing (ex. all Wt-channel files will be listed in one list file)
    • In the example script, there is a line commented out that shows how to call a list file, and you can look in the list directory to see how these files are formatted.
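For illustration only, a list file is just a plain text file with one D3PD path per line (these paths and file names are hypothetical, not real sample locations):

```text
/work/data/mc09/Wt_channel/d3pd.000001.root
/work/data/mc09/Wt_channel/d3pd.000002.root
/work/data/mc09/Wt_channel/d3pd.000003.root
```

All files in a given list are chained together and processed as one sample.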
Some of the information from this section is also repeated below, when the package is overviewed in more detail.

1. To Run the Package:

Usual, More Generic Running Method:
  • To compile the package and generate the executable test_analysis.x, Type
     make Example 
    • If you run into problems, try typing
       make clean 
      and then
       make Example 
    • NOTE: If you are on a 32 bit machine, it should compile with a standard ROOT version (currently v26). If you are on a 64 bit machine, you will need a 64 bit version of ROOT to run this package.
    • If this doesn't work, contact Jenny Holzbauer for help. Either something happened when you got the package from SVN or the example is missing an update.
  • Open scripts/ExampleScript.sh. The first uncommented line will contain the -infile flag (-infile /work/data/test.root). Change the file name and location to match yours.
  • Be sure that the ExampleScript is executable with ls -ltr. If it is not, use chmod to change this.
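The executable-bit check works the same for any script; a minimal sketch with a throwaway stand-in script (the file name is invented):

```shell
# Create a stand-in for a run script such as ExampleScript.sh.
printf '#!/bin/sh\necho running analysis\n' > demo_script.sh
ls -l demo_script.sh        # typically no "x" bits yet
chmod +x demo_script.sh     # add the executable bit
./demo_script.sh            # prints: running analysis
```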
  • Type ./scripts/ExampleScript.sh to run. You should generate a root file and a log file.
  • If the program does not run correctly and the log file has an error in it, you may be using an incompatible version of root. Most of the new MC requires a newer version of ROOT, like 5.20. Chances are, the version you have is fine. Also, you may want to source a .bashrc file before running the program, since it uses ROOT.

Less Generic Running Method, Especially For Multivariate Variable Tree Generation
  • To generate a root file with a tree of variables for multivariate analysis you should follow the instructions in the ExampleScript.sh and bin/example/test_analysis.cpp files. A general outline is repeated below:
    • You will need to use the MVATreeName flag to specify the single-top D3PD tree and cut flow you wish to use (for example, SgTopElectronJetsPreTagTree)
    • You may want to use the OnlyDoMVATree flag to shorten running time and prevent the generation of things like jet vectors from the EventTree. This is a pretty severe flag, though, and you can only do a limited number of things with it turned on, specifically:
      • You can only take the following cuts: CutDecisionTree, CutMVATriggerMatch, CutMVATriggerSelection. These should be the only cuts necessary to replicate a single-top group designated cut flow.
      • If you wish to test your own cut flow or do other studies like trigger or heavy flavored w+jets studies, you will NOT want to use this flag, although you can still apply the CutDecisionTree cut (and use either the MVA trigger cuts or a different trigger cut that doesn't require the use of an MVATree, in progress)

2. Config Files:

  • You do not need to change the config files to run the program, but in practice you will want to
  • The weight config file sets the weights to 1 for this example
    • In general, you will want to weight to some integrated luminosity
    • In doc/weights there is a python program which will generate a weights file to weight events to 100 pb^-1.
    • It needs the samples_XXX.txt input file, which lists the cross-section*branching-ratio*other-factors from the MC website, and the k-factor. You will need to follow the link to the MC09, MC10, etc. 7 TeV samples depending on which files you are using. The weight itself takes this quantity and divides it by the unweighted MC event count for the sample in question (unweighted unless it is MCatNLO, in which case it includes the MCatNLO weighting). The package automatically gets the MC counts from the D3PDs.
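Numerically, the weight described above is cross-section*branching-ratio*k-factor times the target luminosity, divided by the unweighted MC count. A sketch with made-up numbers (these are not real cross-sections):

```shell
# Hypothetical sample: sigma*BR = 14.6 pb, k-factor = 1.07,
# target luminosity = 100 pb^-1, 999000 unweighted MC events.
awk 'BEGIN {
  sigma = 14.6; kfac = 1.07; lumi = 100; nmc = 999000
  printf "event weight = %.6f\n", sigma * kfac * lumi / nmc
}'
# prints: event weight = 0.001564
```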

3. Program Output:

The log file contains information about what settings you used (b-tagging weight cut, trigger cut, etc.), as well as a list of the histograms the root file will contain. At the beginning of this list is the source name and reference number (if provided), as well as the event weight used. The file then lists information output as the events process, including error messages and a message about the event count every 10,000 events or so. When the program finishes running, a table is generated if cuts have been made, showing the number of events before and after cuts. Finally, a summary of the run time and total events is given. If the program has problems when running, error messages will be stored in this file. To kill a misbehaving run, press Ctrl-C. To look at the root file, just go into ROOT and use the TBrowser (type "new TBrowser"). Here you can look at the histograms before the cuts, the histograms generated when a cut is made, and the histograms after cuts.

4. Cuts Versus Object Selections:

Something to keep in mind is that the settings made in the objects file are applied in all of these histograms. In other words, if the objects file requires that jets have pt > 30, then a jet pt histogram before and after cuts will show only jets with pt > 30. These object cuts define what we mean by a "jet", whereas the cuts in the cuts file are preselection cuts on these objects. Thus, when we cut to require at least two jets, we require at least two jets with pt > 30 and |eta| < 5.0. For leptons, historically we have used tight and veto objects. The current cut selection only requires one tight lepton, so only tight lepton objects have the cuts applied to them. This means that when we require exactly one tight electron, there could be an event with two electrons, as long as one electron is soft enough not to qualify as "tight".

5. A Few More Object Selection Comments:

  • The object selections are currently limited to Pt selections for the muons, electrons, and jets. All other object selections and overlap removals for the single-top group are incorporated into a UseOverlap flag if you are doing a Generic run. If you are looking at some of the single-top designed multivariate variables, you don't need to make any object selections, because the single-top group has already provided a flag to remove events based on its object selections and has calculated the variables you need based on those selections (see the CutDecisionTree in the Less Generic Running Method in the Run the Package section).
  • Historically we have applied things like overlap removal, isolation, etc. ourselves with this package, and the coding structure is still there if you wish to test this sort of thing out.
  • The b-tagging algorithm and cut are specified in the object file. In the example, SV0, an early-data b-tagger, is used. Some files may not have this b-tagger, and the cut level will change for different MC, so be sure to check with the group for b-tagging algorithm and cut recommendations before doing an analysis. Also, b-tagging scale factors are now starting to become available and will be included in a later update of the code, when this information is incorporated into the single-top twiki. A flag for jet energy scale adjustments may also be set here.

6. Closer look at the different elements

1. Classes and their header files

A class is a C++ construct and an essential part of the concept of object-oriented programming. It can inherit functions from parent classes. It consists of two parts: the source file (.cpp), in which you tell the computer what you want to do, and the header file (.hpp).

In the header file you have to declare which classes you want to include in your new class (you do that, so that you can use their functions).
First, in a header file such as HistogrammingVar.hpp, there come two lines
#ifndef HistogrammingVar_h
#define HistogrammingVar_h
which are necessary for everything to compile. If you are copying and pasting code from another program, it is important to remember to change these two lines to the new program name. Then there are several lines referring to the other programs that this one will be referencing:
#include "SingleTopRootAnalysis/Base/Dictionary/HistoCut.hpp"
#include "SingleTopRootAnalysis/Base/Dictionary/EventContainer.hpp"
#include <string>
#include <vector>
#include <sstream>
And the following line:
 using namespace std;
Then come the constructor, deconstructor, method declarations, and inline function:
HistogrammingVar(EventContainer *obj);
~HistogrammingVar();
// methods:
void BookHistogram();
bool Apply();
//inline function
inline std::string GetCutName (void) { return "HistogrammingVar"; };

and after this come the private variable declarations. Notice that the constructor has the event container object in parentheses. This is so that it can pass this object into the .cpp file, so that our methods that make histograms can access the variable information necessary to do this. Also notice that the event container class has been included in the #include lines. This is necessary so that the computer understands what this EventContainer object is when it passes it through the constructor. The destructor runs at the end of the program and is denoted by a tilde "~" in front of the name. The two methods will be discussed more in the next paragraph. However, notice that the first method is void (doesn't return anything when it finishes running) and the second is not (it returns true or false when done running). The inline method is used by AnalysisMain.cpp when running the whole analysis package.

Finally you have to define the histograms you want to produce with your class. These histograms are generally declared under private variables. These are variables that can be accessed in any method in the .cpp file. These variables are typically distinguished by the use of a "_" in front of the variable name. Histograms are included here because they are accessed in at least two methods in the typical histogramming .cpp file: BookHistogram() and Apply().

A simple header file (with comments) can be seen here: headerfile.pdf

The .cpp file for generating histograms contains all the information about how you want to fill your histograms. You can use normal ROOT syntax, of course.
You will need to include the corresponding header file!
In the method BookHistogram(){ ... } you should declare your histograms; all the calculation and filling goes into Apply(){ ... }. Typically the .cpp file starts with:
 
#include "SingleTopRootAnalysis/Histogramming/HistogrammingVar.hpp"
#include <iostream>
#include <sstream>

using namespace std;
The first line tells the computer what the header file is and the others are included classes. After this, we write the constructor and destructor methods:
HistogrammingVar::HistogrammingVar(EventContainer *EventContainerObj){
  SetEventContainer(EventContainerObj);
}

HistogrammingVar::~HistogrammingVar(){
}

In this case, and in general, the only act of the constructor is to make sure to set the event container that the program is using. The destructor doesn't have any additional function, other than existing. After this, the BookHistogram() method is written. Here, we declare the histograms that we want to fill. These should have been declared as private variables in the .hpp file so that we can access these histograms again in the Apply() method. In the Apply() method, then, we access the event container, usually by typing something like:
 EventContainer *obj=GetEventContainer();
and then we access the particle information from the event we are looking at. For example, to access the fourth jet Pt, we would type:
obj -> jets[3].Pt()
Here there are a few important things to note:
  1. The Apply() method loops over all of the events automatically. This means that when you access the fourth jet and put it into the histogram within the Apply() method, you are doing this for all of the events. This is not true of, for instance, the BookHistogram() method.
  2. The information from events is contained in arrays, so accessing the fourth jet means accessing the third element of the array (index 3).
For the curious, the name "jets" refers to an array of jet TLorentzVector objects that is filled in the EventContainer .cpp file, and the function Pt(), which returns the pT value, is a method of TLorentzVector (a ROOT class). There are other methods that are specifically programmed into the Jet.cpp class that could also be called to return other types of information.

A simple source file (with comments) can be seen here: sourcefile.pdf

1.1 Comments on Other Classes

Histograms are important, of course, but there are other class files that are important too. If you are changing data sets, you will probably want to modify the lists of tree variables. These are actually in the header files for the tree you are interested in. There are also classes for each of the particle types. These can contain particle-specific variables and information. In the case of Reco (reconstructed) particles, this is where the variables such as Pt are obtained from the tree. B-tagging information is located in the Jet class. When you run the script file, one of the parameters is the type of b-tagging and the parameters from the tree (and any calculations) that determine if a jet is b-tagged or not are located here, within an if-statement for each b-tagging type. The MC particles work much the same way, although the event container has to take extra steps to separate the B and C quarks, which are lumped together in the truth tree.

1.2 Some Important Classes

There are a lot of classes in this package. Here, I will overview a few important ones. EventContainer (src/Base): This class is a big one in terms of size and importance. Basically, this class decides what happens to each event. It is this class that sorts the particles into their individual vectors for each event. The vectors themselves are created within this class, and event reconstruction now also happens here, in code near the end of the class. The order in which the particles are filled matters, as some particles require vectors of other particles to be passed in when assignment time comes. Overlap removal, isolation, removal of electrons from the crack region, and object cuts all occur here, when the particle class returns true or false and the vector is either filled or not.

Jet, Electron, etc. (src/Particles/Reco): These are the particle classes. This is where the definition of what a particle is really happens. The object config file and any other particle vectors needed are fed into the class which then returns either a true or false to the event container, telling it to fill or not fill the vector with this particle. The definition of isolation is applied here as well as overlap removal (overlap removal is also built into the order in which the particle vectors are filled in the event container). In the case of electrons, there are several different types: tight, veto, all, etc. By looking at the conditions at the end of the fill method for returning true or false, you can see the differences between the objects and what they really are. In the case of the Jets, it is good to take a look at the class to see what b-taggers are written into the code and what the b-tagging algorithm keywords are that you will need to mention in the config file to specify which b-tagger you are using. Whether a jet is tagged or untagged is set based on the parameters set in the object file (or command line) and interpreted in Jet.cpp.

AnalysisMain (src/Base): This program actually steers everything else, including the event container. It is only second to the executable itself, which uses an AnalysisMain object (mystudy). This class is responsible for most of what prints out to the screen (or log file). It also takes the config file and command line arguments and sets the b-tagging algorithm, cut, input file names (or input file chain), etc. Its loop method actually loops over the events.

RecoTree (SingleTopRootAnalysis/Tree): Classes in the tree directory, like RecoTree, contain information about the tree format, including the variables used. These are generated and changed each time the Monte Carlo changes. If you want to plot a variable and try to access it within the Electron class, for instance, it must be specified in this class somewhere, or you can't access it. Sometimes, if things are not properly updated, an extra branch will be in this class that isn't in the tree and this will produce a warning message on the screen or in your log file. There are not usually problems however, as this is one of the first things that is updated and tested for new MC.

2. Executables

The executables bundle the classes together in one program and call the root trees (in general). They contain a lot of stuff but the most important thing for beginners is simple: There are two spots in the executable, where you have to include your classes.

Most of the other stuff is related to chaining together the ROOT files that are listed in the list files (the program knows what list to use from the script that you run, so you have to have a separate run line for each list). The program then accesses all of the ROOT trees and can run histogramming classes. The first class should be the following, which figures out the weights:
mystudy.AddCut(new EventWeight(particlesObj,mcStr));
Then comes your class, and other histogramming classes. Additionally, at this point you can add in a cut, in the same way that you add a histogramming class. For instance, to add a jet number cut, you would type:
 mystudy.AddCut(new CutJetN(particlesObj));
The program knows how many jets to cut on from the config file, which is discussed below. After this, the program actually loops over the events (mystudy.Loop()) and closes the file.

The two spots you need to change to add your own histogramming class are marked in our example here: executable.pdf

3. Makefile

The Makefile is a fancy thing that allows us easy and fast compilation as it finds out which files have changed since the last compilation and compiles only these. This has grown more complicated than in the past, but it also doesn't require many changes. If you make a new executable (in the bin directory) then you need to add this name in the executable section in the makefile. For instance, the line in this section corresponding to making files in the example directory looks like this:
BINS_Example= bin/example/test_analysis.x
If I made a new executable in this directory called test_analysis_two, I would add it into the Makefile as follows:
BINS_Example= bin/example/test_analysis.x  bin/example/test_analysis_two.x
Then, when I type make Example, both test_analysis and test_analysis_two will be compiled. If you want to make a whole new directory for your files, then you also need to change things in two other areas: the first is in the executable categories section, and the second is in the executables section. You should also add a comment about your new directory in the help section.

When compiling the code, you can compile the changes that have been made since the last compile (in the example, for instance) by typing "make Example". If you want to compile all of the code pertaining to making the example executable then you type "make cleanExample", and then "make Example". This takes much longer than just compiling recent changes. However, if you keep running into errors that you can't fix, doing this sometimes helps.
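The incremental behaviour can be demonstrated with a toy Makefile that has nothing to do with the package Makefile (file names are invented); make rebuilds a target only when a prerequisite is newer than it:

```shell
# A one-rule Makefile; recipe lines must start with a TAB, hence printf.
printf 'out.demo: in.demo\n\tcp in.demo out.demo\n' > Makefile.demo
echo v1 > in.demo
make -f Makefile.demo    # first run: copies in.demo to out.demo
make -f Makefile.demo    # second run: nothing to do, target is up to date
```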

4. Script

We use a script (.sh) to bundle the necessary information for our run: the executable used, the input and output files, and the number of events we want to consider. The script does not have to be compiled. To print information to a log file instead of to the screen, append to the run line
 >& logfile.example.txt
In this case, it would print all of the screen output to a .txt file called logfile.example.txt in the current directory.
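In a Bourne-style shell the same effect can be written explicitly; a minimal sketch with echo standing in for the analysis executable:

```shell
# Send both stdout and stderr to a log file, as the run scripts do.
{ echo "processing events"; echo "warning: missing branch" >&2; } \
  > logfile.demo.txt 2>&1
cat logfile.demo.txt   # both lines ended up in the log file
```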

Some other options are -bTagAlgo default and -MCatNLO. The first sets the b-tagging type to default (you can see the other options in Jet.cpp). The second tells the program that MCatNLO weighting applies (this is typical for ttbar and single-top signal files, but check to be sure which files you need this for). Although you can still specify the b-tagging algorithm and corresponding weight cut from the command line, it is now possible (and preferred, for most applications) to specify these in the object file, as is done in the Example (see section 5).

Here you can see an example for a run script: script.pdf
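If you cannot view the attached PDF, the following sketch shows the general shape of such a run script. Only -bTagAlgo, -MCatNLO, and -num are option names taken from this page; the -config option name, the config file path, and the other details are placeholders to adapt from a working script in the package:

```sh
#!/bin/sh
# Hypothetical run script sketch: run the example executable over 1000
# events, with a placeholder master config file; >& sends all screen
# output (stdout and stderr, in bash) to a log file.
./bin/example/test_analysis.x \
    -config SingleTopRootAnalysis/config/mymaster.config \
    -bTagAlgo default \
    -MCatNLO \
    -num 1000 \
    >& logfile.example.txt
```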

5. Config Files

Configuration files are referenced in the script run lines and tell the running code what the cut levels should be and what weights should be applied. There are several different types of config files. The first, located in SingleTopRootAnalysis/config, is basically a master config file that contains the locations of the other config files; this is the one referred to in your script file. Its name typically contains a number referring to the version of the data it is/was used for, an analysis name (for instance, Trigger refers to trigger analysis work), and a reference to the types of cuts used, often differentiated by particle channel and/or number of b-tags. This master config file contains the debug level (typically 10), the top mass used by the program (typically 172.5 GeV), and references to the cuts, weights, objects, and NNweights config files. The NNweights config file is not really used, but is still specified for historical reasons.
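As a purely illustrative sketch of that structure (the real key names must be copied from an existing file in SingleTopRootAnalysis/config; every key and path below is a guess):

```
# Hypothetical master config sketch -- key names are illustrative only.
DebugLevel: 10
TopMass:    172.5
# Locations of the other config files:
CutsFile:    SingleTopRootAnalysis/config/cuts/my.cuts
WeightsFile: SingleTopRootAnalysis/config/weights/my.weights
ObjectsFile: SingleTopRootAnalysis/config/objects/my.objects
```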

The weights, cuts, and objects config files are located in subdirectories of those names. The objects config file contains cuts that are applied to all of the events that are processed. If you want to run over all of the root files without any cuts at all, you need to set these cuts to 999., even if you are not making any cuts explicitly in your executable. These cuts are also the ones that differentiate a tight muon object from a veto (or loose) muon.

The cuts config file simply has a list of the cuts and their cut values. A value of 999. means that no cut is applied. The cuts in this file will not be applied to the sample unless the cut program is run in your executable. The number of events before and after the cuts will show up in histograms in the resulting root file and in a cutflow table that is printed to the screen. Cuts are applied to objects, so if a jet object was required to have pT > 30 GeV, a cut on jets requiring pT > 20 GeV would not remove any events. Lepton objects may have different specifications applied, and there are different labels for each object type, like tight. Requiring 1 "tight" electron may result in a different number of events passing the cut than requiring 1 "all" electron.
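For illustration, a cuts-config fragment might look like the sketch below. The cut key names are hypothetical; the 999. no-cut convention is the real one described above:

```
# Hypothetical cut names -- a value of 999. means the cut is not applied.
Cut.Jet.Pt:      30.
Cut.Electron.Pt: 999.
```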

The weights file contains the data file number and the weight that is associated with it. This number must be the same as the one in the list file. For example, t-channel data is file number 5502, and this is the number contained in the list file name itself, and in the second line of the list file (this line is how the program knows what it is). The weights themselves can be calculated using a python script in doc/weights called weights.py. This script requires an input file that contains the branching ratio*cross-section*other factors to be included, as well as the k-factor, and the total number of monte carlo events for each data file list, which is again identified by the number (which, for t-channel, is 5502). The total number of unweighted monte-carlo events does include the MCatNLO weight for those events generated with that generator (usually single-top signal and ttbar). You can run the program by typing "python weights.py inputfilename.txt" and the program will produce an html file, with html code for the weights arranged in a table, and a file called SingleTopWeights.weights, which contains the weights in the form they should be in for inclusion in a weights config file. The code currently weights events to 100pb-1. Not that this weight will not weight events to this luminosity if you use the -num ### command in the script file to reduce the number of events you are running over. When running the code, this weight should show up below the data's name, and above the list of histograms made. The default weight is 1.0.

There is additional weighting done within the program. The first is MCatNLO weighting, which is applied to MC files made with that generator (usually single-top signal and ttbar files); it is enabled in the script line with "-MCatNLO". Another is a program that weights events to account for the effect of the trigger in fastsim files. This is referenced in the weights config file by "Weight.Trigger: 0.9" for a constant weight, or "Weight.Trigger.File: /Directory/TurnOnFit.root" for a variable weight (the root file of the turn-on curve). The required program, located in Cuts/Weights, is then called in the executable. Although this was used for the CSC note analysis, it has not been used recently; it will likely be used again, though, so keep it in mind.

6. Macros and functions

Once you have created your histograms, the fastest way to analyze them is with ROOT macros (.C). In such a macro you create a function with the name of the macro; this function should load your histograms and do something with them.

This is one of my functions as an example: macro.pdf
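If the attached PDF is not viewable, a minimal macro of this kind might look like the sketch below (the file name, histogram name, and macro name are all placeholders; only the TFile/TH1 calls are standard ROOT):

```cpp
// plot_jetpt.C -- hypothetical macro: the function name matches the file name.
// Load one histogram from an analysis output file and draw it.
void plot_jetpt() {
    TFile *f = TFile::Open("myhistos.root");  // placeholder output file
    TH1F  *h = (TH1F*)f->Get("Jet1Pt");       // placeholder histogram name
    h->Draw();
}
```

Run it from the ROOT prompt with .x plot_jetpt.C.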

7. Syntax & Specialities

As the framework was written to simplify our work, there are a lot of syntax conventions and tricks to learn if you really want to benefit from it.

1. The EventContainer

The EventContainer allows easy access to particles, as it bundles the important properties of a particle into a vector. It also makes looping over all the events in a root file very easy.

If you want to use the EventContainer in your class, you have to declare the constructor and destructor in your header file:

// Parameterized Constructor
HistogrammingPolarization(EventContainer *obj);
// Destructor
~HistogrammingPolarization();

You also have to implement them in your source file:

HistogrammingPolarization::HistogrammingPolarization(EventContainer *obj)
{
  SetEventContainer(obj);
}

HistogrammingPolarization::~HistogrammingPolarization()
{
}

To get access to the events in your class, simply write:

EventContainer *evc = GetEventContainer();

The great thing now is that if you want to find, let's say, a d-quark, you can loop over all evc->MCParticles[i] and find the one with evc->MCParticles[i].GetPdgId() == 1 .

8. Helpful links

About the framework:
[1] http://www.pa.msu.edu/hep/atlas/index.php?p=msu_single_top/msu_analysis_package

How to get it:
[2] http://atlas.web.cern.ch/Atlas/GROUPS/SOFTWARE/OO/tools/cvs_server.html

-- SarahHeim - 10 Mar 2008
-- JennyHolzbauer - 17 Mar 2008
-- JennyHolzbauer - 19 Jun 2008
-- JennyHolzbauer - 26 Aug 2009
Topic attachments
Attachment           Size     Date                 Who
PolarizationJets.C   3.6 K    13 Mar 2008 - 12:43  SarahHeim
executable.pdf       245.3 K  11 Mar 2008 - 22:56  SarahHeim
headerfile.pdf       191.5 K  11 Mar 2008 - 22:08  SarahHeim
macro.pdf            177.2 K  11 Mar 2008 - 23:53  SarahHeim
script.pdf           171.2 K  11 Mar 2008 - 23:31  SarahHeim
sourcefile.pdf       255.6 K  11 Mar 2008 - 23:01  SarahHeim
Topic revision: r163 - 11 May 2011, JennyHolzbauer