Analysis Comparisons for Code Version 13030

Run Over D3PDs using the MSU analysis code

The D3PDs are stored on the CERN cluster in the directory /rooms/living/ntuples/single_top/TopPhysDPDMaker/13.0.30/D3PD/. The following samples were included in the analysis:

Signal: 5500, 5501, 5502

Background: 5200, 6410, 6411, 6412, 6413, 6414, 6415, 6416

Within each sample, only the files that ran without crashing the analysis code were used. The bad files are commented out in the list files.

The output files from the MSU analysis code are stored on the CERN cluster in the directory /home/root_files/single_top/TopPhysDPDMaker/13.0.30/FDR2/. The samples in the subdirectory unweighted were processed without an event weight. The samples in the subdirectory weighted were weighted so that the number of events corresponds to an integrated luminosity of 100 pb^-1.
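
The luminosity weighting can be sketched as follows. This is only an illustration, not the MSU analysis code: the function name luminosity_weight and the cross section and event count in the usage line are hypothetical, and only the target luminosity of 100 inverse picobarns comes from the text above. It assumes the usual normalization, where each simulated event carries the weight (cross section x luminosity) / (number of generated events).

```python
def luminosity_weight(cross_section_pb, n_generated, luminosity_pb_inv=100.0):
    """Per-event weight that scales a simulated sample to a target luminosity.

    The expected number of events is cross_section * luminosity; dividing by
    the number of generated events spreads that yield evenly over the sample.
    """
    if n_generated <= 0:
        raise ValueError("n_generated must be positive")
    return cross_section_pb * luminosity_pb_inv / n_generated


# Hypothetical sample: 10 pb cross section, 5000 generated events.
example_weight = luminosity_weight(10.0, 5000)
```

With this convention, the sum of weights over a sample equals the number of events expected at the target luminosity.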

Merge Signal and Background Files

The 3 signal files were combined into a single signal file called Topology.SingleTop.13030.FDR2.Electron.Signal.root. The background files were likewise combined into a single background file called Topology.SingleTop.13030.FDR2.Electron.Background.root. For the muon chain, "Electron" is replaced by "Muon" in the file names. The merging was done using the MergeTrees.C routine, located in macros/TreeManipulation in the MSU analysis package, and the merged files are in the directory /home/root_files/single_top/TopPhysDPDMaker/13.0.30/FDR2/merged/.

The merged files were divided into training, validation, and yield samples by setting the Split flag to 1 in the config file. The events were split according to their order in the merged file: the first event was categorized as training, the second as validation, the third as yield, the fourth as training, and so on. Functionality to randomly assign events to the different categories will be added in the future.
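
The order-based split described above can be sketched like this. It is a minimal illustration and not the actual implementation behind the Split flag; the names CATEGORIES and split_by_order are hypothetical, and events are stood in for by plain integers.

```python
CATEGORIES = ("training", "validation", "yield")


def split_by_order(events):
    """Assign events to categories in round-robin order.

    Event 0 goes to training, event 1 to validation, event 2 to yield,
    event 3 to training again, and so on.
    """
    samples = {name: [] for name in CATEGORIES}
    for index, event in enumerate(events):
        samples[CATEGORIES[index % len(CATEGORIES)]].append(event)
    return samples


# Seven placeholder events, labelled by their position in the merged file.
samples = split_by_order(range(7))
```

Each category ends up with roughly one third of the events, which is why a factor of 3 is applied later when yields are computed from a single category.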

Events with negative weights were excluded from the Training sample to retain compatibility with SPR, which cannot handle negative weights. Validation samples were made both with and without negative-weight events. The files without negative-weight events carry the label NoNeg.
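
A sketch of the negative-weight filter, under the assumption that each event is represented as a dictionary with a "weight" entry. That representation and the name drop_negative_weights are hypothetical; the real files store events in ROOT trees.

```python
def drop_negative_weights(events, weight_key="weight"):
    """Return only the events whose weight is not negative."""
    return [event for event in events if event[weight_key] >= 0.0]


# Three placeholder events; the second one has a negative weight.
events = [{"weight": 0.2}, {"weight": -0.2}, {"weight": 0.5}]
no_neg = drop_negative_weights(events)
```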

Run TMVAnalysis over Training files

For comparison purposes, only the following variables were considered for training: HT, Jet1Pt, DeltaRJet1Jet2, WTransverseMass.

TMVAnalysis.py was run over the Training sample of the merged ROOT files. The following command was used to execute the Python script:
 
python TMVAnalysis.py \
   -S /home/root_files/single_top/TopPhysDPDMaker/13.0.30/FDR2/merged/Topology.SingleTop.13030.FDR2.Electron.Signal.Training.NoNeg.root \
   -B /home/root_files/single_top/TopPhysDPDMaker/13.0.30/FDR2/merged/Topology.SingleTop.13030.FDR2.Electron.Background.Training.NoNeg.root \
   -t "TopTree TopTree" \
   -o TMVAout.root \

Only the BDT output is shown on this page; other methods will be used in the future. Also, only 1 signal event and 1 background event were assigned to the TMVA test samples. This was achieved by the following line in TMVAnalysis.py:

factory.PrepareTrainingAndTestTree( mycutSig, mycutBkg, "NSigTrain=10000000000:NBkgTrain=100000000000::NSigTest=1:NBkgTest=1:SplitMode=Alternate:NormMode=NumEvents:!V" )

Using 0 or 1 NTest events instead of 2 events caused the program to crash. I'm not sure why using only 1 caused it to crash.

TMVAnalysis.py writes the ROOT file TMVAout.root and stores the trained classifier weights in the weights directory.

The text output from running the program can be found here: AnalysisTxt13030

Run TMVApplication over Validation files

TMVApplication.py was run over the Validation samples that include events with negative weights. Signal and background samples were run separately using the following commands.

Signal:
 python TMVApplication.py \
   -i /home/root_files/single_top/TopPhysDPDMaker/13.0.30/FDR2/merged/Topology.SingleTop.13030.FDR2.Electron.Signal.Validation.root \
   -o Signal.root 

Background:
 python TMVApplication.py \
   -i /home/root_files/single_top/TopPhysDPDMaker/13.0.30/FDR2/merged/Topology.SingleTop.13030.FDR2.Electron.Background.Validation.root \
   -o Background.root 

Note that the classifier output histograms have 50 bins. This is set by nbin = 50.

Calculate Significance

Significance.py was run to calculate the significance. For now, the significance is taken as Signal/sqrt(Background). A more accurate, and more complicated, calculation of the significance will be performed in the future. Events were weighted by a factor of 3 to account for the splitting into training, validation, and yield samples.
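
The simple significance estimate can be sketched as follows. This is not the contents of Significance.py: the function name simple_significance is hypothetical, and it is an assumption here that the factor of 3 is applied to both the signal and the background yields and that a significance of 0 is reported when there is no background.

```python
import math

SPLIT_FACTOR = 3.0  # from the text: events were split three ways


def simple_significance(signal, background, scale=SPLIT_FACTOR):
    """Return S/sqrt(B) after scaling both yields by `scale`."""
    s = scale * signal
    b = scale * background
    if b <= 0.0:
        return 0.0  # assumption: report 0 when there is no background
    return s / math.sqrt(b)


# Hypothetical yields: 12 signal and 48 background events in one sample.
example = simple_significance(12.0, 48.0)
```

Scaling both yields by 3 raises S/sqrt(B) by a factor of sqrt(3) relative to the unscaled value, since 3S/sqrt(3B) = sqrt(3) * S/sqrt(B).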

The classifier output distributions are shown below for signal and background.
BDT.png

The significance was calculated for different cuts on the classifier output. This was done in two ways: by moving the cut value from left to right along the x-axis, and by moving it from right to left. Both are shown below.
significanceBDT.Hist.Right.png significanceBDT.Hist.Left.png
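
A scan of the significance over cut values can be sketched as below. The function significance_scan and the 4-bin histograms are hypothetical. The page does not state which side of the cut is kept for the "Right" and "Left" plots, so the keep argument here is only one possible reading.

```python
import math


def significance_scan(signal_bins, background_bins, keep="right"):
    """Return S/sqrt(B) for a cut placed at each bin of a histogram.

    keep="right" keeps bins i, i+1, ... (events above the cut);
    keep="left" keeps bins 0, ..., i (events below the cut).
    """
    if keep not in ("left", "right"):
        raise ValueError("keep must be 'left' or 'right'")
    n = len(signal_bins)
    result = []
    for i in range(n):
        kept = range(i, n) if keep == "right" else range(0, i + 1)
        s = sum(signal_bins[j] for j in kept)
        b = sum(background_bins[j] for j in kept)
        # Assumption: report 0 when no background survives the cut.
        result.append(s / math.sqrt(b) if b > 0.0 else 0.0)
    return result


# Hypothetical 4-bin histograms (the page uses nbin = 50).
signal = [1.0, 2.0, 3.0, 4.0]
background = [16.0, 9.0, 7.0, 4.0]
from_right = significance_scan(signal, background, keep="right")
from_left = significance_scan(signal, background, keep="left")
```

The first entry of from_right and the last entry of from_left both correspond to keeping every bin, so they agree.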

Comparisons with Jenny


compare.png

-- PatRyan - 14 Nov 2008

Topic attachments
Attachment                      Size    Date                 Who          Comment
BDT.png                         14.7 K  17 Nov 2008 - 17:43  UnknownUser
compare.eps                     14.7 K  17 Nov 2008 - 17:42  UnknownUser
compare.png                     23.9 K  17 Nov 2008 - 17:44  UnknownUser
significanceBDT.Hist.Left.png   15.4 K  17 Nov 2008 - 17:34  UnknownUser  Significance calculated from Left
significanceBDT.Hist.Right.png  15.2 K  17 Nov 2008 - 17:33  UnknownUser  Significance calculated from Right
Topic revision: r5 - 16 Oct 2009, TomRockwell
 
