SPR for Version 14.05.006

Running Bagger

The bagger (decision tree) classifier was run for comparison purposes. The .pat files associated with this classifier are shown below. Note that a "_" is needed before the names of the leaves.

The contents of baggerValidation.pat are:
Tree: TopTreeBkg TopTreeSig
TreeClass: 0 1
Leaves: _HT _Jet1Pt _DeltaRJet1Jet2 _WTransverseMass
WeightVariable: _EventWeight
File: /home/root_files/single_top/TopPhysDPDMaker/14.05.006/EarlyData/CombineSigBkg/Topology.SingleTop.1405006.FDR2.Electron.Validation.root

The contents of baggerTraining.pat are:
Tree: TopTreeBkg TopTreeSig
TreeClass: 0 1
Leaves: _HT _Jet1Pt _DeltaRJet1Jet2 _WTransverseMass
WeightVariable: _EventWeight
File: /home/root_files/single_top/TopPhysDPDMaker/14.05.006/EarlyData/CombineSigBkg/Topology.SingleTop.1405006.FDR2.Electron.Training.NoNeg.root
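
The layout of the two .pat files above can be reproduced with a few lines of Python. This is only a sketch: the function name make_pat and the placeholder file path are hypothetical, and adding the "_" prefix to the weight variable simply mirrors the files shown here rather than a documented requirement.

```python
def make_pat(trees, classes, leaves, weight_variable, root_file):
    """Return the text of a .pat file in the layout shown above."""

    def prefixed(name):
        # The leaf names need a leading "_" (assumed to apply to the weight too).
        return name if name.startswith("_") else "_" + name

    lines = [
        "Tree: " + " ".join(trees),
        "TreeClass: " + " ".join(str(c) for c in classes),
        "Leaves: " + " ".join(prefixed(name) for name in leaves),
        "WeightVariable: " + prefixed(weight_variable),
        "File: " + root_file,
    ]
    return "\n".join(lines) + "\n"


text = make_pat(
    trees=["TopTreeBkg", "TopTreeSig"],
    classes=[0, 1],
    leaves=["HT", "Jet1Pt", "DeltaRJet1Jet2", "WTransverseMass"],
    weight_variable="EventWeight",
    root_file="/path/to/Training.root",  # placeholder, not the real path
)
print(text)
```

Writing the returned string to data/baggerTraining.pat or data/baggerValidation.pat, with the matching ROOT file path, gives files of the form listed above.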

Run Training and Validation Simultaneously

Run the bagged decision tree classifier:
/usr/local/bin/bin/SprBaggerDecisionTreeApp \
  -n 1100 -l 6 -s 33 -g 1 -y '0:1' -d 5 \
  -f bagger.spr -t data/baggerValidation.pat -o baggerOutput.root \
  data/baggerTraining.pat
The options are as follows:
  • -n 1100: number of Bagger training cycles (the number of decision trees)
  • -l 6: minimum number of entries per tree leaf, i.e. per terminal node of the decision tree (def=0)
  • -s 33: max number of sampled features (def=0 no sampling); with only four input variables, the log below reports that at most 4 features are resampled
  • -g 1: per-event loss for (cross-)validation is quadratic loss (y-f(x))^2
  • -y '0:1': list of input classes (0 corresponds to background and 1 to signal)
  • -d 5: frequency of print-outs for validation data
  • -f bagger.spr: store trained Bagger to file
  • -t data/baggerValidation.pat: read validation/test data from a file (must be in same format as input data!)
  • -o baggerOutput.root: output ROOT file (in this run it holds the training events, not the validation events; see the event counts below)
  • data/baggerTraining.pat: The file used for training
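
To make the -g 1 option concrete, the per-event quadratic loss (y-f(x))^2 can be written out directly. The sketch below is illustrative only: the event values are made up, and averaging the loss with event weights is an assumption, not a statement of how SPR computes the validation FOM it prints.

```python
def quadratic_loss(y, f):
    """Per-event quadratic loss (y - f(x))^2 for true class y and output f."""
    return (y - f) ** 2


def weighted_mean_loss(events):
    """Weighted average of the per-event loss over (y, f, weight) triples.

    The weighted average is an assumption made for illustration.
    """
    total_weight = sum(w for _, _, w in events)
    return sum(w * quadratic_loss(y, f) for y, f, w in events) / total_weight


# Made-up events: (true class, classifier output, event weight)
events = [(0, 0.1, 1.0), (0, 0.3, 1.0), (1, 0.8, 2.0)]
print(weighted_mean_loss(events))
```

A classifier output close to the true class (0 for background, 1 for signal) gives a loss near zero, so smaller values indicate better agreement.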

Part of the output is shown below. The training file has 1123 background events and 115 signal events (1238 in total). The validation file has 1268 background events and 119 signal events (1387 in total). The end of the output shows that 1238 events were written to the ROOT file, which matches the number of events in the training sample and not the validation sample. The output file baggerOutput.root is the same as the output file baggerOutputTraining.root produced below.

Parsing File: data/baggerTraining.pat
Found file: /home/root_files/single_top/TopPhysDPDMaker/14.05.006/EarlyData/CombineSigBkg/Topology.SingleTop.1405006.FDR2.Electron.Training.NoNeg.root start: 0 end: -1 class: 0 weight: 1
A variable determined weight has been chosen, the value assigned to 
        _EventWeight
 will be used for the weight.
Reading File: /home/root_files/single_top/TopPhysDPDMaker/14.05.006/EarlyData/CombineSigBkg/Topology.SingleTop.1405006.FDR2.Electron.Training.NoNeg.root for Tree: TopTreeBkg (1123 events)
Reading File: /home/root_files/single_top/TopPhysDPDMaker/14.05.006/EarlyData/CombineSigBkg/Topology.SingleTop.1405006.FDR2.Electron.Training.NoNeg.root for Tree: TopTreeSig (115 events)
Read data from file data/baggerTraining.pat for variables "_HT" "_Jet1Pt" "_DeltaRJet1Jet2" "_WTransverseMass"
Total number of points read: 1238
Training data filtered by class.
Points in class 0(1):   1123
Points in class 1(1):   115
Parsing File: data/baggerValidation.pat
Found file: /home/root_files/single_top/TopPhysDPDMaker/14.05.006/EarlyData/CombineSigBkg/Topology.SingleTop.1405006.FDR2.Electron.Validation.root start: 0 end: -1 class: 0 weight: 1
Reading File: /home/root_files/single_top/TopPhysDPDMaker/14.05.006/EarlyData/CombineSigBkg/Topology.SingleTop.1405006.FDR2.Electron.Validation.root for Tree: TopTreeBkg (1268 events)
Reading File: /home/root_files/single_top/TopPhysDPDMaker/14.05.006/EarlyData/CombineSigBkg/Topology.SingleTop.1405006.FDR2.Electron.Validation.root for Tree: TopTreeSig (119 events)
Read validation data from file data/baggerValidation.pat for variables "_HT" "_Jet1Pt" "_DeltaRJet1Jet2" "_WTransverseMass"
Total number of points read: 1387
Validation data filtered by class.
Points in class 0(1):   1268
Points in class 1(1):   119
Optimization criterion set to Gini index  -1+p^2+q^2 
Monitoring criterion set to Fraction of correctly classified events 
Per-event loss set to Quadratic loss (y-f(x))^2 
Decision tree initialized with minimal number of events per node 6
Decision tree will resample at most 4 features.
Using a Topdown tree.
Classes for Bagger are set to 0(1) 1(1)
Bagger initialized with classes 0(1) 1(1) with cycles 1100
Validation FOM=0.0970811 at cycle 5
...
Bagger finished training with 1100 classifiers.
Feeder storing point 0 out of 1238
Feeder storing point 1000 out of 1238
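
The event counts quoted above can be cross-checked against the log automatically. The following is a minimal sketch, assuming the log lines keep the wording shown in this output; the function name check_counts and the shortened file names in the sample are hypothetical.

```python
import re

TREE_RE = re.compile(r"for Tree: (\w+) \((\d+) events\)")
TOTAL_RE = re.compile(r"Total number of points read: (\d+)")


def check_counts(log_text):
    """Return one (per-tree counts, reported total) pair per data set in the log."""
    results = []
    counts = {}
    for line in log_text.splitlines():
        tree = TREE_RE.search(line)
        if tree:
            counts[tree.group(1)] = int(tree.group(2))
            continue
        total = TOTAL_RE.search(line)
        if total:
            reported = int(total.group(1))
            # The per-tree counts should add up to the reported total.
            if sum(counts.values()) != reported:
                raise ValueError(f"tree counts {counts} do not sum to {reported}")
            results.append((counts, reported))
            counts = {}
    return results


# Sample lines in the format of the log above, with shortened file names.
sample_log = """\
Reading File: Training.root for Tree: TopTreeBkg (1123 events)
Reading File: Training.root for Tree: TopTreeSig (115 events)
Total number of points read: 1238
Reading File: Validation.root for Tree: TopTreeBkg (1268 events)
Reading File: Validation.root for Tree: TopTreeSig (119 events)
Total number of points read: 1387
"""
results = check_counts(sample_log)
print(results)
```

The first pair corresponds to the training sample and the second to the validation sample, in the order in which the log reads them.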

Run Training and Validation Separately

Run Training Only

The only difference between running training separately and running training with validation, as shown above, is the absence of the command-line option -t data/baggerValidation.pat.
/usr/local/bin/bin/SprBaggerDecisionTreeApp \
  -n 1100 -l 6 -s 33 -g 1 -y '0:1' -d 5 \
  -f bagger.spr -o baggerOutputTraining.root \
  data/baggerTraining.pat

The output file baggerOutputTraining.root is the same as baggerOutput.root produced above.

Run Validation Only

Validation alone is run with:
/usr/local/bin/bin/SprBaggerDecisionTreeApp \
  -n 1100 -l 6 -s 33 -g 1 -y '0:1' -d 5 \
  -r bagger.spr -o baggerOutputValidation.root \
  data/baggerValidation.pat

The option -r bagger.spr resumes running using the information stored in bagger.spr, the file written with -f during training.
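
The three invocations on this page share the same tuning options and differ only in the -f, -t, -r and -o arguments and in the input .pat file. The sketch below assembles each variant as an argument list; the function name bagger_command and the mode labels are hypothetical, and nothing is executed.

```python
import shlex

APP = "/usr/local/bin/bin/SprBaggerDecisionTreeApp"
# Options common to all three runs on this page.
COMMON = ["-n", "1100", "-l", "6", "-s", "33", "-g", "1", "-y", "0:1", "-d", "5"]


def bagger_command(mode):
    """Return the argument list for one of the three runs shown on this page."""
    if mode == "train+validate":
        extra = ["-f", "bagger.spr", "-t", "data/baggerValidation.pat",
                 "-o", "baggerOutput.root", "data/baggerTraining.pat"]
    elif mode == "train":
        extra = ["-f", "bagger.spr",
                 "-o", "baggerOutputTraining.root", "data/baggerTraining.pat"]
    elif mode == "validate":
        extra = ["-r", "bagger.spr",
                 "-o", "baggerOutputValidation.root", "data/baggerValidation.pat"]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return [APP] + COMMON + extra


for mode in ("train+validate", "train", "validate"):
    print(shlex.join(bagger_command(mode)))
```

Each list can be passed to subprocess.run(cmd, check=True) to launch the run, which avoids shell quoting and line-continuation problems.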

-- PatRyan - 16 Apr 2009
Topic revision: r3 - 20 Apr 2009, PatRyan
 
