Combined Single-top Analysis with Early-Data at 10 TeV
Motivation
This Early-Data analysis combines the t-channel and Wt-channel single-top samples into a single signal sample. When an s-channel sample is available, it too will
be added into the combination.
The analysis aims to improve upon the 14 TeV combined single-top analysis by using fewer training variables (approximately 12 instead of 32) and by including angular correlation
variables in the multivariate part of the analysis. It differs from the 10 TeV t-channel analysis by using an improved version of
JetProb, which was absent in the previous version of the Athena analysis code.
MC Samples
TopD3PDMaker was used to produce D3PDs from v15.5.1 of Athena. The following samples were used.
- Signal
- 5502 (t-channel with parton showers from Pythia). Note that 5507 (t-channel with parton showers from Herwig) will be used instead of 5502 when it becomes available.
- 5508 (Wt with parton showers from Herwig)
- Background
- 5500 (ttbar -> dilepton and ttbar -> lepton + hadron)
- 6280 6281 6282 6283 (Wbb)
- 7650 7651 7652 7653 7654 7655 7660 7661 7662 7663 7664 7665 7670 7671 7672 7673 7674 7675 (Z + Jets)
- 7680 7681 7682 7683 7684 7685 7690 7691 7692 7693 7694 7695 7700 7701 7702 7703 7704 7705 (W + Jets)
- 5985 5987 (Di-boson)
Full MC info is available on the 10 TeV MC Page:
https://twiki.cern.ch/twiki/bin/view/AtlasProtected/TopReferences10TeV
Duplicate Events
Duplicate events were identified by their Event Number and counted in a histogram stored in a topological variable skim file that was produced before
any event selection criteria were applied. Duplicate events were present only in the W + Jets files 7700, which contained 50 duplicate events,
and 7704, which contained 150 duplicate events. No events from 7700 passed the event selection, and 50 events from 7704 passed it. Of
these 50 events, none are duplicates. The effect on the event weights (see next section) is less than 0.3%.
Since duplicate events have no effect on the selected sample and a negligible effect on the event weights, they
were not removed from the input ROOT files and the event weights were not recalculated.
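The duplicate check described above can be sketched in a few lines. This is only an illustration, not the code used in the analysis: the function name and the example event numbers are hypothetical, and it assumes that each extra copy of an event number counts as one duplicate (the page does not state the counting convention).

```python
from collections import Counter


def count_duplicates(event_numbers):
    """Count duplicate entries in a list of event numbers.

    Returns (n_extra, duplicated), where n_extra is the number of entries
    beyond the first occurrence of each event number (an assumed convention)
    and duplicated is the set of event numbers that appear more than once.
    """
    counts = Counter(event_numbers)
    duplicated = {evt for evt, n in counts.items() if n > 1}
    n_extra = sum(n - 1 for n in counts.values() if n > 1)
    return n_extra, duplicated


# Hypothetical event numbers, for illustration only.
events = [101, 102, 103, 102, 104, 101, 101]
n_extra, duplicated = count_duplicates(events)
print(n_extra, sorted(duplicated))
```

Running the same check on the events that pass the selection shows whether any duplicates survive into the selected sample.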
Event Weighting
Events were weighted in order to scale the MC samples to a luminosity of 100 pb^-1. Approximately 15,700 single-top events will be produced
at this luminosity, which should be a sufficient number for a successful analysis. Note that the MC samples were weighted to a luminosity of 200 pb^-1
in the 10 TeV t-channel note.
Weights were calculated according to
W = BR · k · σ · L / N_0,
where BR is the branching ratio, k is the k-factor, σ is the cross section (LO or NLO), L is the luminosity to which the events are weighted, and N_0 is the number
of generated events. Samples 5200 and 5502 include negatively weighted events resulting from the NLO calculation. All subsequent event weight calculations for these samples were
done after the negative weights were applied.
A table of event weight information for all samples is available at EventWeightsV1551.
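As a worked illustration of the weight formula W = BR · k · σ · L / N_0, the sketch below evaluates it for a made-up sample. The function name, argument names and all numerical inputs are hypothetical and are not taken from the event weight table; the only assumption about units is that the cross section is in pb and the luminosity in pb^-1, so that their product is dimensionless.

```python
def event_weight(branching_ratio, k_factor, cross_section_pb,
                 luminosity_pb_inv, n_generated):
    """Per-event weight W = BR * k * sigma * L / N_0.

    cross_section_pb is in pb and luminosity_pb_inv in pb^-1, so the
    product sigma * L is the expected number of events before BR and k.
    """
    if n_generated <= 0:
        raise ValueError("n_generated must be positive")
    return (branching_ratio * k_factor * cross_section_pb
            * luminosity_pb_inv / n_generated)


# Made-up inputs: BR = 1, k = 1, sigma = 10 pb, L = 100 pb^-1, N_0 = 20000.
w = event_weight(1.0, 1.0, 10.0, 100.0, 20000)
print(w)  # 10 * 100 / 20000 = 0.05
```

For samples with negatively weighted events, the value used for N_0 would have to follow the procedure stated above, which this sketch does not model.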
Event Selection
Object Definitions
Objects were defined before any overlap removal or event selection occurred. The following object definitions were applied in the MSU analysis package.
A full list of object definitions, including those applied when creating the D3PD, is located on the 10 TeV Analysis page:
https://twiki.cern.ch/twiki/bin/view/AtlasProtected/STtchan10TeV
- Muons
- isCombinedMuon() is True
- etcone30 < 6 GeV
- pT > 20 GeV and |eta| < 2.5
- Electrons
- etcone20 < 6 GeV
- No electron in gap region 1.37 < |eta| < 1.52
- pT > 20 GeV and |eta| < 2.47
- Jets
- pT > 30 GeV and |eta| < 5.0
- b-jets (different from 10 TeV t-channel analysis)
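The muon, electron and jet cuts listed above can be written as simple predicates. This is a sketch only: the dictionary keys (`pt`, `eta`, `etcone20`, `etcone30`, `is_combined`) are hypothetical stand-ins for the D3PD branches, energies and momenta are assumed to be in GeV, and the b-jet definition is left out because it is not given here.

```python
def is_good_muon(mu):
    """Combined muon, etcone30 < 6 GeV, pT > 20 GeV, |eta| < 2.5."""
    return (mu["is_combined"]
            and mu["etcone30"] < 6.0
            and mu["pt"] > 20.0
            and abs(mu["eta"]) < 2.5)


def is_good_electron(el):
    """etcone20 < 6 GeV, outside the gap 1.37 < |eta| < 1.52,
    pT > 20 GeV, |eta| < 2.47."""
    abs_eta = abs(el["eta"])
    in_gap = 1.37 < abs_eta < 1.52
    return (el["etcone20"] < 6.0
            and not in_gap
            and el["pt"] > 20.0
            and abs_eta < 2.47)


def is_good_jet(jet):
    """pT > 30 GeV and |eta| < 5.0."""
    return jet["pt"] > 30.0 and abs(jet["eta"]) < 5.0


# Hypothetical objects, for illustration only.
muon = {"is_combined": True, "etcone30": 2.0, "pt": 35.0, "eta": -1.1}
electron = {"etcone20": 3.0, "pt": 40.0, "eta": 1.45}  # in the gap region
jet = {"pt": 45.0, "eta": 3.2}
print(is_good_muon(muon), is_good_electron(electron), is_good_jet(jet))
```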
Overlap Removal
Event Selection
Signal vs. Background Comparison
Splitting and Merging of MC Samples
Training
Determination of training variables
The training and validation process was repeated multiple times with different variable lists, allowing only a few differences in the classifier parameters (broad differences, so that not too many classifiers would be generated). The boosted decision tree classifier generally appeared to perform better for the shorter lists, so the study focused on this classifier. The initial list was formed by looking at plots of the variables after preselection (without the wtmass cut, at that time). These can be seen scaled (each histogram normalized to its area) and unscaled at
ScaledPlots: http://hep.pa.msu.edu/people/jenny/SBplots_1551_scaled.html
UnScaledPlots: http://hep.pa.msu.edu/people/jenny/SBplots_1551.html
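Scaling each histogram to its area lets the signal and background shapes be compared regardless of their yields. A minimal sketch, assuming the area is taken as the plain sum of bin contents (the convention used for the linked plots is not stated) and using made-up bin contents:

```python
def normalize_to_unit_area(bin_contents):
    """Divide every bin by the sum of all bins so the histogram sums to 1.

    An empty histogram (sum of zero) is returned unchanged.
    """
    total = sum(bin_contents)
    if total == 0:
        return list(bin_contents)
    return [c / total for c in bin_contents]


# Made-up bin contents for one variable, for illustration only.
signal = [2.0, 6.0, 2.0]
background = [50.0, 30.0, 20.0]
print(normalize_to_unit_area(signal))
print(normalize_to_unit_area(background))
```

After scaling, both histograms sum to one, so a variable with good separating power shows visibly different shapes for signal and background.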
Brief Code Explanations:
-- PatRyan - 03 Dec 2009