Introduction

This page describes how to make the DPDs necessary for our analysis. The DPDs are made with the TopPhysDPDMaker, which relies on the TopViewTools package, the successor to TopView for ATHENA versions 13 and greater. The instructions are for the MSU cluster at CERN.

Workspace Setup

Introduction

Information for the workspace setup was taken from the following pages:

Top Physics DPD Maker - Getting Started

ATLAS Workbook - Account Setup

Setup CMT Directory Structure

Create Directories

The compiled code takes up about 15 MB of disk space, and an output root file containing 1000 events takes up another 15 MB. If you have enough room under AFS, it is better to put the code there, since this allows the code to be accessed from lxplus and therefore jobs can be submitted to lxbatch; the data disks on the MSU cluster cannot be seen from lxplus. If you do not have enough AFS space, put the code on the data disk. You can still run jobs locally and submit them to the Grid.
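A quick way to check how much room you have before choosing a location (this is a sketch, not part of the original instructions; the fs command only exists on AFS client machines, so it is guarded here):

```shell
# Check available space before deciding where to put the code.
# 'fs listquota' reports the AFS volume quota (AFS client machines only);
# 'df -h' works for any local or data disk.
if command -v fs >/dev/null 2>&1; then
    fs listquota "$HOME"
fi
df -h .
```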

The current version of the code at the time this note was written was 13.0.40.
$> mkdir -p TopPhysDPDMaker/cmthome TopPhysDPDMaker/testarea/13.0.40/

Setup CMT

Create a file called "requirements" in the cmthome directory and paste the code listed below into this file.

$> cd TopPhysDPDMaker/cmthome
$> emacs requirements &

Code which goes into requirements (cut and paste):
set CMTSITE CERN
set SITEROOT /afs/cern.ch
macro ATLAS_DIST_AREA ${SITEROOT}/atlas/software/dist

# use optimised version by default
apply_tag  opt
apply_tag  runtime 
# simple workarea directories
apply_tag  simpleTest
apply_tag  oneTest 
apply_tag  setup
apply_tag  32

# Set the location of your preferred development area
macro ATLAS_GROUP_AREA "/afs/cern.ch/atlas/groups/PAT/Tutorial/EventViewGroupArea/EVTags-13.0.40.2"

macro ATLAS_TEST_AREA "" 13.0.40 "/work/jever/pryan/atlas/TopPhysDPDMaker/testarea/13.0.40/"

use AtlasLogin AtlasLogin-* $(ATLAS_DIST_AREA)

Edit the line
 macro ATLAS_TEST_AREA "" 13.0.40 "/work/jever/pryan/atlas/TopPhysDPDMaker/testarea/13.0.40/"
so that it corresponds to the release you are using and to your own test area directory.

Setup CMT Environment

In the cmthome directory do the following to get version v1r20p20080222 of CMT. Note that this command only works on the CERN cluster.
$> source /afs/cern.ch/sw/contrib/CMT/v1r20p20080222/mgr/setup.sh

Copy the required scripts into the cmthome directory.
$> cmt config

Set Group Area

This command has to be executed in each new shell before you start working:
$> source setup.sh -tag=13.0.40,groupArea

Get and Compile Packages

Checkout packages

Go to the testarea/13.0.40 directory:
$> cd ../testarea/13.0.40

Check out the packages:
$> cmt co -r TopPhysTools-13-00-40-06 PhysicsAnalysis/TopPhys/TopPhysTools
$> cmt co -r TopPhysDPDMaker-00-00-10 PhysicsAnalysis/TopPhys/TopPhysDPDMaker

Compile Packages

The Group Area must be set up before compiling and running (see above). If you have not done so already:
$> source setup.sh -tag=13.0.40,groupArea

For TopPhysTools:
$> cd PhysicsAnalysis/TopPhys/TopPhysTools/cmt
$> source setup.sh
$> cmt bro make

For TopPhysDPDMaker:
$> cd PhysicsAnalysis/TopPhys/TopPhysDPDMaker/cmt
$> source setup.sh
$> cmt bro make

Make D3PDs

Introduction

TopPhysDPDMaker makes D3PDs, which are flat nTuples similar to those produced previously with TopView. The information in this section was taken from the following pages:

TopPhysDPDMaker Page

D3PD Info Page

IN2P3 DPD Tutorial

ATLAS Physics Workbook - Batch Jobs

Produce D3PDs locally

The Group Area must be set up before making the D3PD (see above). If you have not done so already:
$> source setup.sh -tag=13.0.40,groupArea

In order to ensure that you have completed the above steps correctly, you should produce an example D3PD. Edit the file ElectroweakD3PD_topOptions.py by commenting out the line
if not "InFileNames"        in dir():      InFileNames = glob.glob("/tmp/ashibata/fdr08*")
and inserting the line
if not "InFileNames"        in dir():      InFileNames = ['/afs/cern.ch/atlas/maxidisk/d66/AOD.019335._00001.pool.root.4']

Run athena:
$> athena share/ElectroweakD3PD_topOptions.py &> ExampleD3PD_local.log

It will take approximately 6 minutes to run over the 1000 events using the machines in the MSU cluster at CERN. A root file called Electroweak.D3PD.aan.root will be produced. To open the root file, ROOT version 5.19.04 or later must be used; earlier versions cannot handle variables stored in a vector. Open a TBrowser in ROOT and make sure that the variables are filled.

Produce D3PDs using lxbatch

Create Executable File

Put the following in a file called, for example, lxbatchSub.sh in the run directory.

#!/bin/bash
source /afs/cern.ch/user/p/pryan/atlas/TopPhysDPDMaker/cmthome/setup.sh -tag=13.0.40,groupArea
cd /afs/cern.ch/user/p/pryan/atlas/TopPhysDPDMaker/testarea/13.0.40/PhysicsAnalysis/TopPhys/TopPhysDPDMaker/
athena.py share/ElectroweakD3PD_topOptions.py

Make this file executable:
$> chmod +x lxbatchSub.sh

Essential Batch System Commands

The commands necessary to submit and list jobs on the batch system are given below. For a more complete listing see the LSF Documentation.

Submit Jobs

$> bsub -q queue ExecutableFile
The valid queues are: 8nm (8 minutes), 1nh (1 hour), 8nh (8 hours), 1nd (1 day), and 1nw (1 week). The default queue is 8nm. The n stands for normalized.

List jobs

To see what jobs are running (list unfinished jobs):
$> bjobs

To list all jobs (finished and unfinished):
$> bjobs -a
Note that after a certain time the jobs are removed from the list.

To get more info about a job:
$> bjobs -l JobID

Produce an Example D3PD

Submit the job using the batch commands
$> bsub -q 1nh lxbatchSub.sh

An email notification will be sent when your job is finished. The root file Electroweak.D3PD.aan.root should be produced in the TopPhysDPDMaker directory. In addition, a directory with a name such as LSFJOB_2896624 will be created in the run/ directory, where the number in the directory name corresponds to the JobID in the batch system. This directory contains a file named STDOUT, which includes all the text that your job would normally have printed to the screen if run interactively.

Produce Many D3PDs

To automate the production of D3PDs with lxbatch, it is necessary to edit the job options file to match each input and output file. In addition, output root files must be moved to the lxplus tmp space, because your AFS space will otherwise quickly fill up. A script doing this is coming shortly.
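Until that script exists, a loop along the following lines can serve as a sketch. The work-area path, input file pattern, output location, and queue are all placeholders; the only grounded mechanism is overriding InFileNames with athena's -c option, which works because the job options only set InFileNames when it is not already defined (the `if not "InFileNames" in dir()` guard shown earlier).

```shell
#!/bin/bash
# Hypothetical driver: generate one lxbatch job per input AOD file and
# move each output D3PD off AFS to the tmp space.
WORKDIR=$HOME/atlas/TopPhysDPDMaker   # placeholder work-area path
INPUT='/afs/cern.ch/atlas/maxidisk/d66/AOD.019335._00001.pool.root.4'  # placeholder input(s)
TMPOUT=/tmp/$USER/D3PD
mkdir -p jobs "$TMPOUT"

n=0
for infile in $INPUT; do
    n=$((n + 1))
    job=jobs/job_${n}.sh
    # Each generated job sets up the environment, runs athena on one
    # file via the -c override, and moves the result off AFS.
    cat > "$job" <<EOF
#!/bin/bash
source $WORKDIR/cmthome/setup.sh -tag=13.0.40,groupArea
cd $WORKDIR/testarea/13.0.40/PhysicsAnalysis/TopPhys/TopPhysDPDMaker
athena.py -c "InFileNames=['$infile']" share/ElectroweakD3PD_topOptions.py
mv Electroweak.D3PD.aan.root $TMPOUT/Electroweak.D3PD.${n}.aan.root
EOF
    chmod +x "$job"
    # Submit only where the LSF client is present, so the loop can be
    # dry-run on machines without bsub.
    if command -v bsub >/dev/null 2>&1; then
        bsub -q 8nh "$job"
    else
        echo "generated $job (bsub not available, not submitted)"
    fi
done
```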

Produce D3PDs on the Grid

Grid Setup

Before using the Grid you must have a Grid Certificate and install it properly. The ATLAS Workbook Starting on the Grid page describes how to do this.

Use this Grid setup script, which gives you a proxy validity period of 90 hours instead of the default 12 hours:
$> source GridSetupAndProxy.sh
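The contents of GridSetupAndProxy.sh are not reproduced here; as a rough sketch, such a script typically loads the Grid UI environment and requests a long-lived VOMS proxy. The setup path below is an assumption, not the actual script:

```shell
# Hypothetical sketch of what GridSetupAndProxy.sh does.
source /afs/cern.ch/project/gd/LCG-share/current/etc/profile.d/grid-env.sh  # assumed path
voms-proxy-init -voms atlas -valid 90:00   # 90-hour proxy instead of the 12-hour default
```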

Submit jobs to Grid

The Group Area must be set up before submitting jobs (see above). If you have not done so already:
$> source setup.sh -tag=13.0.40,groupArea

In order to submit jobs, you must have a directory TopPhysDPDMaker/testarea/13.0.40/InstallArea/python in your work area. Failure to have such a directory will cause errors such as
OSError: [Errno 2] No such file or directory: '/afs/cern.ch/user/p/pryan/atlas/TopPhysDPDMaker/testarea/13.0.40/InstallArea/python'

To submit an example job to PANDA using pathena,
$> pathena -c "doStream=True" TopPhysDPDMaker/ElectroweakD3PD_topOptions.py  --inDS fdr08_run1.0003077.StreamEgamma.merge.AOD.o1_r12_t1 --outDS your_name_and_job_name --site ANALY_MWT2
The option --inDS specifies the input data set and the option --outDS specifies the name of the output data set. It seems that files stored under AFS are not valid as input data sets; you must specify a data set already on the Grid. The option --site specifies the Grid site. A full list of Grid sites is available at http://pandamon.usatlas.bnl.gov:25880/server/pandamon/query?dash=analysis. The doStream=True option is specific to FDR stream files.

Check Status of Jobs on the Grid

Job status can be checked on the Panda Monitor. You can use the search function with the JobID and PandaID of your jobs (these are printed to STDOUT when you submit the job), or browse jobs by status. The meaning of each status is described here.

Get Files off the Grid

First, check that the data set exists (wildcards are allowed):
$>  dq2_ls user.PatrickRyan.test.DP3D.001
user.PatrickRyan.test.DP3D.001

Then check which root files are in the data set:
$> dq2_ls -g user.PatrickRyan.test.DP3D.001
user.PatrickRyan.test.DP3D.001   Total: 3
    user.PatrickRyan.test.DP3D.001.AANT0._00001.root
    user.PatrickRyan.test.DP3D.001.META0._00001.root
    user.PatrickRyan.test.DP3D.001._12453403.log.tgz

The files are retrieved using the dq2_get command:
$> dq2_get -r user.PatrickRyan.test.DP3D.001 user.PatrickRyan.test.DP3D.001.AANT0._00001.root
The -r option is for remote file transfer. The AANT0 root file is the one that contains the information we are interested in analyzing.

More Info

My jobs were failing on the Grid: the data sets could not be found, and I was getting the following errors in my log files:
11 Jun 2008 11:01:55| !!FAILED!!2999!! Missing input file(s) in xml: [AOD.017323._00013.pool.root.2, AOD.017323._00014.pool.root.1, AOD.017323._00017.pool.root.1, AOD.017323._00019.pool.root.2]
11 Jun 2008 11:01:55| Number of transferred root files    : 4
11 Jun 2008 11:01:55| Number of transferred non-root files: 1
11 Jun 2008 11:01:55| Mover get_data finished (failed)
11 Jun 2008 11:01:55| Will return fail code = 117, pilotErrorDiag = Missing input file(s) in xml: [AOD.017323._00013.pool.root.2, AOD.017323._00014.pool.root.1, AOD.017323._00017.pool.root.1, AOD.017323._00019.pool.root.2]

See the Top FDR Page for more info.

-- PatRyan - 30 May 2008
Topic revision: r15 - 16 Oct 2009, TomRockwell
 
