When performing an analysis, you often want to find a specific event file on the grid, but you only know its run number and event number and not the name of the file or dataset. This page explains how to find that file (or multiple files) and how to make event displays with them in Atlantis. You might even want to retrieve multiple event files simultaneously, which exist in multiple different datasets, again knowing only their run and event numbers. It is very difficult or even impossible to find those files using normal search methods (e.g. dq2-ls) on the grid; however, a set of commands exists within the framework of pathena that allows for such a search.
In addition, Pathena can automatically convert your events into the .xml file format required by Atlantis. The conversion takes place on the grid, and the output is accessible to the user once the job is complete. Pathena therefore does two things: it finds specific files given a list of run numbers and event numbers, and it converts those files to the appropriate format for Atlantis. Once the job has been completed, the user can use the usual commands (e.g. dq2-ls, dq2-get, etc.) to find and download the event files to their machine. Finally, with the right settings (detailed below), the user can have Atlantis run through the list of events and automatically make event display files.
Methods and Setup
You will need a GRID certificate, Athena, Panda, and, if you're making event displays, Atlantis (and/or VP1 and Persint).
To set up Athena:
alias asetup='source $AtlasSetup/scripts/asetup.sh'
Both of these are documented on the AtlasOflineAnalysis page.
The general format of the Pathena command is:
pathena readesd.py --eventPickEvtList EVENTS.txt --eventPickStreamName STREAMNAME --eventPickDataType FILETYPE --outDS user.GRIDNAME.OUTPUTNAME --extOutFile JiveXML_*.xml,ESD_*.root
- readesd.py: the required job options file. Make sure it is in the directory from which you run pathena, together with your event list file. It can be downloaded here: readesd.py.txt
- eventPickEvtList: specifies the name of the file that contains the list of run numbers and event numbers. This file should be in the directory from which you run pathena.
- eventPickStreamName: specifies the stream name of the dataset that your run numbers and event numbers are associated with. This is the most limiting factor in your search, because the search only looks for run/event numbers within one particular stream, e.g. "physics_Egamma". So make sure that all the events in your list belong to the same stream. If an event does not exist in the stream you specified, you'll get something like a "GUID" error, even if the run and event numbers you entered are correct.
- eventPickDataType: specifies the type of event file: ESD, AOD, RAW, etc. ESD is probably the way to go since VP1 runs on this format.
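- outDS: specifies the name of the output dataset on the grid. The dataset is named "user.GRIDNAME.OUTPUTNAME/". Don't forget the slash at the end when using "dq2-ls".
- extOutFile: specifies which extra output files the job should return, as a comma-separated list of file name patterns (here the JiveXML .xml files and the ESD .root files).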
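The options above can be collected in a small shell helper so that none of them is mistyped. This is only a sketch: the function name build_pathena_cmd and its argument order are made up for this page, the single quotes around the file patterns are my own precaution against the shell expanding them, and the function only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of pathena): prints the pathena command
# described above so it can be inspected before it is run.
# Arguments: event list file, stream name, data type, output dataset name.
build_pathena_cmd() {
    evt_list=$1
    stream=$2
    data_type=$3
    out_ds=$4
    # The extra-output patterns are single-quoted in the printed command so
    # the shell hands them to pathena unexpanded; they are joined by a comma
    # with no space in between.
    printf "pathena readesd.py --eventPickEvtList %s --eventPickStreamName %s --eventPickDataType %s --outDS %s --extOutFile 'JiveXML_*.xml,ESD_*.root'\n" \
        "$evt_list" "$stream" "$data_type" "$out_ds"
}

# Example: print the command with the placeholder names used on this page.
build_pathena_cmd EVENTS.txt physics_Egamma ESD user.GRIDNAME.OUTPUTNAME
```

Once the printed command looks right, it can be pasted into a shell where Athena and Panda are set up.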
A full list of Pathena commands can be found by typing
As an example of the above, you can type this to look for four Z' events. These are the two highest invariant mass and two highest pt events in the Z' analysis (as of 7/18/11).
pathena readesd.py --eventPickEvtList ZPrimeEvents.txt --eventPickStreamName physics_Egamma --eventPickDataType ESD --outDS user.cwillis.Z_prime_output_01 --extOutFile JiveXML_*.xml,ESD_*.root
My "ZPrimeEvents.txt" looks exactly like this:
The run numbers are on the left and the event numbers are on the right, separated by a single space. Avoid semicolons or anything else at the end of the line, and watch out for stray whitespace. If the file is not formatted exactly like this, you'll get a "GUID" error. Sometimes the event numbers are a digit longer than the ones above; that's OK.
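Since a badly formatted line only shows up later as a "GUID" error, it can save time to check the event list before submitting. The sketch below is my own suggestion, not something pathena provides: the function name check_event_list is made up, and the run and event numbers in the example file are placeholders, not real events.

```shell
#!/bin/sh
# Hypothetical checker (not part of pathena): prints every line of an event
# list that is not exactly "RUNNUMBER EVENTNUMBER" (digits, one space,
# digits, nothing else) and returns non-zero if any such line is found.
check_event_list() {
    # -n prefixes each offending line with its line number,
    # -v selects the lines that do NOT match,
    # -E enables extended regular expressions.
    if grep -n -v -E '^[0-9]+ [0-9]+$' "$1"; then
        echo "Fix the lines listed above in $1" >&2
        return 1
    fi
    return 0
}

# Example with made-up run and event numbers (placeholders, not real events).
# The second line ends in a semicolon, so it is reported.
printf '100001 1234567\n100002 7654321;\n' > example_events.txt
check_event_list example_events.txt || echo "example_events.txt needs fixing"
```

The pattern also rejects trailing spaces, tabs, double spaces and Windows line endings, which are easy to miss by eye.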
If you want .xml files in your output make sure
is set to true in the readesd.py job options file. If you want to change the type of output, you can also edit the job options file. I have it set to produce .xml files (for Atlantis) and .esd files (for VP1).
After you enter this command, Pathena sends the job to the grid to search for these files and convert them to the desired output. You can find out which datasets they come from by looking up your jobID on the Panda jobs page.
Getting your files
Once the job is complete, you can "dq2-get" it. The output files are stored together as a single dataset, so you can download the entire dataset, or individual files therein.
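Because the trailing slash on the dataset name is easy to forget, a tiny helper can add it before the name is passed to the DQ2 tools. This is a sketch only: with_trailing_slash is a made-up name, and the dq2-ls and dq2-get lines are left commented out because they need a machine with the DQ2 client and a grid proxy set up.

```shell
#!/bin/sh
# Hypothetical helper (not part of the DQ2 tools): appends the trailing
# slash to a dataset name if it is missing.
with_trailing_slash() {
    case $1 in
        */) printf '%s\n' "$1" ;;
        *)  printf '%s/\n' "$1" ;;
    esac
}

# Placeholder dataset name from this page; substitute your own --outDS value.
ds=$(with_trailing_slash user.GRIDNAME.OUTPUTNAME)
echo "$ds"

# On a machine with the DQ2 client tools set up, uncomment:
# dq2-ls "$ds"     # confirm that the output dataset exists
# dq2-get "$ds"    # download the whole dataset
```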
Once you have your files, you can use Atlantis (or VP1 or Persint) to make event displays. This process is detailed on the EventDisplaySoftware page.
Here are a few useful links that I used a lot to figure this stuff out:
Atlantis Event Display
How to submit Athena jobs to Panda
Dan's Coding Tips 'N Tricks
Starting on the Grid (i.e. getting your certificate)
DQ2 Clients How To
- 18 Jul 2011