The tags at the end of dataset names encode information on the geometry, database, calibration, and reconstruction versions used. There are three very useful resources for understanding them.
- AMI Database – The ATLAS Metadata Interface has a search feature that returns the reconstruction tags for a dataset, from which the tag information can then be retrieved. It is a good resource for quick information, such as the geometry used, and for determining which tags were used for given samples.
- Production Tag Database – The Monte Carlo central production database, which can be searched easily by the tag in question. It also returns the exact command-line options, so that one can reproduce a specific dataset or configuration exactly. Its knowledge of data tags is lacking, but it is an excellent MC resource.
- GetCommand.py AMI=tag – In a release set up with AtlasProduction, GetCommand.py returns the exact transform and flags used to produce a given tag. It is therefore extremely useful if one wishes to duplicate a dataset or configuration exactly, since it returns the transform precisely as it was executed on the command line. As noted on the link, the list of tags is not yet complete but is growing.
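As a sketch, with an AtlasProduction release already set up, the GetCommand.py invocation looks like this (the tag r1774 is an invented placeholder, not a tag taken from this page):

```shell
# Query the transform command and flags behind a given reconstruction tag
# (requires an AtlasProduction release environment; tag name is illustrative)
GetCommand.py AMI=r1774
```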
To set up an ATLAS grid environment on the Terriers, the prescription is slightly different from that on lxplus machines, owing to the 32-bit architecture and Python version compatibility. The correct commands, in order, are as follows:
- source /afs/cern.ch/user/a/angelos/ddmafs/DQ2Clients/setup_testing.sh py26
- voms-proxy-init -voms atlas
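Once the proxy has been initialized, it can be checked with a standard VOMS client command (a sketch; the exact output depends on your site and certificate):

```shell
# Confirm the ATLAS VOMS proxy exists and inspect its remaining lifetime
voms-proxy-info --all
```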
If one is starting from scratch, follow the detailed step-by-step instructions found here:
Briefly, the necessary pieces are:
- Obtain a US-Grid certificate, export it from your browser, and install it in your home area in a directory called .globus
- Join the ATLAS VO
- Prepare your grid certificate by generating usercert.pem and userkey.pem files.
- Initialize your user interface, as described in the link for lxplus and above for the Terriers. This must be done every time you want to use the grid.
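The certificate-preparation step above (generating usercert.pem and userkey.pem) can be sketched with standard openssl commands, assuming the browser export produced a PKCS#12 file called mycert.p12 (that filename is an assumption; yours may differ):

```shell
# Extract the certificate (public part) from the browser-exported PKCS#12 file
openssl pkcs12 -in mycert.p12 -clcerts -nokeys -out $HOME/.globus/usercert.pem
# Extract the private key; you will be prompted for the export password
openssl pkcs12 -in mycert.p12 -nocerts -out $HOME/.globus/userkey.pem
# Grid tools refuse to use a key with loose permissions
chmod 400 $HOME/.globus/userkey.pem
chmod 444 $HOME/.globus/usercert.pem
```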
Using the acmd.py command, one can filter ESDs given a list of events. Here is a prescription for doing so using the Grid. These commands have been tested and validated using release 126.96.36.199 on an lxplus node.
- Create a file called eventlist.txt containing the events you want to select. The format should have two columns: the first is the run number, the second the event number. This file must be in the local directory where you will issue the prun command.
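For example, a two-event selection from run 156682 would look like this (the event numbers are invented for illustration):

```shell
# Write a sample eventlist.txt: column 1 = run number, column 2 = event number
cat > eventlist.txt <<'EOF'
156682 1845290
156682 2273004
EOF
```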
- Set up your release and grid environment; I used 188.8.131.52.
- Issue the following command in a directory containing the file eventlist.txt:
%>prun --exec "acmd.py filter-files -o filter.ESD.pool.root -s eventlist.txt \`echo %IN | sed 's/,/ /g'\`" --outputs filter.ESD.pool.root --athenaTag=184.108.40.206,AtlasProduction --nFilesPerJob 12 --dbRelease LATEST --inDS ESDDATASET --outDS OUTPUTDATASETNAME
Here, change ESDDATASET to the dataset you wish to run on and OUTPUTDATASETNAME to the desired output dataset name. Everything else should remain the same.
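The backticked fragment in the --exec string deserves a word: prun substitutes %IN with a comma-separated list of the subjob's input files, while acmd.py filter-files expects them space-separated, so the sed call performs that conversion. A standalone demo (the file names are invented):

```shell
# prun would replace %IN with a comma-separated list such as this
IN="AOD.01.pool.root,AOD.02.pool.root"  # hypothetical file names
# sed turns the commas into spaces, giving acmd.py its file arguments
echo "$IN" | sed 's/,/ /g'
# prints: AOD.01.pool.root AOD.02.pool.root
```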
- When the prun jobs have completed, dq2-get the results using:
- cd to the directory of OUTPUTDATASETNAME
- Merge the output ESD files with the following command:
Merging_trf.py --omitvalidation=ALL inputESDFile=`ls user*.root* | tr -d '\n' | sed 's/.rootu/.root,u/g'` --ignoreerrors=ALL autoConfiguration=everything outputESDFile=Filter.merge.ESD.pool.root
This final command produces a single output file from all the subjob ESDs; it ignores ESDs with zero entries, and its run time depends on the number of ESDs. For run 156682, with 66 subjobs, it took ~10 min to run interactively on an lxplus node. In total the procedure took ~40 min from start to finish.
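The backticked pipeline in the merge command builds the comma-separated inputESDFile list from the downloaded subjob files: ls concatenates the names, tr strips the newlines, and sed reinserts a comma at each ".root"-to-"user" boundary. A standalone sketch with invented file names:

```shell
# Simulate two dq2-get'd subjob outputs in a scratch directory
demo=$(mktemp -d)
cd "$demo"
touch user.demo._00001.filter.ESD.pool.root user.demo._00002.filter.ESD.pool.root
# Join the names on one line and put a comma at every file boundary
ls user*.root* | tr -d '\n' | sed 's/.rootu/.root,u/g'
# prints: user.demo._00001.filter.ESD.pool.root,user.demo._00002.filter.ESD.pool.root
```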