nEXO Simulation with nexo-offline

Introduction

This is a brief note on running simulations with nexo-offline on the SLAC clusters. It assumes a basic familiarity with bash scripting.

nexo-offline

nexo-offline is the simulation framework designed to simulate the nEXO experiment: the particles involved and their interactions with the detector. It is mostly written in [[C++]] with some parts in [[python programming language]]. Being written in [[C++]], the project needs to be compiled before we can use it. It has a few other dependencies:

  1. ROOT: [[ROOT]] is a software framework developed at [[CERN]]. It compactly stores large datasets in a file format called a ROOT file (extension .root). It also has many physics analysis features that are used in the simulation and analysis of large datasets.

  2. Geant4: Geant4 is a particle-physics simulation toolkit developed by an international collaboration hosted at [[CERN]]. It is the actual simulation engine, handling the particle generation, the interactions, and the detector geometry. In a sense nexo-offline is itself an extension of Geant4 for nEXO-specific geometry and particles.

  3. VGM: The Virtual Geometry Model is a geometry management package written in [[C++]]. It builds on Geant4, and nexo-offline depends on it.

  4. SNiPER Framework: SNiPER is an open-source task scheduler for large physics projects. It is written in [[C++]] and provides python interfaces to manage and dispatch multiple tasks in a simulation process.

  5. Boost: Boost is a widely used collection of [[C++]] libraries providing general-purpose utilities and scientific functions.
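
Before building anything by hand, it can be useful to check which of these dependencies are already discoverable on the cluster. A minimal sketch, assuming the standard config helpers each package ships (root-config for ROOT, geant4-config for Geant4); a "missing" line just means the tool is not on the current PATH:

```shell
# Report which nexo-offline prerequisites are visible on the current PATH.
# root-config and geant4-config are the helper scripts shipped by ROOT and
# Geant4 respectively.
for tool in root-config geant4-config python3; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done
```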

Running nEXO particle simulation

To run a particle simulation in nEXO we need to supply a macro file, whose format is a superset of the [[geant4]] macro format. Among other scripts, nexo-offline contains a python script called RunDetSim_new.py (Run detector simulation). Using it is simply:

    python RunDetSim_new.py --evtmax <N> --mac <mac_file> --output <opfile> ...

This python script uses the compiled [[nexo-offline]] package to run the simulation with the configuration provided in <mac_file>; it generates <N> events and saves them in <opfile>. To see the exact parameters the script takes, we can read the script or run

    python RunDetSim_new.py --help

The --help switch lists the many other parameters the script accepts.
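
For batch production it is common to wrap this call in a loop, varying the random seed so the runs are statistically independent. A sketch; the --seed flag and the macro name Th232.mac are assumptions here, so check python RunDetSim_new.py --help for the exact flag names on your version. The commands are echoed as a dry run; drop the echo to actually launch them:

```shell
# Dry run: print one RunDetSim_new.py command per seed.
# --seed and Th232.mac are illustrative; verify against --help.
for seed in 1 2 3; do
    echo python RunDetSim_new.py --evtmax 1000 --seed "$seed" \
        --mac Th232.mac --output "run_seed${seed}.root"
done
```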

Analysis with the simulation

Once the simulation is done and we have the simulation file <opfile>, we move on to analysis. The analysis may involve multiple steps depending on the type of analysis we need.

Compiling nexo-offline

Compiling a large [[C++]] package can be daunting because the build needs to find all the dependencies, in the correct versions, at the expected locations. [[Geant4]], mentioned earlier, is itself a sizeable [[C++]] package that needs compilation, and the same goes for the [[SNiPER Framework]], which has dependencies of its own. So in principle we can collect all the dependencies, compile each of them individually, and then compile nexo-offline; in fact a lot of people within the collaboration do this (including myself). But collecting and compiling all these dependencies is time consuming and error prone.

nEXO provides a so-called singularity image with all the required dependencies compiled within it. Using the singularity image, we don't have to worry about collecting or compiling the correct versions of the dependencies.

Using singularity image

A singularity image is a package file with the extension .sif which can be used to create a sort of virtual environment within an operating system. Let's say the host system has python2.7 while nexo-offline requires python3.7; we can then find a singularity image on the SLAC cluster with python3.7 in it. The SLAC clusters like centos7 and rhel6-64 provide the singularity executable. To initiate this virtual environment we use the singularity image like:

    singularity shell <full_path_to/image.sif>

Much like doing ssh into a remote cluster, this takes us to a new linux environment, and by default the session looks like:

    user@centos7:~$ python --version
    Python 2.7.15
    user@centos7:~$ singularity shell <full_path_to/image.sif>
    Singularity> cd /nfs # notice the prompt change here
    bash: cd: /nfs: No such file or directory
    Singularity> python --version
    Python 3.7.7

This prompt is also a bash prompt (by default) and we can run any linux command we normally would. Notice how the python version on the host is 2.7.15 while within the singularity image it is different (assuming the image used has 3.7.7). However, this prompt does not have access to the files or programs of the host (centos7). The way to access files on the host is to mount a host path into the singularity environment: the -B parameter takes a path on the host system and binds it into the image, where we can then use it.

    user@centos7:~$ singularity shell -B /nfs/slac/g/exo/username <full_path_to/image.sif>
    Singularity> cd /nfs
    Singularity> # This time we are within /nfs directory of the host
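
-B also accepts several paths at once, comma separated, and a src:dst pair remaps a host directory to a different location inside the image. A sketch with placeholder paths, echoed as a dry run so nothing is actually launched:

```shell
# Dry run: bind two host paths at once; /scratch/username would appear
# as /data inside the image. Paths and image name are placeholders.
BINDS="/nfs/slac/g/exo/username,/scratch/username:/data"
echo singularity shell -B "$BINDS" /path/to/image.sif
```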

We can take a look at the various parameters singularity takes in its documentation. Several singularity image files are available on the SLAC cluster, and the software wrapped inside differs from image to image.

Singularity command from outside

It is sometimes inconvenient to always shell into the singularity image and work within it, because the configuration differs between images and not all the programs on the host are available within the image. Instead, we can use the singularity exec command to execute a command inside the singularity image from the host:

    user@centos7:~$ singularity exec <full_path_to/image.sif> python --version
    Python 3.7.7
    user@centos7:~$ python --version
    Python 2.7.15
    user@centos7:~$

In the above example the command python --version is run inside the singularity image: everything after the image name <full_path_to/image.sif> is executed within the image.
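
One caveat: singularity exec takes a single command, so pipelines or redirections have to be wrapped in bash -c. The wrapping itself can be tried on the host without singularity; inside the image you would just prefix the same line with singularity exec <full_path_to/image.sif>:

```shell
# bash -c runs its whole argument as one shell command, so the pipe
# survives; without the wrapper, exec would see only the first word.
bash -c 'echo "Python 3.7.7" | cut -d" " -f2'   # prints 3.7.7
```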

Compiling nexo-offline within singularity image

There is very clear documentation on how to compile the nexo-offline package within singularity. We might have to use a different singularity image file than the one given in the documentation, but everything else should work as described.

Running simulation within singularity image

Once nexo-offline is compiled we can run simulations within the singularity environment. There is documentation on using yaml cards to run simulations, and there are example .yaml cards which can be modified as needed. One example of submitting a simulation job is:

    Singularity> python /path/to/SubmitSimulation.py -c path/to/yamlfile.yaml -s

We can take a look at SubmitSimulation.py: it takes the yaml card as a parameter and produces bash scripts that run the actual steps of the analysis (simulation, reconstruction etc.), and those bash scripts are then finally run. The yaml card is thus a way of organizing all the simulation and analysis parameters in a single location. The script creates directories for the bash jobs and puts the scripts in them: by default the bash jobs go in the output/jobs directory and the corresponding nexo-offline macros in output/g4, which is also where the simulation output .root files are written. It is instructive to look at the example yaml file; most of the configuration options there are nicely commented and the organization is intuitive.
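
Since the example cards are mostly comments, a quick way to see only the active settings is to filter the comment lines out. A sketch; the card written here is a made-up two-field stand-in rather than the real card schema, so point the grep at your actual yaml card:

```shell
# Write a tiny stand-in card so the example is self-contained; the field
# names are illustrative, not the real nexo-offline schema.
cat > example_card.yaml <<'EOF'
# number of events per job (illustrative field)
evtmax: 1000
# generator macro (illustrative field)
macro: Th232.mac
EOF
# Show only the non-comment lines.
grep -v '^[[:space:]]*#' example_card.yaml
```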

As mentioned earlier, logging into the singularity image can be limiting in terms of accessing files and programs on the host. So, as explained earlier, we can run the command from the host with singularity exec without actually shelling into the image.

Running a job from the host without logging into the singularity image

We can submit the job from the host within singularity, and still remain on the host, like this:

    user@centos7:~$ singularity exec <full_path_to/image.sif> \
        python /path/to/SubmitSimulation.py -c path/to/yamlfile.yaml -s

In principle this would work, but in this case it throws an error, because we still need to set up environment variables within singularity so that it can actually find the libraries and executables necessary to run the SubmitSimulation.py script.

One way I get around this limitation is by creating a bash script that is run within the singularity image: the script first sets all the environment variables and then runs the simulation submission line:

    user@centos7:~$ cat singularity_run.sh
    #!/usr/bin/bash

    yaml=${1}  # the yaml card, passed as the first argument

    # set up the environment for the dependencies and nexo-offline
    source /opt/nexo/software/setup.sh
    source /opt/nexo/software/sniper-install/setup.sh
    source /path/to/compiled/nexo-offline-build/setup.sh

    python /path/to/SubmitSimulation.py -c ${yaml} -s

    user@centos7:~$ singularity exec <full_path_to/image.sif> \
        bash singularity_run.sh path/to/yamlfile.yaml

This runs the bash script (singularity_run.sh) within the singularity image: the script first sources the setup scripts that configure the environment variables for all the libraries and programs, then runs the python submission line. In the above example the yaml card is passed to the bash script as a parameter from the host OS (centos7).