Amarel (Rutgers)

Platform user guide

https://sites.google.com/view/cluster-user-guide

General description

  • Resource manager - SLURM

  • Launch methods (per platform ID)

    • rutgers.amarel - SRUN

  • Configuration per node (per queue)

    • main queue (388 nodes)

      • 12-56 CPU cores

      • 64-256 GB of memory

    • gpu queue (38 nodes)

      • 12-52 CPU cores

      • 2-8 GPUs

      • 64-1500 GB of memory

    • mem queue (7 nodes)

      • 28-56 CPU cores

      • 512-1500 GB of memory

  • Other available queues

    • nonpre

      • jobs will not be preempted by higher-priority or owner jobs

    • graphical

      • specialized partition for jobs submitted by the OnDemand system

    • cmain

      • “main” partition for the Amarel resources located in Camden

Note

To access the Amarel cluster, you must be connected to the Rutgers Virtual Private Network (VPN) with a valid Rutgers NetID.

Note

Amarel uses the --constraint option in SLURM to select nodes by their features (SLURM constraint). RADICAL-Pilot lets you provide such features in a corresponding configuration file. For example, to select nodes with the “skylake” and “oarc” features (see the list of Available compute hardware), follow these steps:

mkdir -p ~/.radical/pilot/configs
cat > ~/.radical/pilot/configs/resource_rutgers.json <<EOF
{
    "amarel": {
        "system_architecture": {"options": ["skylake", "oarc"]}
    }
}
EOF
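A malformed configuration file is easy to produce when editing JSON by hand, so it can help to check the file before starting a run. The following is a minimal sketch, not part of RADICAL-Pilot: check_rp_config is a hypothetical helper name, and the sketch assumes python3 is on the PATH (for example after module load python). It only verifies that the file parses as JSON, not that the listed features exist on Amarel.

```shell
# Hypothetical helper (not provided by RADICAL-Pilot): verify that a
# resource configuration file exists and parses as JSON.
# Assumption: python3 is available on PATH.
check_rp_config() {
    # $1: path to the config file; defaults to the path used in this guide
    cfg="${1:-$HOME/.radical/pilot/configs/resource_rutgers.json}"
    if python3 -m json.tool "$cfg" > /dev/null 2>&1; then
        echo "OK: $cfg is valid JSON"
    else
        echo "ERROR: $cfg is missing or not valid JSON" >&2
        return 1
    fi
}
```

Calling check_rp_config without arguments checks the file created in the steps above.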

Setup execution environment

Python virtual environment

Create a virtual environment with venv:

export PYTHONNOUSERSITE=True
module load python
# OR
#   module use /projects/community/modulefiles
#   module load python/3.9.6-gc563
python3.9 -m venv ve.rp
source ve.rp/bin/activate

Install RADICAL-Pilot after activating the virtual environment:

pip install radical.pilot
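To confirm that the installation landed in the virtual environment rather than in a system or user site, a quick check such as the following may be useful. It is a sketch under stated assumptions: in_venv and has_module are hypothetical helper names, and python3 is assumed to resolve to the interpreter of the activated environment.

```shell
# Hypothetical helpers, not part of RADICAL-Pilot.

# Succeeds if python3 runs inside a virtual environment created with venv
# (its prefix then differs from the base interpreter prefix).
in_venv() {
    python3 -c 'import sys; sys.exit(0 if sys.prefix != sys.base_prefix else 1)'
}

# Succeeds if the named Python module can be imported by python3.
has_module() {
    python3 -c 'import importlib, sys; importlib.import_module(sys.argv[1])' "$1" 2>/dev/null
}

in_venv && echo "virtual environment: active" || echo "virtual environment: NOT active"
has_module radical.pilot && echo "radical.pilot: found" || echo "radical.pilot: NOT found"
```

If either line reports a problem, re-run source ve.rp/bin/activate and repeat the installation step.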

Launching script example

The launching script (e.g., rp_launcher.sh) for a RADICAL-Pilot application sets up and activates the execution environment and then runs the command that launches the application itself.

#!/bin/bash

# - pre run -
module load python
source ve.rp/bin/activate

export RADICAL_PROFILE=TRUE
# for debugging purposes
export RADICAL_LOG_LVL=DEBUG

# - run -
python <rp_application>

Execute the launching script as ./rp_launcher.sh, or run it in the background:

nohup ./rp_launcher.sh > OUTPUT 2>&1 </dev/null &
# check the status of the running script:
#   jobs -l
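Note that jobs -l only reports jobs started from the current shell, so it no longer helps after you log out and back in. One possible workaround, sketched below, is to record the process ID at launch time and check it later. The helper name launcher_status and the file name rp_launcher.pid are assumptions made for illustration; RADICAL-Pilot does not create such a file.

```shell
# Hypothetical helper: report whether the process recorded in a PID file
# is still alive. "rp_launcher.pid" is an assumed file name.
launcher_status() {
    pidfile="${1:-rp_launcher.pid}"
    # kill -0 sends no signal; it only tests whether the process exists
    if [ -r "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        echo "running (PID $(cat "$pidfile"))"
    else
        echo "not running"
    fi
}

# Usage (commented out so that the helper can be sourced without side effects):
#   nohup ./rp_launcher.sh > OUTPUT 2>&1 </dev/null &
#   echo $! > rp_launcher.pid      # $! is the PID of the last background job
#   launcher_status rp_launcher.pid
#   tail -f OUTPUT                 # follow the output of the script
```

A process ID can be reused by the system after the original process exits, so treat a "running" result from an old PID file with some caution.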

Note

If you find any inaccuracy in this description, please report it to us by opening a ticket.