Discover (NASA-NCCS)
Platform user guide
General description
Resource manager - SLURM
Launch methods (per platform ID)
  nccs.discover* - SRUN
Configuration per node (per platform ID)
  nccs.discover (676 nodes for Scalable Unit 16)
    48 CPU cores
    192 GB of memory
  nccs.discover_gpu_a100queue (12 nodes for Scalable Unit 16)
    48 CPU cores
    4 GPUs (NVIDIA A100)
    512 GB of memory
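As a rough sizing aid, the per-node figures above can be turned into a node count for a given request. The following is a minimal sketch in shell. The helper name nodes_needed and the example request sizes are assumptions made for illustration; they are not part of RADICAL-Pilot or of the NCCS tooling.

```shell
# Hypothetical helper: smallest number of nodes that provides the requested
# amount of a resource, given how much of that resource one node offers.
nodes_needed() {
    requested=$1
    per_node=$2
    # Integer ceiling division.
    echo $(( (requested + per_node - 1) / per_node ))
}

# Per-node figures taken from the configuration listed above.
CORES_PER_NODE=48
GPUS_PER_NODE=4

# Example requests (illustrative values only).
echo "nodes for 100 CPU cores: $(nodes_needed 100 "$CORES_PER_NODE")"
echo "nodes for 10 GPUs: $(nodes_needed 10 "$GPUS_PER_NODE")"
```

Rounding up matters here because a request that does not fill the last node still occupies it.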
Setup execution environment
Python virtual environment
Create a virtual environment with venv:
module load python/GEOSpyD/Min24.1.2-0_py3.12
python3.12 -m venv ve.rp
source ve.rp/bin/activate
Install RADICAL-Pilot after activating the virtual environment:
pip install radical.pilot
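Because the install should only happen inside the activated environment, a small guard can help catch a forgotten activation step. This is a minimal sketch, assuming the environment was created with venv as shown above; the helper name in_venv is hypothetical.

```shell
# Hypothetical helper: succeeds only when a virtual environment is active.
# The activate script of venv exports VIRTUAL_ENV, so an empty or unset
# value means that no environment has been activated in this shell.
in_venv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_venv; then
    echo "virtual environment active: ${VIRTUAL_ENV}"
else
    echo "no virtual environment active; run: source ve.rp/bin/activate"
fi
```

A check of this kind could be placed before the pip command so that the package is not installed into the Python installation provided by the module.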
Launching script example
A launching script (e.g., rp_launcher.sh) for a RADICAL-Pilot application
contains the setup steps that activate the execution environment, followed by
the command that launches the application itself.
#!/bin/bash
# - pre run -
module load python/GEOSpyD/Min24.1.2-0_py3.12
source ve.rp/bin/activate
export RADICAL_PROFILE=TRUE
# for debugging purposes
export RADICAL_LOG_LVL=DEBUG
export RADICAL_REPORT=TRUE
# - run -
python <rp_application>
Execute the launching script as ./rp_launcher.sh, or run it in the background:
nohup ./rp_launcher.sh > OUTPUT 2>&1 </dev/null &
# check the status of the script running:
# jobs -l
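Note that jobs -l only reports jobs started from the current shell session, so it no longer lists the launcher after logging out and back in. One possible workaround, sketched below, is to store the process ID when the script is started. The file name rp_launcher.pid and the helper is_running are assumptions, and sleep 1 stands in for ./rp_launcher.sh so that the sketch can be run anywhere.

```shell
# Sketch only: "sleep 1" stands in for ./rp_launcher.sh, and the file name
# rp_launcher.pid is a hypothetical choice.
workdir=$(mktemp -d)

nohup sleep 1 > "$workdir/OUTPUT" 2>&1 </dev/null &
echo "$!" > "$workdir/rp_launcher.pid"

# Hypothetical helper: succeeds while the recorded process still exists.
# "kill -0" delivers no signal; it only checks that the PID can be signalled.
is_running() {
    kill -0 "$(cat "$1")" 2>/dev/null
}

if is_running "$workdir/rp_launcher.pid"; then
    state_before="running"
else
    state_before="finished"
fi

# Wait for the background job; done here only to make the sketch deterministic.
wait

if is_running "$workdir/rp_launcher.pid"; then
    state_after="running"
else
    state_after="finished"
fi

echo "before wait: $state_before, after wait: $state_after"
rm -rf "$workdir"
```

In real use the wait call would be dropped, and the PID file would be kept next to OUTPUT so that a later login session can run the same check.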
Note
If you find any inaccuracy in this description, please report it to us by opening a ticket.