==================
Perlmutter (NERSC)
==================

Platform user guide
===================

https://docs.nersc.gov/systems/perlmutter/

General description
===================

* Resource manager - ``SLURM``
* Launch methods (per platform ID)

  * ``nersc.perlmutter*`` - ``SRUN``

* Configuration per node (per platform ID)

  * ``nersc.perlmutter`` (3,072 nodes)

    * 128 CPU cores, each core has 2 threads (``SMT=2``)
    * 512 GiB of memory

  * ``nersc.perlmutter_gpu`` (1,792 nodes in total)

    * 64 CPU cores, each core has 2 threads (``SMT=2``)
    * 4 GPUs (NVIDIA A100)

      * 1,536 nodes with 40 GiB of HBM per GPU
      * 256 nodes with 80 GiB of HBM per GPU

    * 256 GiB of memory

.. note::

   Perlmutter uses the ``--constraint`` option in ``SLURM`` to specify node
   features (see the SLURM documentation for ``--constraint``). RADICAL-Pilot
   lets you provide such features in the corresponding configuration file.
   For example, 256 of the Perlmutter GPU nodes have 80 GiB of GPU-attached
   memory instead of 40 GiB (see "Specify a constraint during resource
   allocation" in the NERSC documentation). To request these nodes, update
   the corresponding configuration as follows:

   .. code-block:: bash

      mkdir -p ~/.radical/pilot/configs
      cat > ~/.radical/pilot/configs/resource_nersc.json <

Create a **virtual environment** with ``venv``:

.. code-block:: bash

   export PYTHONNOUSERSITE=True
   module load python
   python3 -m venv ve.rp
   source ve.rp/bin/activate

OR create a **virtual environment** with ``conda``:

.. code-block:: bash

   module load python
   conda create -y -n ve.rp python=3.9
   conda activate ve.rp

Install RADICAL-Pilot after activating the corresponding virtual environment:

.. code-block:: bash

   pip install radical.pilot
   # OR in case of conda environment
   conda install -c conda-forge radical.pilot

Launching script example
========================

A launching script (e.g., ``rp_launcher.sh``) for a RADICAL-Pilot application
contains the setup steps that activate the execution environment, followed by
the command that launches the application itself.

.. code-block:: bash

   #!/bin/sh

   # - pre run -
   module load python
   source ve.rp/bin/activate

   export RADICAL_PROFILE=TRUE
   # for debugging purposes
   export RADICAL_LOG_LVL=DEBUG

   # - run -
   python

Execute the launching script as ``./rp_launcher.sh`` or run it in the
background:

.. code-block:: bash

   nohup ./rp_launcher.sh > OUTPUT 2>&1 &
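
When the launching script runs in the background, it helps to record its
process ID so that you can check on it later. The following is a minimal sketch
and not part of RADICAL-Pilot: the helper names ``start_bg`` and ``is_running``
and the PID-file convention are assumptions made for illustration. Only the
standard ``nohup`` and ``kill`` utilities are used.

```shell
#!/bin/sh

# Hypothetical helper (not provided by RADICAL-Pilot): start a script in the
# background, detached from the terminal, and record its PID in a file.
#   $1 - script to run, $2 - log file, $3 - PID file
start_bg() {
    nohup "$1" > "$2" 2>&1 < /dev/null &
    echo $! > "$3"
}

# Hypothetical helper: succeed if the process recorded in the PID file exists.
#   $1 - PID file
is_running() {
    # "kill -0" delivers no signal; it only tests whether the PID exists
    kill -0 "$(cat "$1")" 2>/dev/null
}
```

With these helpers, ``start_bg ./rp_launcher.sh OUTPUT rp_launcher.pid``
starts the launcher, and ``is_running rp_launcher.pid && echo "still running"``
reports on it later, including from a new login session on the same node.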