Installation¶
Installing from source usually works. Below are two examples that install MAMMOTH on Nvidia and AMD platforms.
Puhti / Mahti¶
This first example installs MAMMOTH on clusters with Nvidia GPUs (Puhti/Mahti).
Install¶
On the login node, create a directory under the projappl area linked to your project to host the shared Python dependencies, and install the code base and its dependencies there:
# where to install the necessary python packages
PROJECT=<your_project_name_is_your_account>
ENV_DIR="/projappl/${PROJECT}/test"
# where the codebase was copied to
CODE_DIR="/scratch/${PROJECT}/path/to/mammoth"
# set up variables & modules
module load pytorch
mkdir -p $ENV_DIR
export PYTHONUSERBASE=$ENV_DIR
# install the code base and its dependencies
cd $CODE_DIR
pip3 install -e . --user
# optionally, to make sure that other people can access this install:
chmod -R 777 $ENV_DIR
chmod -R 777 $CODE_DIR
Run¶
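In Slurm job scripts, set the following environment variables so that Python picks up the installed packages:
# PROJECT must be defined in the job script as well
PROJECT=<your_project_name_is_your_account>
ENV_DIR="/projappl/${PROJECT}/test"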
CODE_DIR="/scratch/${PROJECT}/path/to/mammoth"
module load pytorch
export PYTHONUSERBASE=$ENV_DIR
# note: this overwrites PYTHONPATH; you can also try appending this subdirectory instead
# (adjust python3.9 if the loaded pytorch module provides a different Python version)
export PYTHONPATH=$ENV_DIR/lib/python3.9/site-packages/
srun python3 -u $CODE_DIR/train.py ...
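If overwriting PYTHONPATH would hide paths already set by the loaded modules, the appending variant mentioned in the comment above could look like the following minimal sketch. It assumes ENV_DIR is defined as in the job script; the fallback value and the SITE_PACKAGES variable name are illustrative only and not part of MAMMOTH.

```shell
# Assumption: ENV_DIR is set as in the job script; the fallback below is a placeholder.
ENV_DIR="${ENV_DIR:-/projappl/project_example/test}"

# Hypothetical helper variable; python3.9 is copied from the example above.
SITE_PACKAGES="$ENV_DIR/lib/python3.9/site-packages"

# Append instead of overwrite: the ${VAR:+...} expansion adds the ":" separator
# only when PYTHONPATH is already set and non-empty.
export PYTHONPATH="${PYTHONPATH:+${PYTHONPATH}:}${SITE_PACKAGES}"

echo "$PYTHONPATH"
```

Appending keeps any entries that were already on PYTHONPATH ahead of the user install; put SITE_PACKAGES first instead if the packages in ENV_DIR should take precedence.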
LUMI¶
The GPU partition of the LUMI supercomputer is AMD-based. This example uses a Python virtual environment without a container. However, the LUMI User Support Team (LUST) recommends using a container instead. Instructions for installing MAMMOTH in a Singularity container will be released in the future.
Install¶
Start an interactive session:
srun --account="$PROJECT" --partition=dev-g --ntasks=1 --gres=gpu:mi250:1 --time=2:00:00 --mem=25G --pty bash
Load modules:
module load cray-python
module load LUMI/22.08 partition/G rocm/5.2.3
module use /pfs/lustrep2/projappl/project_462000125/samantao-public/mymodules
module load aws-ofi-rccl/rocm-5.2.3
Create a virtual environment:
python -m venv your_venv_name
and activate it:
source your_venv_name/bin/activate
Install PyTorch:
python -m pip install --upgrade torch==1.13.1+rocm5.2 --extra-index-url https://download.pytorch.org/whl/rocm5.2
Install MAMMOTH:
cd /pfs/lustrep1/projappl/${PROJECT}/${USER}/mammoth
pip3 install -e .
pip3 install sentencepiece==0.1.97 sacrebleu==2.3.1
Run¶
You can run training from Slurm job scripts, for example:
srun python -u $CODE_DIR/train.py ...
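A complete job script has to repeat the environment setup from the installation steps before calling srun. The sketch below writes such a script to a file and checks its syntax. The file name train_lumi.sh, the project_XXXXXXXXX placeholder, and the VENV_DIR variable are hypothetical. The resource values are simply copied from the interactive srun command above and are not tuned recommendations. CODE_DIR is assumed to point at the directory used in the install step.

```shell
# Write the job script to a hypothetical file name.
cat > train_lumi.sh <<'EOF'
#!/bin/bash
# Placeholder account: replace with your own project.
#SBATCH --account=project_XXXXXXXXX
# Resource values copied from the interactive session example.
#SBATCH --partition=dev-g
#SBATCH --ntasks=1
#SBATCH --gres=gpu:mi250:1
#SBATCH --time=2:00:00
#SBATCH --mem=25G

# Placeholder: same project as in the --account line.
PROJECT="project_XXXXXXXXX"
# Assumption: the code base lives where the install step placed it.
CODE_DIR="/pfs/lustrep1/projappl/${PROJECT}/${USER}/mammoth"
# Assumption: path to the virtual environment created during installation.
VENV_DIR="your_venv_name"

# Same modules as during installation.
module load cray-python
module load LUMI/22.08 partition/G rocm/5.2.3
module use /pfs/lustrep2/projappl/project_462000125/samantao-public/mymodules
module load aws-ofi-rccl/rocm-5.2.3

source "${VENV_DIR}/bin/activate"

# Replace "..." with your training arguments.
srun python -u "${CODE_DIR}/train.py" ...
EOF

# Check the generated script for shell syntax errors before submitting it.
bash -n train_lumi.sh
```

Once the placeholders are filled in, the script can be submitted with sbatch train_lumi.sh.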
For more details, see the tutorial section.