...
Austin's own Advanced Micro Devices (AMD) has most generously donated a number of GPU-enabled servers to UT.
While AMD GPUs still do not support as many third-party applications as NVIDIA GPUs, they do support many popular Machine Learning (ML) applications such as TensorFlow, PyTorch, and AlphaFold, and Molecular Dynamics (MD) applications such as GROMACS, all of which are installed and ready for use.
Our recently announced AMD GPU pod is available for instructional use and for research use for qualifying UT-Austin affiliated PIs. To request an allocation, contact us at rctf-support@utexas.edu, and provide the UT EIDs of those who should be granted access.
...
The AlphaFold protein structure solving software is available on all AMD GPU servers. The /stor/scratch/AlphaFold directory holds the large required database, under the data.4 sub-directory. There is also an AMD example script, /stor/scratch/AlphaFold/alphafold_example_amd.sh, and an alphafold_example_nvidia.sh script if the pod also has NVIDIA GPUs (e.g. the Hopefog pod). Interestingly, our timing tests indicate that AlphaFold performance is quite similar on all the AMD and NVIDIA GPU servers.
On AMD GPU servers, AlphaFold is implemented by a run_alphafold.py Python script inside a Docker image. See the run_alphafold_rocm.sh and run_multimer_rocm.sh scripts under /stor/scratch/AlphaFold for a complete list of options to that script.
PyTorch and TensorFlow
Two Python scripts are located in /stor/scratch/GPU_info that can be used to ensure you have access to the server's GPUs from TensorFlow or PyTorch. Run them from the command line using time to compare the run times.
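For a quick check before running the provided scripts, the minimal sketch below reports how many GPUs each framework can see. It is not one of the scripts in /stor/scratch/GPU_info; the function name gpu_report is hypothetical. It assumes a ROCm build of PyTorch, which exposes AMD GPUs through the torch.cuda interface, and reports a framework as unavailable if it cannot be imported in the current Python environment.

```python
import importlib


def gpu_report():
    """Return {framework: visible GPU count, or None if the framework is not importable}."""
    report = {}

    try:
        torch = importlib.import_module("torch")
        # ROCm builds of PyTorch expose AMD GPUs through the torch.cuda API.
        if torch.cuda.is_available():
            report["pytorch"] = torch.cuda.device_count()
        else:
            report["pytorch"] = 0
    except ImportError:
        report["pytorch"] = None

    try:
        tf = importlib.import_module("tensorflow")
        # Lists the GPU devices TensorFlow has registered.
        report["tensorflow"] = len(tf.config.list_physical_devices("GPU"))
    except ImportError:
        report["tensorflow"] = None

    return report


if __name__ == "__main__":
    for name, count in gpu_report().items():
        if count is None:
            print(f"{name}: not installed in this environment")
        else:
            print(f"{name}: {count} GPU(s) visible")
```

A count of 0 for an installed framework suggests that the framework was not built with GPU support or that your account does not have access to the GPU devices.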
...
- Use top to monitor running tasks (or top -i to exclude idle processes)
- commands while top is running include:
- M - sort task list by memory usage
- P - sort task list by processor usage
- N - sort task list by process ID (PID)
- T - sort task list by run time
- 1 - show usage of each individual hyperthread
- they're called "CPUs" but are really hyperthreads
- this list can be long; non-interactive mpstat may be preferred
- htop is another popular program for monitoring running processes
- Use mpstat to monitor overall CPU usage
- mpstat -P ALL to see usage for all hyperthreads
- mpstat -P 0 to see specific hyperthread usage
- Use free -g to monitor overall RAM and swap space usage (in GB)
- Use rocm-smi to see GPU usage
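The non-interactive commands above can also be captured together from a script, for example to log resource usage alongside a long-running job. The sketch below is one possible approach: the names COMMANDS and snapshot are hypothetical, and any command not found on the PATH is skipped rather than treated as an error.

```python
import shutil
import subprocess

# Commands taken from the list above; the labels are arbitrary.
COMMANDS = {
    "cpu": ["mpstat", "-P", "ALL"],
    "memory": ["free", "-g"],
    "gpu": ["rocm-smi"],
}


def snapshot(commands=COMMANDS, timeout=30):
    """Run each command once and return {label: stdout text, or None if unavailable}."""
    results = {}
    for label, argv in commands.items():
        # Skip tools that are not installed on this server.
        if shutil.which(argv[0]) is None:
            results[label] = None
            continue
        try:
            done = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout
            )
            results[label] = done.stdout
        except (subprocess.TimeoutExpired, OSError):
            results[label] = None
    return results


if __name__ == "__main__":
    for label, output in snapshot().items():
        print(f"== {label} ==")
        print(output if output is not None else "(command not available)")
```

Calling snapshot() at intervals and appending the output to a file gives a simple usage record without keeping top open in a terminal.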
...
- ROCm Video series
- https://community.amd.com/t5/instinct-accelerators-blog/rocm-open-software-ecosystem-for-accelerated-compute/ba-p/418720
- Especially the Introduction to AMD GPU Hardware: Link
- Provides hardware background and terminology used throughout other guides
- Also
- AMD ROCm resources Learning Center: https://developer.amd.com/resources/rocm-resources/rocm-learning-center/
...