GPU servers
The BRCF has access to several GPU servers. See:
AMD GPU servers
NVIDIA GPU servers
Common Resources
Testing GPU access
To verify GPU access, run the rocm-smi command (AMD GPU servers) or the nvidia-smi command (NVIDIA GPU servers).
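If a scripted check is preferred (for example, at the top of a job script), a minimal sketch along the following lines can be used. It assumes only that rocm-smi or nvidia-smi is on the PATH, and the function name check_gpu_access is illustrative rather than part of any BRCF tooling:

import shutil
import subprocess

def check_gpu_access():
    # Try the NVIDIA utility first, then the AMD one; report whichever is present.
    for tool in ("nvidia-smi", "rocm-smi"):
        if shutil.which(tool):
            result = subprocess.run([tool], capture_output=True, text=True)
            print(f"{tool} output (exit status {result.returncode}):")
            print(result.stdout)
            return tool
    print("Neither nvidia-smi nor rocm-smi was found; this may not be a GPU server.")
    return None

if __name__ == "__main__":
    check_gpu_access()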
GPU-enabled software
See the server-specific pages for how to run TensorFlow, PyTorch, AlphaFold, and GROMACS.
Sharing resources
Since there is no batch system on BRCF POD compute servers, it is important that users monitor their own resource usage, and that of other users, in order to share resources appropriately. The commands below help with this; a scripted summary is sketched after the list.
- Use top to monitor running tasks (or top -i to exclude idle processes)
  - useful commands while top is running include:
    - M - sort task list by memory usage
    - P - sort task list by processor usage
    - N - sort task list by process ID (PID)
    - T - sort task list by run time
    - 1 - show usage of each individual hyperthread
      - they're called "CPUs" but are really hyperthreads
      - this list can be long; the non-interactive mpstat may be preferred
- Use mpstat to monitor overall CPU usage
  - mpstat -P ALL to see usage for all hyperthreads
  - mpstat -P 0 to see usage for a specific hyperthread (here, hyperthread 0)
- Use free -g to monitor overall RAM memory and swap space usage (in GB)
- Use rocm-smi (AMD GPUs) or nvidia-smi (NVIDIA GPUs) to see GPU usage
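For a quick, non-interactive snapshot before starting a large job, a small Python sketch like the one below reports the load averages and available RAM using only the standard library and /proc/meminfo. It is an illustration of the kind of check that is useful on a shared server, not BRCF-provided tooling:

import os

def summarize_usage():
    # Load averages and hyperthread count, roughly what top's header shows.
    ncpu = os.cpu_count()
    load1, load5, load15 = os.getloadavg()
    print(f"Hyperthreads: {ncpu}; load averages (1/5/15 min): "
          f"{load1:.1f} / {load5:.1f} / {load15:.1f}")

    # Memory figures from /proc/meminfo (values are in kB), similar to free -g.
    meminfo = {}
    with open("/proc/meminfo") as fh:
        for line in fh:
            key, value = line.split(":", 1)
            meminfo[key] = int(value.split()[0])
    total_gb = meminfo["MemTotal"] / 1024**2
    avail_gb = meminfo["MemAvailable"] / 1024**2
    print(f"RAM: {avail_gb:.1f} GB available of {total_gb:.1f} GB total")

if __name__ == "__main__":
    summarize_usage()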