Two BRCF research pods have NVIDIA GPU servers; however, their use is restricted to the groups that own those pods.
Servers
Hopefog pod
hfogcomp04.ccbb.utexas.edu compute server on the Hopefog pod (Ellington/Marcotte):
- Dell PowerEdge R750XA
- dual 24-core/48-thread CPUs (48 cores, 96 hyperthreads total)
- 512 GB RAM
- 2 NVIDIA Ampere A100 GPUs with 32 GB onboard RAM each
Wilke pod
wilkcomp03.ccbb.utexas.edu compute server on the Wilke pod:
- GIGABYTE MC62-G40-00 workstation
- AMD Ryzen 5975WX CPU (32 cores, 64 hyperthreads total)
- 512 GB RAM
- 1 NVIDIA RTX 6000 GPU
Resources
Tests
Use nvidia-smi to verify access to the server's GPUs.
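As a convenience, the nvidia-smi check can be wrapped in a small Python helper that reports the GPUs it finds, or a diagnostic message if the tool is not on the PATH. This is a minimal sketch (the query fields shown are standard nvidia-smi options, but the exact output depends on the installed driver):

```python
import shutil
import subprocess

def gpu_check():
    """Return nvidia-smi's GPU summary, or a diagnostic string if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return "nvidia-smi not found: GPU driver not installed or not on PATH"
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv"],
        capture_output=True, text=True,
    )
    # On success the GPU table is on stdout; on failure the error is on stderr
    return result.stdout or result.stderr

print(gpu_check())
```

On a working GPU server this prints one line per GPU with its name and total memory.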
Two Python scripts in /stor/scratch/GPU_info can be used to verify that you have access to the server's GPUs. Run them from the command line under time to compare run times.
- TensorFlow
- time ( python3 /stor/scratch/GPU_info/tensorflow_example.py )
- should take 30 seconds or less with a GPU, and more than a minute with CPUs only
- this is a simple test; on CPU-only servers it uses multiple cores, while the GPU run uses only one GPU, which is one reason the run times do not differ more
- PyTorch
- time ( python3 /stor/scratch/GPU_info/pytorch_example.py )
- takes ~30 seconds or less to complete on wilkcomp03
- takes ~1 minute to complete on hfogcomp04
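The shell pattern time ( python3 script.py ) used above can also be reproduced in Python, which is handy for scripting repeated comparisons. A minimal sketch using only the standard library (the command shown is a trivial placeholder; substitute either example script's path to time it):

```python
import subprocess
import sys
import time

def timed_run(cmd):
    """Run a command and return its wall-clock time in seconds,
    analogous to `time ( ... )` in the shell."""
    start = time.perf_counter()
    subprocess.run(cmd, check=False)
    return time.perf_counter() - start

# Placeholder command; replace with e.g.
# [sys.executable, "/stor/scratch/GPU_info/tensorflow_example.py"]
elapsed = timed_run([sys.executable, "-c", "pass"])
print(f"elapsed: {elapsed:.2f}s")
```

Comparing the elapsed times of a GPU run and a CPU-only run gives the same signal as the time output quoted above.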
CUDA
These servers have both CUDA 11 and CUDA 12 installed.
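To see which CUDA toolkits are present and which one is active, you can list the conventional installation directories and query nvcc. This sketch assumes the standard NVIDIA layout of /usr/local/cuda-&lt;version&gt; directories, which has not been verified against these specific servers:

```python
import glob
import shutil
import subprocess

# CUDA toolkits are conventionally installed under /usr/local/cuda-<version>
for path in sorted(glob.glob("/usr/local/cuda-*")):
    print("installed:", path)

# nvcc (if on PATH) reports the currently active toolkit version
if shutil.which("nvcc"):
    print(subprocess.run(["nvcc", "--version"],
                         capture_output=True, text=True).stdout)
else:
    print("nvcc not on PATH; the active toolkit may be selected via PATH/LD_LIBRARY_PATH")
```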