> **Tip**
>
> Use our summer school reservation (CoreNGS-Tue) when submitting batch jobs to get higher priority on the ls6 normal queue today.
>
> Note that the reservation name (CoreNGS-Tue) is different from the TACC allocation/project for this class, which is OTH21164.
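A minimal SLURM batch script sketch showing where the reservation and allocation go (the job name, queue, node/task counts, and time limit here are placeholders, not class requirements):

```shell
#!/bin/bash
#SBATCH -J my_job                   # job name (placeholder)
#SBATCH -p normal                   # ls6 normal queue
#SBATCH -N 1 -n 1                   # 1 node, 1 task (placeholder)
#SBATCH -t 01:00:00                 # max run time (placeholder)
#SBATCH --reservation=CoreNGS-Tue   # class reservation (today only)
#SBATCH -A OTH21164                 # TACC allocation/project

# your commands here
```

Submit it with `sbatch my_job.slurm`; the reservation can also be supplied on the command line, e.g. `sbatch --reservation=CoreNGS-Tue my_job.slurm`.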
...
Here is a comparison of the configurations of ls6 and stampede3. stampede3 is the newer (and larger) cluster, launched in 2024, while ls6, launched in 2022, has fewer but more powerful nodes.
|  | ls6 | stampede3 |
|---|---|---|
| login nodes | 3, 128 cores each | 96 cores each |
| standard compute nodes | 560 AMD Epyc "Milan" nodes | 560 Intel Xeon "Sapphire Rapids" nodes<br>1,060 Intel Xeon Platinum 8160 "Skylake" nodes<br>224 Intel Xeon Platinum 8380 "Ice Lake" nodes |
| GPU nodes | 16 AMD Epyc "Milan" nodes<br>128 cores per node<br>2x NVIDIA A100 GPUs | 20 "Ponte Vecchio" nodes<br>96 cores per node<br>4x Intel GPU Max 1550 GPUs |
| batch system | SLURM | SLURM |
| maximum job run time | 48 hours, normal queue<br>2 hours, development queue | 48 hours on GPU nodes, normal queue<br>24 hours on other nodes, normal queue<br>2 hours, development queue |
...
Note the use of the term virtual core on stampede2. Compute cores are standalone processors – mini CPUs – each of which can execute a separate set of instructions. However, modern cores may also have hyperthreading enabled, where a single core appears as more than one virtual processor to the operating system (see https://en.wikipedia.org/wiki/Hyper-threading). For example, stampede2 nodes had 2 or 4 hyperthreads (HTs) per core, so its KNL nodes, with 4 HTs for each of their 68 physical cores, had a total of 272 virtual cores.
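On any Linux machine (including a compute node) you can check how many virtual cores the OS sees, and whether hyperthreading is enabled, with standard tools:

```shell
# Number of logical (virtual) processors the OS can schedule on
nproc

# Physical vs. virtual breakdown; "Thread(s) per core" greater than 1
# means hyperthreading is enabled (total virtual cores =
# sockets x cores-per-socket x threads-per-core)
lscpu | grep -E 'Thread\(s\) per core|Core\(s\) per socket|Socket\(s\)' || true
```

(The `|| true` just keeps the snippet from failing on minimal systems where `lscpu` is absent.)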
Threading is an operating system scheduling mechanism for allowing one CPU/core to execute multiple computations, seemingly in parallel.
The writer of a program that takes advantage of threading first identifies portions of code that can run in parallel because their computations are independent. The programmer then assigns some number of threads to that work (usually based on a command-line option), using the thread and synchronization constructs of the programming language. An example is the samtools sort -@ N option, which specifies that N threads can be used to sort independent subsets of the input alignments.
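For example, a reasonable default is to match the thread count to the cores actually available (the file names below are placeholders; -@ is samtools' thread-count option):

```shell
# Don't ask for more threads than there are cores to run them
THREADS=$(nproc)
echo "using $THREADS threads"

# Sort a BAM file using $THREADS threads (in.bam / sorted.bam are
# placeholder names); skipped gracefully if samtools isn't installed
if command -v samtools >/dev/null; then
  samtools sort -@ "$THREADS" -o sorted.bam in.bam
fi
```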
If there are multiple cores/CPUs available, the operating system can assign a program thread to each of them for actual parallelism. But only "seeming" (or virtual) parallelism occurs if there are fewer cores than the number of threads specified.
Suppose there's only one core/CPU. The OS assigns program thread A to the core, where it runs until the program performs an I/O operation and is "suspended" while waiting for that operation to complete. During this time, when the CPU would otherwise be doing nothing but waiting on the I/O, the OS assigns program thread B to the CPU and lets it do some work. Threading thus allows more efficient use of existing cores, as long as the program threads being scheduled do some amount of I/O or other operations that cause them to suspend. But trying to run multiple compute-only, no-I/O programs as multiple threads on one CPU just causes "thread thrashing": OS scheduler overhead as threads are repeatedly suspended when their time slices expire, rather than for I/O.
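The benefit of overlapping I/O waits is easy to demonstrate in the shell, with sleep standing in for an I/O operation: two 2-second waits run concurrently finish in about 2 seconds of wall time, not 4.

```shell
start=$(date +%s)

# Start two 2-second "I/O waits" concurrently in the background;
# while one is suspended waiting, the other proceeds
sleep 2 &
sleep 2 &
wait    # block until both background jobs finish

elapsed=$(( $(date +%s) - start ))
echo "elapsed: ${elapsed}s"    # ~2s, vs ~4s if run back-to-back
```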
The analogy is a grocery store where there are 5 customers (threads). If there are 5 checkout lines (cores), each customer (thread) can be serviced in a separate checkout line (core). But if only one checkout line (core) is open, the customers (threads) have to wait in line. To make the analogy more accurate, the checkout clerk would handle part of one customer's checkout, then, while waiting for that customer to find and enter credit card information, handle part of a different customer's checkout.
Hyperthreading is essentially a hardware implementation of this kind of scheduling. Each CPU offers some number of "virtual cores" (hyperthreads) that can "almost" act like separate cores using various hardware tricks. Still, if the work assigned to multiple hyperthreads on a single core does not pause from time to time, thread thrashing will occur.
Software at TACC
...