...
We have actually massively under-utilized Stampede2 in this example: we ran the command using only 8 processors rather than the 68 we have available in our idev session. If we increase to 68 total processors and rerun the analysis, how long do you expect the command to take?
Expand |
---|
title | Modify the previous mapping command to re-run this analysis using all 68 cores. |
---|
|
You need to increase the -p ("processors") option from 8 to 68. Code Block |
---|
language | bash |
---|
title | click here to check your answer |
---|
collapse | true |
---|
| bowtie2 -t -p 68 -x bowtie2/NC_012967.1 -1 SRR030257_1.fastq -2 SRR030257_2.fastq -S bowtie2/SRR030257.sam
|
Try it out and compare the speed of execution by looking at the times listed at the end of each command. Expand |
---|
title | How much faster was it using all 68 processors? |
---|
| 8 processors took a little over 5 minutes; 68 processors took ~1 minute. Can you think of any reasons why it was ~5x faster rather than ~8x faster? (Note: the times here are incorrect, but the principle is the same.) Expand |
---|
| Anytime you use multiprocessing correctly things will go faster, but even if a program can divide the input perfectly among all available processors, and combine the outputs back together perfectly, there is "overhead" in dividing things up and recombining them. These are the types of considerations you may have to make with your data: When is it better to give more processors to a single sample? How fast do I actually need the data to come back?
An additional note from the Stampede2 user manual: while there are 68 cores available, and each core is capable of 4-way hyperthreading (4 processors per core), using all 272 processors is rarely the go-to solution. While I am sure that this is more rigorously and appropriately tested in some other manner, I ran a test using different numbers of processors with the following results:

-p option | time (min:sec) |
---|---|
272 | 1:54 |
136 | 1:13 |
68 | 0:57 |
34 | 1:14 |
17 | 2:25 |
8 | 5:12 |
4 | 9:01 |
2 | 18:13 |
1 | 35:01 |

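
A comparison like the one in the table could be scripted roughly as follows. This is only a sketch, not necessarily how the numbers above were produced: the `DRY_RUN` switch, the `mapping_cmd` helper, and the per-run output name (`SRR030257.p<N>.sam`, used so runs do not overwrite each other) are assumptions made for illustration. The bowtie2 options themselves are the ones used earlier in this tutorial.

```bash
#!/usr/bin/env bash
# Sketch: rerun the same mapping command with several -p values.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}

mapping_cmd() {
    # $1 = number of processors; the per-run output name is a hypothetical convention
    echo "bowtie2 -t -p $1 -x bowtie2/NC_012967.1 -1 SRR030257_1.fastq -2 SRR030257_2.fastq -S bowtie2/SRR030257.p$1.sam"
}

for p in 1 2 4 8 17 34 68 136 272; do
    cmd=$(mapping_cmd "$p")
    echo "$cmd"
    if [ "$DRY_RUN" = "0" ]; then
        # bash's 'time' keyword reports wall-clock time for the whole run
        time $cmd
    fi
done
```

Running it with `DRY_RUN=0` inside an idev session would execute each command in turn; the single-processor run alone took about 35 minutes in the test above, so the full loop is slow.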
Again, while there are almost certainly better ways to benchmark this, two things of note are illustrated here:
- Roughly doubling the number of processors does not necessarily cut the time in half, and while some applications may use hyperthreading on the individual cores appropriately, assuming that a program can or will do so can actually make things take longer.
- Working on your laptop (which likely has at most 4-8 processors available) would significantly increase the amount of time these tutorials take.
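
To put numbers on the first point, the timings in the table above can be converted into speedup (time with 1 processor divided by time with N) and parallel efficiency (speedup divided by N). The snippet below is a minimal sketch using only the values from that table; the variable names and output layout are made up for illustration.

```bash
#!/usr/bin/env bash
# Timings copied from the table above: "-p value" and "min:sec".
timings="1 35:01
2 18:13
4 9:01
8 5:12
17 2:25
34 1:14
68 0:57
136 1:13
272 1:54"

# Convert min:sec to seconds, then report speedup relative to -p 1
# and parallel efficiency (speedup divided by processor count).
report=$(echo "$timings" | awk '
{
    split($2, t, ":")
    secs = t[1] * 60 + t[2]
    if (NR == 1) base = secs   # first row is the single-processor run
    speedup = base / secs
    printf "%d %d %.1f %.0f%%\n", $1, secs, speedup, 100 * speedup / $1
}')

echo "processors seconds speedup efficiency"
echo "$report"
```

On these numbers, efficiency stays above 80% through 34 processors, falls to roughly half at 68, and drops much further once more than 68 are requested, which is where hyperthreading comes into play.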
|
|
|
...