OpenSSL Performance
OpenSSL is widely used to secure communications between servers, and it is an important library in many server stacks. We first look at our sign tests:
Here are the verify results:
Again, we see a simple equation: Zen+ microarchitecture + high clock speeds + 32 cores = great performance.
UnixBench Dhrystone 2 and Whetstone Benchmarks
Some of the longest-running tests at STH are the venerable UnixBench 5.1.3 Dhrystone 2 and Whetstone results. They are certainly aging. However, we constantly get requests for them, and many angry notes when we leave them out. UnixBench is widely used, so we are including it in this data set. Here are the Dhrystone 2 results:
Here are the Whetstone results:
Again, we see absolutely massive multi-threaded figures, while single-threaded figures are still very competitive. If you have users with Intel Xeon E5-1600 V3 era workstations, performance has increased by so much that it can easily warrant an upgrade.
GROMACS STH Small AVX2/AVX-512 Enabled
We have a small GROMACS molecule simulation we previewed in the first AMD EPYC 7601 Linux benchmarks piece. In Linux-Bench2 we are using a “small” test for single and dual socket capable machines. Our medium test is more appropriate for higher-end dual and quad socket machines. Our GROMACS test will use the AVX-512 and AVX2 extensions if available.
Here we see solid performance, well beyond the AMD EPYC 7551P and dual Intel Xeon Gold 5119T setups. At the same time, this result is limited by two things: the Threadripper 2990WX supports AVX2 but not AVX-512, and only two of its four dies have directly attached memory.
Chess Benchmarking
Chess is an interesting use case since it has almost unlimited complexity. Over the years, we have received a number of requests to bring back chess benchmarking. We have been profiling systems and are ready to start sharing results:
Getting this result was surprisingly challenging. This is one of the workloads that likes to assume that you have locally attached memory on each NUMA node. As such, it required a bit of script editing to get this to work on the Threadripper 2990WX. Performance was slightly below the AMD EPYC 7551P, which makes this closer to a worst-case 32-core result, where the Infinity Fabric is being taxed. The on-package connectivity does have an impact. At the same time, the Threadripper 2990WX is still the performance value leader.
Next, we are going to talk about the market positioning of the AMD Ryzen Threadripper 2990WX and then give our final thoughts.
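To illustrate the NUMA assumption described above, here is a minimal Python sketch that finds which NUMA nodes on a Linux system report local memory and which CPUs belong to them. It is not the script edit we made for the chess benchmark. The function names are hypothetical, and it assumes the standard Linux sysfs layout under /sys/devices/system/node, where each node directory has a meminfo file and a cpulist file.

```python
import glob
import os
import re


def parse_mem_total_kb(meminfo_text):
    """Return MemTotal in kB from the text of a node's meminfo file, or 0."""
    match = re.search(r"MemTotal:\s+(\d+)\s*kB", meminfo_text)
    return int(match.group(1)) if match else 0


def parse_cpulist(cpulist_text):
    """Expand a kernel CPU list such as '0-3,8' into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in cpulist_text.strip().split(","):
        if not part:
            continue  # an empty cpulist yields no CPUs
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpus_on_nodes_with_memory(sysfs_root="/sys/devices/system/node"):
    """Collect CPUs that belong to NUMA nodes reporting local memory.

    Assumes the Linux sysfs layout; returns an empty list if no
    node directories are found under sysfs_root.
    """
    cpus = []
    for node_dir in sorted(glob.glob(os.path.join(sysfs_root, "node[0-9]*"))):
        with open(os.path.join(node_dir, "meminfo")) as f:
            mem_kb = parse_mem_total_kb(f.read())
        if mem_kb > 0:
            with open(os.path.join(node_dir, "cpulist")) as f:
                cpus.extend(parse_cpulist(f.read()))
    return cpus
```

On Linux, the resulting list could be passed to os.sched_setaffinity(0, set(cpus)) to keep a process on cores with locally attached memory. Whether that helps or hurts depends on the workload, since it leaves the compute-only dies idle.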
Boxx sure knows how to make this system as expensive and slow as possible.
Just build a similar system with 2 GPUs, DDR4-3200 CL14, three 970 Pros and some spinning disks, and it beats the sh*t out of this system.
I’d bet your build performs better because the Boxx system comes with ECC memory, the fastest of which is 2667/CL19. They’re trading off absolute performance for stability, which many workstation users require.
There is no mention of ECC memory on the Boxx website nor on STH so I wouldn’t know.
What I do know from the Boxx website is that they charge a lot of money for the components used: $2,582 for 96 GB of memory. That is almost $27 per GB, whereas ECC UDIMM DDR4-2667 CL19 costs less than half that per GB.
https://www.boxx.com/products/workstations/t-class
“32GB DDR4-2666 REG ECC”
But yeah, that’s a ridiculous premium over off-the-shelf components just to have a central vendor for business support.
APEXX T3
128GB DDR4-2666MHz
APEXX T3 (1st Gen)
32GB DDR4-2666 REG ECC
Since this thread is about the second gen, there is still no mention of ECC.
Would love to see the BIOS settings of the cheap Taichi board (I use a Fatal1ty myself).
Great stuff!
One question: unlike EPYC, the Threadripper 2990WX has direct access to memory on only two of its four Zen dies. Is the Linux release you’re using optimized to automatically place memory-hungry processes on the relevant cores (those with direct access to memory) when user applications ask for it?
I’ve read elsewhere that Microsoft is going to supply a patch to handle this kind of situation.
Samsung makes unbuffered B-die ECC; anyone building TR workstations really should use that over Micron.
Yep 32 GB UDIMM-ECC: Samsung M391A4G43MB1-CTD