Today we have benchmarks of dual Intel Xeon E5-2695 V4 processors. Each Intel Xeon E5-2695 V4 is an 18 core, 36 thread processor with 45MB of L3 cache. With a 2.1GHz base clock and turbo boost speeds up to 3.3GHz, these chips are a significant upgrade over the 14C/28T E5-2695 V3, which sported only 35MB of L3 cache. We have already published a number of benchmarks across different dual socket Intel Xeon E5-2600 V4 configurations, which we have linked below.
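As a quick sanity check on the generational change, here is a minimal sketch that works out the increase in cores and L3 cache from the figures quoted above. The helper name pct_increase is ours, purely for illustration.

```python
def pct_increase(new: float, old: float) -> float:
    """Percentage increase going from `old` to `new`."""
    return (new - old) / old * 100.0


# Figures quoted in the paragraph above.
V3_CORES, V4_CORES = 14, 18
V3_L3_MB, V4_L3_MB = 35, 45

core_gain = pct_increase(V4_CORES, V3_CORES)
cache_gain = pct_increase(V4_L3_MB, V3_L3_MB)

# L3 cache per core for each generation.
v3_l3_per_core = V3_L3_MB / V3_CORES
v4_l3_per_core = V4_L3_MB / V4_CORES

print(f"Cores: +{core_gain:.1f}%  L3: +{cache_gain:.1f}%")
print(f"L3 per core: V3 {v3_l3_per_core} MB, V4 {v4_l3_per_core} MB")
```

Both cores and L3 cache grow by the same proportion, so L3 per core stays at 2.5MB. The extra cache comes with the extra cores rather than from a larger per-core slice.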
Test Configuration
Our test platform was a standard EATX motherboard upgraded for Xeon E5 V4 support via a simple BIOS upgrade. This is one of the NVMe servers we use in the Fremont colocation that we took offline and upgraded to the V4 part.
- CPU: Intel Xeon E5-2695 V4
- Chassis: HPE ProLiant DL380 Gen9
- Memory: 64GB – 4x Samsung 16GB DDR4 2400MHz ECC RDIMMs
- SSD: 1x Intel DC S3700 400GB, 4x Intel DC P3600 800GB
- Operating System: Ubuntu 14.04.3 LTS
This was an exciting comparison point for us as it is one of the first HPE machines we have published benchmark results for.
The big question is how will the HPE ProLiant DL380 Gen9 fare with two of Intel’s newest processors?
Intel Xeon E5-2695 V4 Benchmarks
For our testing we are using Linux-Bench scripts which help us see cross platform “least common denominator” results. We are using gcc due to its ubiquity as a default compiler. One can see details of each benchmark here. We are likely going to update Linux-Bench in the near future with a few new tests as well as a simpler to use, faster revision, but for now we are using our old Ubuntu 14.04.3 LTS version.
Python Linux 4.4.2 Kernel Compile Benchmark
This is one of the most requested benchmarks for STH over the past few years. We (finally) have a Linux kernel compile benchmark script that is consistent. Expect to see this functionality migrate into Linux-Bench soon (we are just awaiting the parser work on it). The task is simple: we take a standard configuration file and the Linux 4.4.2 kernel from kernel.org, then run make with every thread in the system. We are expressing results in terms of compiles per hour to make the results easier to read.
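To make the compiles per hour metric concrete, here is a minimal sketch of how a timed build could be converted into that figure. It is not the actual STH script. The function names are ours, and it assumes the kernel tree at source_dir has already been unpacked and configured.

```python
import os
import subprocess
import time
from typing import Optional


def compiles_per_hour(elapsed_seconds: float) -> float:
    """Convert the wall-clock time of one build into compiles per hour."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return 3600.0 / elapsed_seconds


def time_kernel_build(source_dir: str, jobs: Optional[int] = None) -> float:
    """Run `make -jN` in an already-configured kernel tree.

    `jobs` defaults to the number of logical CPUs the OS reports, which is
    our reading of "make with every thread in the system".
    Returns the elapsed wall-clock time in seconds.
    """
    jobs = jobs or os.cpu_count() or 1
    start = time.perf_counter()
    subprocess.run(
        ["make", f"-j{jobs}"],
        cwd=source_dir,
        check=True,                 # raise if the build fails
        stdout=subprocess.DEVNULL,  # keep console output out of the way
    )
    return time.perf_counter() - start


# Hypothetical usage (the path is a placeholder, not from the article):
# seconds = time_kernel_build("/usr/src/linux-4.4.2")
# print(f"{compiles_per_hour(seconds):.2f} compiles per hour")
```

A build that takes 120 seconds would work out to 30 compiles per hour. That number is only an illustration of the conversion, not a measured result. A higher figure is better, which is why this reads more easily on a chart than raw build times.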
As you can see, the dual Xeon E5-2695 V4 falls right in-line with what we would expect from this chip given its specs.
c-ray 1.1 Performance
We have been using c-ray for our performance testing for years now. It is a ray tracing benchmark that is extremely popular to show differences in processors under multi-threaded workloads.
Our dual Xeon E5-2695 V4 configuration is clearly showing just how much of an improvement we are seeing even over Sandy Bridge generation parts. This is due to the greatly increased core counts with Broadwell-EP.
7-zip Compression
7-zip is a widely used compression/ decompression program that works cross platform. We started using the program during our early days with Windows testing. It is now part of Linux-Bench.
Compression is a major operation we see in today’s workloads so we load the CPU with compression and decompression tasks. As one can see, performance is again excellent.
NAMD Performance
NAMD is a molecular modeling benchmark developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. More information on the benchmark can be found here.
Somewhat surprisingly, we find the Intel Xeon E5-2695 V4 performs better than several of the older configurations we have tested. Performance was very good on this complex benchmark.
Sysbench CPU test
Sysbench is another one of those widely used Linux benchmarks. We specifically are using the CPU test, not the OLTP test that we use for some storage testing.
We received some feedback that the single-threaded results we had were all within a fairly tight band while the multi-threaded results showed a very high delta. We are removing those results; however, you can compare results using Linux-Bench‘s interactive charting feature.
OpenSSL Performance
OpenSSL is widely used to secure communications between servers. It is an important library in many server stacks. We first look at our sign tests:
Moving to the verify results:
OpenSSL is a hot topic right now and you can see the impact of high core counts on both the sign and verify sides of the benchmark.
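For readers who want to try something similar on their own hardware, here is a minimal sketch built on the standard openssl speed command. It is not the Linux-Bench invocation. The choice of rsa2048 and the use of one worker per logical CPU are our assumptions, and the function names are ours.

```python
import os
import subprocess
from typing import List, Optional


def openssl_speed_cmd(algorithm: str = "rsa2048",
                      workers: Optional[int] = None) -> List[str]:
    """Build an `openssl speed` command line.

    `-multi N` asks openssl to run N benchmark processes in parallel.
    `workers` defaults to the logical CPU count reported by the OS.
    """
    workers = workers or os.cpu_count() or 1
    return ["openssl", "speed", "-multi", str(workers), algorithm]


def run_openssl_speed(algorithm: str = "rsa2048",
                      workers: Optional[int] = None) -> str:
    """Run the benchmark and return openssl's raw text report."""
    result = subprocess.run(
        openssl_speed_cmd(algorithm, workers),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Hypothetical usage on a 72 thread system (2x 18C/36T):
# print(run_openssl_speed("rsa2048", workers=72))
```

For RSA, the report includes both sign and verify rates, which correspond to the two views discussed above.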
UnixBench Dhrystone 2 and Whetstone Benchmarks
Of course, these chips are not meant for heavy compute but we pick out the UnixBench 5.1.3 Dhrystone 2 and Whetstone results to show some of the raw performance they are capable of. UnixBench is widely used so it is a good comparison point.
Here are the single threaded workloads:
As you can see, the results are relatively tight on our single threaded benchmarks likely due to the fact that compilers are good these days and these are not the most demanding tests out there.
Now the E5 V4’s sweet spot, the multi-threaded workloads:
Here we can see another strong showing. It is worth noting that each of the two processors has a tray price of around $2400. That figure is quite a bit more than some of these other configurations.
Conclusion
Again we see the Intel Xeon E5-2600 V4 trend continue with the E5-2695 V4. The chips are incrementally faster than their previous generation counterparts, but not by a large enough margin to make a compelling case for upgrading from V3 to V4. On the other hand, the E5-2695 V4 is about 60% more expensive, per chip, than the E5-2670 V1, yet offers a 2x-3x performance improvement. We are starting to see where consolidating older, lower power E5 machines can bring a great benefit.
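The consolidation argument can be put into rough numbers. Here is a minimal sketch that uses only the ratios quoted above (about 1.6x the per-chip price, 2x-3x the performance). It ignores power, memory, chassis and licensing costs, and the function name is ours.

```python
def perf_per_dollar_gain(perf_ratio: float, price_ratio: float) -> float:
    """Relative performance per dollar of the new chip versus the old one.

    A value above 1.0 means the new chip delivers more performance
    for each dollar of CPU price.
    """
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    return perf_ratio / price_ratio


PRICE_RATIO = 1.6  # "about 60% more expensive, per chip"

low = perf_per_dollar_gain(2.0, PRICE_RATIO)   # conservative end of 2x-3x
high = perf_per_dollar_gain(3.0, PRICE_RATIO)  # optimistic end of 2x-3x

print(f"Performance per CPU dollar: {low:.2f}x to {high:.2f}x the E5-2670 V1")
```

On these ratios, the E5-2695 V4 delivers roughly 1.25x to just under 1.9x the performance per CPU dollar of the E5-2670 V1. The same 2x-3x figure suggests one dual socket V4 server could stand in for two to three comparable V1 servers on workloads that scale this way.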
You can find more STH Xeon E5 V4 coverage here:
- Intel Xeon E5-2600 V4 Line-up and Architectural Overview
- Intel Xeon E5-2699 V4 Benchmarks
- Intel Xeon E5-2698 V4 Benchmarks
- Intel Xeon E5-2697 V4 Benchmarks
- Intel DC D3700 and D3600 dual port NVMe
- Intel DC P3520 and DC P3320 NVMe SSDs
Subscribe to STH to get the latest benchmarks and platform reviews as they are published. We have a huge backlog of content coming.
Is there any chance we can get the E5 2697 V2 performance in the Linux Compile chart? It’s the only chart that it is missing from. The reason I ask is that, comparing apples to apples, it’s interesting to see the performance differences between the three generations of the 2697 SKU.