Intel Atom C3958 16-Core Top End Embedded QAT Linux Benchmarks and Review

Gigabyte MA10 ST0 Top

Pivoting slightly from our focus on high-end, high-power server CPUs, we have Intel Atom C3958 performance benchmarks under Linux. We have already published benchmarks on the Intel Atom C3338, C3558, and C3955, which are instructive points of reference within the Intel Atom C3000 series. While the Intel Atom C3958 does not have the clock speeds to match the Intel Atom C3955, it still has 16 cores. What it can claim is that it is the highest-bin QuickAssist part in the current Intel Atom C3000 series lineup.

A Quick Word on Intel QuickAssist

In 2016 we published a few articles on using QuickAssist with OpenSSL and for 40GbE VPN acceleration. Since then, Intel has launched a 100Gbps QAT version and has built QAT into the burgeoning Intel Xeon SP Lewisburg PCH options. We will have some cool QAT results soon. For now, a few notes:

  1. Intel Atom C3000 and Intel Atom C2000 QAT have different features and therefore do not use the same driver version.
  2. You do need Intel Atom C3xxx compatible QAT drivers.
  3. The Intel QAT ecosystem is significantly stronger than it was in 2016. We speculate this is due to carrier network adoption, which has made the ecosystem more mature.

QuickAssist Technology acceleration still requires some effort because Intel is not adding it to every chip. Until Intel does so, we expect most software to require an additional step (or much more) to get QAT working.
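As a small illustration of that additional step, here is a minimal sketch of checking whether an OpenSSL installation lists a QAT engine before relying on it. The helper name has_qat_engine is hypothetical, and the engine ids "qat" and "qatengine" are assumptions based on Intel's OpenSSL QAT engine releases; depending on how the engine is installed, it may not appear in the list until it is configured.

```shell
#!/bin/sh
# Hypothetical helper: reads `openssl engine` output on stdin and succeeds
# if a line starts with a QAT engine id.
# Assumption: the engine id is "qat" (older releases) or "qatengine" (newer).
has_qat_engine() {
    grep -E -q '^\((qat|qatengine)\)'
}

# Usage on a live system:
#   if openssl engine 2>/dev/null | has_qat_engine; then
#       echo "QAT engine listed"
#   else
#       echo "no QAT engine listed, software crypto only"
#   fi
```

Without a listed engine, OpenSSL benchmarks run in software only, even on a QAT-equipped part.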

Intel Atom C3xxx QAT Device

We already have QAT working on the Intel Atom C3958, and it is enumerated as a different device type than other QAT solutions, as can be seen in the screenshot.
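For readers who want to check enumeration on their own systems, here is a rough sketch that maps Intel QAT PCI device IDs to the Linux kernel driver expected to bind them. The function names are hypothetical, and the ID-to-driver table is an assumption taken from the upstream kernel QAT drivers rather than from our testing, so verify it against your kernel.

```shell
#!/bin/sh
# Hypothetical helper: map an Intel QAT PCI device ID to a kernel driver name.
# Assumption: IDs follow the upstream Linux QAT drivers (physical functions only).
qat_driver_for_id() {
    case "$1" in
        19e2) echo "qat_c3xxx" ;;     # Atom C3000 series QAT
        37c8) echo "qat_c62x" ;;      # Xeon SP Lewisburg PCH QAT
        0435) echo "qat_dh895xcc" ;;  # DH895xCC add-in adapter
        *)    echo "unknown" ;;
    esac
}

# Read `lspci -nn` style lines on stdin and print "<pci-address> <driver>"
# for every Intel (vendor 8086) device whose ID is in the table above.
list_qat_devices() {
    sed -n 's/^\([0-9a-f:.]*\) .*\[8086:\([0-9a-f]\{4\}\)].*/\1 \2/p' |
    while read -r addr id; do
        drv=$(qat_driver_for_id "$id")
        if [ "$drv" != "unknown" ]; then
            echo "$addr $drv"
        fi
    done
}

# Usage on a live system:
#   lspci -nn | list_qat_devices
```

Parsing text from stdin instead of calling lspci directly keeps the sketch usable on saved output from another machine.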

Gigabyte MA10 ST0 IQAT In BIOS

QAT can be disabled in the BIOS if desired. We wish Intel would add QAT to every chip so that using it became automatic in applications; that would help QAT support considerably.

At the same time, there are only three reasons you would get a C3958 over a C3955: QAT support, an extended lifecycle, or a specific platform you want to use that does not have a C3955 option. That makes QAT support a significant piece of the puzzle.

Intel Atom C3958 Key Stats

Key stats for the Intel Atom C3958: 16 cores / 16 threads at 2.0GHz. Unlike the C3955, the C3958 does not feature turbo boost, so 2.0GHz is also the maximum speed. The CPU has a 31W TDP. It also features a full 20 high-speed I/O lanes and 4x 10GbE, making it top-bin in terms of features for QuickAssist parts. These chips are not socketed, so end-customer pricing will include a motherboard at a minimum. The CPU alone has a 1K unit tray price of $449. Virtualization features such as VT-d and SR-IOV are supported on this generation. Here is the ARK page for the CPU.

Also, for our readers who want to see feature flags, here is the Linux lscpu output:

Intel Atom C3958 Lscpu
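If you prefer to check specific feature flags rather than read the full list, a minimal sketch follows. The helper name has_cpu_flags is hypothetical, and the flag names in the usage line (aes, sha_ni, vmx) are only examples of what one might look for, not a statement of what the screenshot shows.

```shell
#!/bin/sh
# Hypothetical helper: read a space-separated CPU flags line on stdin and
# report, for each flag named as an argument, whether it is present.
has_cpu_flags() {
    flags=" $(cat) "   # pad with spaces so whole-word matching works
    for want in "$@"; do
        case "$flags" in
            *" $want "*) echo "$want: yes" ;;
            *)           echo "$want: no" ;;
        esac
    done
}

# Usage on a live Linux system:
#   grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2 | has_cpu_flags aes sha_ni vmx
```

Matching on space-delimited whole words avoids false positives where one flag name is a substring of another.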

Test Configuration

Our test configuration is very similar to what we used for our Intel Atom C2000 series reviews.

  • Motherboard: Gigabyte MA10-ST0
  • CPU: Intel Atom C3958
  • RAM: 4x 16GB DDR4-2400 RDIMMs (Micron)
  • SSD: Intel DC S3710 400GB
  • Boot device: Intel DC S3700 200GB

We are using the Gigabyte MA10-ST0 for our test platform. This is an absolutely stunning storage server solution with 16x SATA ports and onboard 10Gb SFP+ networking.

Gigabyte MA10 ST0 Top

The board comes with onboard 32GB eMMC storage from Kingston. For an embedded system this is an awesome feature. On this platform we expect the eMMC to be used as a boot device rather than a more expensive SATA DOM. The four SFF-8087 ports mean that using a SATA DOM is not easy on this platform in any case, but they provide easy connectivity to storage backplanes.

We will have a full review of the Gigabyte MA10-ST0 soon, but for those wondering, the maximum power consumption we have seen with 2x 10Gb SFP+ links (SR optics) and 2x 1GbE links is around 61W. We will publish formal figures with our platform reviews, but this is certainly a solid low-power platform for the performance and connectivity you are getting.

11 COMMENTS

  1. I’m very confused. Where are the test results of benchmarks that actually use QAT here? I was expecting something in the SSL benchmarks but according to your numbers the non-QAT C3955 is faster than this chip that has QAT?

    It would be nice to see an article that focuses heavily on the QAT feature of this chip actually use that feature in some tests.

  2. @Don,

    In 2016 we published a few articles on using QuickAssist with OpenSSL and for 40GbE VPN acceleration. Since then, Intel has launched a 100Gbps QAT version and has built QAT into the burgeoning Intel Xeon SP Lewisburg PCH options. We will have some cool QAT results soon. For now, a few notes:

  3. Something that I saw when I went to Gigabyte’s website… using the PCIe slot disables 2 of the 4 SFF-8087 ports.

    I like the plethora of storage, but I’m not sure I have a need of the QAT… if I could find a use-case, this might be interesting. And I’m not sure I like just the SFP+ ports since I haven’t tried running FO wires thru residential walls yet. A copper option would be nice.

  4. How board. Price? Full review?

    All the commentary around QA was helpful. I don’t know when STH started doing it but I like this new direction.

    Your comparison dataset is real useful since that’s all the competition almost. I can extrapolate more data points.

  5. @Eddie – If you want copper then get a 10Gb copper SFP+ transceiver.

    I would trade the QAT for a GPU on it for transcoding any day but this isn’t what these things are aimed at.

  6. @Goose – I’ve seen those but haven’t tried to figure out what they cost yet. Depending on the mobo manufacturer, C3xxx SKUs can offer either, both or neither of the 10GbE standards. Take the Supermicro they reviewed earlier, I think it had 2x of each. Of course that one only offered 12 SATA ports vs. the 16 on this one.

  7. Please redo the openssl tests with QAT enabled. To check if the QAT engine is available, run:
    openssl engine
    Testing:
    openssl speed -engine qat -elapsed -multi 2 -evp aes-128-cbc-hmac-sha1
    (more info in 01.org/intel-quickassist-technology)

  8. I noticed every single motherboard doesn’t offer RAID capability as opposed to Intel C236/238 chipsets which do offer at least some basic ones.

    That means additional investment and +10/15W of power consumption, which in the case of RAID 1 (which is the only thing I need) makes it really questionable why I should pick this platform instead of a 45W Xeon one. For 5W of power savings in the best-case scenario?

    Really? Can someone elaborate on how you are handling RAID, especially RAID 1 and RAID 10, with say 8 disks on a low-power server? Purchasing a £400 RAID card and using an additional *precious* PCIe slot (which are rare on mini-ITX) while consuming more power, when ALL THAT comes for free with a Xeon C236 solution? I don’t see the point.

    Please elaborate!

  9. Karol – I would suspect very few Atom C3958 users are going to use hardware RAID. Even fewer will use chipset RAID. Most applications will use software RAID which is great on this platform.
