Today is a big day: Amazon AWS launched new deep learning / AI focused instances featuring the NVIDIA Tesla V100 GPU. The Tesla V100 carries NVIDIA's latest technology, including the new Tensor Cores. We highlighted the Tensor Core benefits and the NVIDIA Tesla V100 architecture in our NVIDIA Tesla V100 Volta Update at Hot Chips 2017. Suffice it to say, these are monster chips and are highly anticipated by the deep learning / AI crowd. At the same time, the V100 is very costly, so if you only train models infrequently, a server at the ~$150K rumored going price for an 8x NVIDIA Tesla V100 system may be less cost-effective than running the workload in the cloud.
Each NVIDIA Tesla V100 Volta-generation GPU has 5,120 CUDA Cores and 640 Tensor Cores. That works out to 125 TFLOPS of mixed-precision Tensor Core performance, 15.7 TFLOPS of single-precision (FP32) performance, and 7.8 TFLOPS of double-precision (FP64) performance.
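For the curious, those peak figures check out with simple arithmetic. Here is a minimal sketch, assuming the GV100 boost clock of roughly 1,530 MHz (a figure not mentioned above) and the published per-clock throughput of each core type:

```python
# Back-of-the-envelope check of the V100 peak numbers above.
# Assumption (not from this article): ~1,530 MHz boost clock.
boost_clock_hz = 1.530e9
cuda_cores = 5_120
tensor_cores = 640

# FP32: one FMA (2 FLOPs) per CUDA core per clock
fp32_tflops = cuda_cores * 2 * boost_clock_hz / 1e12        # ~15.7 TFLOPS
# FP64: GV100 runs FP64 at half the FP32 rate
fp64_tflops = fp32_tflops / 2                               # ~7.8 TFLOPS
# Tensor Cores: a 4x4x4 matrix FMA = 64 MACs = 128 FLOPs per core per clock
tensor_tflops = tensor_cores * 128 * boost_clock_hz / 1e12  # ~125 TFLOPS

print(f"FP32:   {fp32_tflops:.1f} TFLOPS")
print(f"FP64:   {fp64_tflops:.1f} TFLOPS")
print(f"Tensor: {tensor_tflops:.1f} TFLOPS")
```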
The Amazon AWS EC2 P3 instances also include NVLink for ultra-fast GPU-to-GPU communication. At STH, we already have figures for 8-way NVLink-enabled Tesla P100 and V100 systems that we will be publishing soon.
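As a quick illustration (not from the AWS announcement), a few lines of PyTorch can confirm that the GPUs on a multi-GPU P3 instance can see each other as peers. Note that peer access alone does not distinguish NVLink from PCIe, so treat this as a connectivity check only:

```python
# Minimal sketch: check GPU peer-to-peer access on a multi-GPU instance
# such as p3.8xlarge or p3.16xlarge.
import torch

n = torch.cuda.device_count()
print(f"Visible GPUs: {n}")
for i in range(n):
    for j in range(n):
        if i != j:
            ok = torch.cuda.can_device_access_peer(i, j)
            print(f"GPU {i} -> GPU {j}: peer access {'yes' if ok else 'no'}")
```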
Amazon AWS EC2 P3 Instances Specs and Pricing
Here is a table with the three instance types and hourly on-demand pricing based on the N. Virginia (us-east-1) price list.
| Instance Size | Tesla V100 GPUs | GPU Peer to Peer | GPU Memory (GB) | vCPUs | Memory (GB) | Network Bandwidth | EBS Bandwidth | $/hour |
|---|---|---|---|---|---|---|---|---|
| p3.2xlarge | 1 | N/A | 16 | 8 | 61 | Up to 10 Gbps | 1.5 Gbps | $3.06 |
| p3.8xlarge | 4 | NVLink | 64 | 32 | 244 | 10 Gbps | 7 Gbps | $12.24 |
| p3.16xlarge | 8 | NVLink | 128 | 64 | 488 | 25 Gbps | 14 Gbps | $24.48 |
Note that not every availability zone has the P3 instances at the time of publication.
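If you want to try one, here is a minimal sketch of launching a p3.2xlarge with boto3. The AMI ID and key pair name are hypothetical placeholders you would replace with a GPU-ready AMI (e.g. a Deep Learning AMI) and a key pair available in your region:

```python
# Minimal sketch: launch a single p3.2xlarge in N. Virginia with boto3.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-XXXXXXXX",     # hypothetical placeholder: use a CUDA-ready AMI
    InstanceType="p3.2xlarge",  # 1x Tesla V100, 8 vCPUs, 61 GB RAM
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",      # hypothetical key pair name
)
print(response["Instances"][0]["InstanceId"])
```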
If you do need AWS EC2 P3 instances on a regular basis, a 12-month all-upfront reserved term is only $136,601, which is an absolute bargain compared to our estimate of just under $160,000 for an 8x Tesla V100 server plus power, cooling, and networking.
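As a sanity check on that claim, a few lines of arithmetic compare the on-demand, reserved, and on-premises numbers above:

```python
# Rough comparison of the p3.16xlarge figures cited above.
hours_per_year = 365 * 24               # 8,760 hours
on_demand = 24.48 * hours_per_year      # ~$214,445 for a year of 24/7 on-demand
reserved = 136_601                      # 12-month all-upfront reserved term
server_estimate = 160_000               # STH estimate: 8x V100 server plus
                                        # power, cooling, and networking

print(f"On-demand, 24/7 for a year: ${on_demand:,.0f}")
print(f"Reserved (all upfront):     ${reserved:,}")
print(f"On-prem server estimate:    ${server_estimate:,}")
```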
Final Words
Make no mistake, NVIDIA is shipping the Tesla V100. As AWS launches the p3.2xlarge, p3.8xlarge, and p3.16xlarge instances with 1, 4, and 8 GPUs respectively, we expect Google and others to follow suit in short order.
The bigger implication here is that NVIDIA has been shipping Volta for some time now, and we are well past the era when consumers got a new architecture first. We also wonder at what point NVIDIA will let its architectures diverge. The Tensor Core is a big deal from what we are seeing on the deep learning side, but it is less useful in a gaming scenario.
You can read more about the AWS announcement here.