Lately I’ve been playing with different GPU solutions in the cloud: Amazon, Azure and Softlayer. Here are some of my experiences and thoughts.
I believe GPU in the cloud is going to be a big thing, mostly for machine learning but also for graphics. Here is why.
- You pay only for what you use, and it is easy to scale up and down.
- No need for hardware refreshes any more; cloud providers will (hopefully) offer the latest technology within a reasonable timeframe.
- You can share resources between users in your company or with partners and contractors.
- Enable a global workforce without having to have datacenters in all regions.
I’ve looked at three cloud providers, Amazon, Azure and Softlayer. What can they offer? I will focus on GPU for graphics only.
Amazon EC2 G series
Amazon has two options for graphics. The P series is for compute; only the G series is for graphics.
You can have either one or four Nvidia GRID K520 GPUs.
Looking at the specs of the K520, this GPU is clearly made for cloud gaming, so be careful if you want to use it for CAD apps. The drivers do not provide the full Quadro functionality that many CAD apps rely on.
You should also be aware that if you plan to run OpenGL apps on XenApp or RDSH with this GPU, this is no longer supported in newer Nvidia drivers (after build 340): https://support.citrix.com/article/CTX202066
For me this limits the use case for Graphical workloads in the cloud. You can probably still use it for some apps, but make sure you test it first.
Since it was designed for gaming, I tried running some games on it. It did not work well with newer games (Battlefield 1), because they require the newer Nvidia 375 drivers, which do not support this type of GPU; I was only able to update to the 367 drivers. I was able to run older games like Crysis 3, but with 40 ms latency to the nearest cloud datacenter, the latency was too high to play first-person shooters.
I also tried World of Tanks, which was actually quite all right to play even at 40 ms latency.
Just for fun I ran the SteamVR benchmark, which averaged about 87 FPS; that is just not good enough for VR games. The Unigine Heaven benchmark averaged about 49 FPS with peaks at 99 FPS. My conclusion for cloud gaming is that you need very high and stable bandwidth (30-40 Mbit peaks) and very low latency to the datacenter. So if you live in a city near a cloud datacenter, it can work nicely. I still think the K520 is an old GPU (from 2013) with limited driver support from Nvidia, so I hope they will provide better GPUs in the future.
While I was writing this blog post, Amazon announced a new GPU offering, Elastic GPU, which is GPU sharing. I don’t know what kind of GPU this is, but I hope it is based on the Nvidia GRID M60. From what I understand, it is software-based sharing, not hardware-based as with vGPU. From the specs I can see there is an 8GB framebuffer option, which Amazon does not have in its GPUs today, so evidently they will soon get new GPU types.
Looking at the Amazon datacenter map, there are still a lot of locations that are far away from the datacenters. I will write a bit more about latency limitations for graphics in the cloud later.
Since December 1st, 2016, Azure has offered several GPU options in the cloud: the N series.
There are the NV series and the NC series. NC is for compute and uses an Nvidia K80 GPU. Let’s look at the NV series.
NV is based on the Nvidia M60. Since Azure runs on Hyper-V, there is no vGPU support, only Discrete Device Assignment, which gives you pass-through access to a physical GPU. Be aware that most GPU applications will only use one GPU even when you have multiple GPUs assigned. You have to consider this when selecting NV12 or NV24. You can still benefit from these larger instances if you need more CPU or memory for your apps. When benchmarking the Azure NV6 with the Unigine Heaven benchmark I got an average of 88 FPS, which is 1.8 times better than the Amazon G2.
With no vGPU support you have to decide whether to run in VDI mode or RDSH mode. In VDI mode you get an entire M60 per user. In RDSH mode you can share the M60 between multiple users. To figure out which kind of users you have, try my GPUSizer tool. Maybe a mix will work well: light/medium users on RDSH/XenApp and heavy users on VDI/XenDesktop. With VDI you can still share the GPU; it’s called time sharing. You turn off the VM when users are not at work.
Running 24 hours a day, an NV6 will cost you 540 USD per month in the US and 1,200 USD per month in Europe. It is interesting to see that pricing is about 2x in Europe vs the US, and 2.5x in Asia vs the US. Most people only work 8 hours a day, 5 days a week, on average about 22 days a month, which comes to about 280 USD per user per month in Europe and 128 USD per user per month in the US. Most users also attend meetings, travel and do other things while working, so it may be even less, but we have to add some time for the machine to stay idle before we turn it off. Also remember that buying a physical workstation in this class can cost around 4,000-6,000 USD; now users can access a powerful cloud workstation from a cheap thin client.
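The time-sharing arithmetic above can be sketched in a few lines of Python. This is only a rough illustration: it assumes the VM is billed pro rata by the hour and that the quoted monthly price covers a 31-day month running around the clock. The function name and the idle_hours_per_day parameter are my own inventions for this example, not part of any Azure tooling.

```python
HOURS_IN_FULL_MONTH = 24 * 31  # assumption: the quoted monthly price covers a 31-day month


def time_shared_cost(monthly_price_usd, hours_per_day=8, days_per_month=22,
                     idle_hours_per_day=0.0):
    """Estimate the monthly cost of a VM that only runs during working hours.

    idle_hours_per_day is extra time the VM stays on before it is shut down.
    """
    hourly_rate = monthly_price_usd / HOURS_IN_FULL_MONTH
    billed_hours = (hours_per_day + idle_hours_per_day) * days_per_month
    return hourly_rate * billed_hours


if __name__ == "__main__":
    # Always-on NV6 prices quoted in this post (USD per month).
    for region, price in (("US", 540), ("Europe", 1200)):
        print(f"{region}: about {time_shared_cost(price):.0f} USD per user per month")
```

With the default 8 hours on 22 days this gives roughly 128 USD for the US and 284 USD for Europe, close to the rounded figures in the text. Raising idle_hours_per_day shows how quickly a slow shutdown policy eats into the saving.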
In the RDSH/XenApp scenario, the virtual machines have to stay on all the time, maybe with less capacity during off-peak hours, but now you can share each one between multiple users. So with US pricing, 10 medium users or 20 light users will cost around 54 or 27 USD per user per month. You have to remember that RDSH/XenApp has some limitations vs XenDesktop. Some apps do not work with XenApp/RDSH and some are not supported. For those that do, remember that XenApp requires more bandwidth than XenDesktop for graphical applications, and it does not support H.264 hardware encoding, which can reduce latency by up to 51 ms (read more about why here). So for cloud workloads, where you may have higher latency to the nearest datacenter, H.264 hardware encoding and decoding is important. But this is only available with XenDesktop (Workstation VDA) (and VMware Horizon, but that does not support Azure).
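The per-user cost of a shared, always-on RDSH/XenApp machine is a simple division, sketched below. The function name is hypothetical, and the user densities (10 medium or 20 light users per NV6) are the rough estimates from this post, not measured limits.

```python
def rdsh_cost_per_user(monthly_price_usd, users_per_vm):
    """Cost per user when one always-on VM is shared by several RDSH/XenApp users."""
    if users_per_vm <= 0:
        raise ValueError("users_per_vm must be positive")
    return monthly_price_usd / users_per_vm


if __name__ == "__main__":
    nv6_us_monthly = 540  # USD, the always-on NV6 price in the US quoted in this post
    for label, users in (("medium", 10), ("light", 20)):
        cost = rdsh_cost_per_user(nv6_us_monthly, users)
        print(f"{users} {label} users: {cost:.0f} USD per user per month")
```

Comparing this with the time-shared VDI estimate per user gives a quick feel for where the break-even sits for a given user mix.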
You must install the Workstation VDA with the /enable_HDX_3d_pro and /servervdi options: https://docs.citrix.com/en-us/xenapp-and-xendesktop/7-9/install-configure/server-vdi.html
You must also purchase a license for Nvidia GRID if you use it with RDSH and XenApp in the cloud.
Here are some videos of XenApp running on Azure N-series
This one is quite impressive: a 4K YouTube video played over XenApp
This is Autodesk 3DStudio, which runs quite OK on XenApp
The next one shows the Unigine benchmark. It produces more than 100 FPS, but XenApp is only able to deliver about 20 FPS. This would probably be better with XenDesktop, but I have to try that later.
So how does RemoteFX on Server 2016 compare to Citrix HDX? Quite similar in this case, actually.
All videos were recorded on a cabled network with high-speed fiber internet and 37-39 ms latency to the Azure datacenter. No optimizers on the network.
Looking at the Azure map, GPU instances are only available in South Central US, East US, West Europe and South East Asia. Check the latency from your office locations to the nearest datacenter to find out where to put your graphical virtual machines. You can use sites like https://wondernetwork.com/pings and http://www.azurespeed.com/Azure/Latency to get an idea of latency. How much latency is acceptable depends on your applications. For gaming, even 1 ms of latency may not be good enough. For some CAD apps like AutoCAD, there is a remote cursor, so even on a LAN (1 ms) the user can get annoyed by the latency.
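If you would rather measure from your own office than rely on a web page, a rough sketch like the one below times TCP connections to a host in the region you are evaluating. It uses only the Python standard library. The host name is a placeholder you need to replace, and TCP connect time is only an approximation of the round trip a remote display protocol will actually see.

```python
import socket
import time


def tcp_connect_ms(address, port=443, timeout=3.0):
    """Return the time in milliseconds needed to open a TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((address, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def probe(host, samples=5, port=443):
    """Collect several connect times in ms; failed attempts are skipped."""
    results = []
    try:
        address = socket.gethostbyname(host)  # resolve once so DNS time is not measured
    except OSError:
        return results
    for _ in range(samples):
        try:
            results.append(tcp_connect_ms(address, port))
        except OSError:
            pass
        time.sleep(0.2)  # small pause between attempts
    return results


if __name__ == "__main__":
    # Placeholder host: replace it with an endpoint in the region you want to test.
    host = "example.com"
    times = probe(host)
    if times:
        print(f"{host}: min {min(times):.1f} ms, max {max(times):.1f} ms")
    else:
        print(f"{host}: no successful connections")
```

Run it from each office location during working hours, since latency at night can look much better than what users will experience.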
I’ve also experienced that using graphical applications over a WAN is very sensitive to network jitter. So always consider a dedicated network connection or run a WAN optimizer to ensure quality of service. For Azure you can get ExpressRoute, but this is quite expensive for smaller companies. You should also note that working with graphical applications over a poor Wi-Fi connection adds a lot more latency to the equation, so I always recommend working on a cabled network if possible.
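To put a number on jitter, one simple approach is to look at how much consecutive latency samples differ from each other. The sketch below does that; the function name is hypothetical, the sample values are made up for illustration (they are not measurements from this post), and the mean of consecutive differences is just one common way to express jitter.

```python
import statistics


def latency_summary(samples_ms):
    """Summarise latency samples: mean, spread, and mean change between neighbours."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    deltas = [abs(b - a) for a, b in zip(samples_ms, samples_ms[1:])]
    return {
        "mean_ms": statistics.mean(samples_ms),
        "stdev_ms": statistics.stdev(samples_ms),
        "jitter_ms": statistics.mean(deltas),
    }


if __name__ == "__main__":
    # Made-up samples: a steady link around 38 ms with one spike.
    samples = [38, 37, 39, 52, 38]
    for name, value in latency_summary(samples).items():
        print(f"{name}: {value:.1f}")
```

For these made-up samples the mean is 40.8 ms, which looks fine on its own, while the jitter of 7.5 ms reveals the spike that a user would feel as a stutter.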
IBM Softlayer is a good option for graphical workloads in the cloud. They offer Nvidia GRID K2 and M60 on dedicated physical servers. For the K2 you can also get per-hour pricing, but for the M60 you can only get it per month. The nice thing with this is that you can install your own hypervisor like VMware or XenServer, which will give you full support for Nvidia vGPU. In other words, you manage it pretty much like you would in your own datacenter, only you pay per month. It means there’s more infrastructure to configure and manage, but you get more flexibility with hardware options.
Here is what they offer with K2
Here is what they offer with M60
Note that prices depend on how you configure the hardware: how many CPUs, how much memory and what kind of disks you need.
Here is the Softlayer datacenter map. This is the only provider that has a datacenter in Scandinavia (where I work), hopefully soon with GPUs too.
Graphics in the cloud is a smart way to collaborate globally at optimal cost. You have to know your limits in apps, datacenters, latency, and capacity to get it right. If you need any help implementing GPU in the cloud or in your own datacenter, you can hire me to assess and design the solution. You can also find a lot of useful help and tools on this blog. Use the contact form if you want to contact me.