Graphics Card (GPU) based render engines such as Redshift3D, Octane or VRAY-RT have matured quite a bit and are starting to overtake CPU-based Render-Engines.
But what hardware gives the best bang for the buck, and what do you have to keep in mind when building your GPU-Workstation compared to a CPU Rendering Workstation?
Building a 3D Modeling and CPU Rendering Workstation can be somewhat straightforward, but highly optimizing for GPU Rendering is a whole other story.
So what are the most affordable and best PC-Parts for rendering with Octane, Redshift3D, VRAY-RT or other GPU Render Engines?
Let’s take a look:
Best Hardware for GPU Rendering
Since GPU-Render Engines use the GPU to render, technically you should go for a max-core-clock CPU like the Intel i9 9900K that clocks at 3.6GHz (5GHz Turbo) or the AMD Ryzen 9 3900X that clocks at 3.8GHz (4.6GHz Turbo).
That said though, there is another factor to consider when choosing a CPU: PCIe-Lanes.
GPUs are attached to the CPU via PCIe-Lanes on the motherboard. Different CPUs support different amounts of PCIe-Lanes and Top-tier GPUs usually need 16x PCIe 3.0 Lanes to run at full performance.
The i9 9900K and Ryzen 9 3900X have 16 GPU<->CPU PCIe-Lanes, meaning you could use only one GPU at full speed with these types of CPUs.
If you want more than one GPU at full speed, you need a CPU that supports more PCIe-Lanes. On the AMD side, these are the Threadripper CPUs with 64 PCIe-Lanes (e.g. the AMD Threadripper 2950X or Threadripper 3960X). On the Intel side, these are the i9 10900X Series CPUs that support 48 PCIe-Lanes (e.g. the i9 10980XE).
GPUs, though, can also run in lower bandwidth modes such as 8x PCIe 3.0 Speeds and then also use up fewer PCIe-Lanes (namely 8x). Usually, there is a negligible difference in Rendering Speed when having current gen GPUs run in 8x mode instead of 16x mode.
This would mean you could run two GPUs on an i9 9900K or Ryzen 9 3900X in x8 PCIe mode. (For a total of 16 PCIe Lanes)
You could theoretically also run 4 GPUs in x16 Mode on a Threadripper CPU (= 64 PCIe Lanes). Unfortunately this is not supported though, and the best you can get with Threadripper CPUs is a x16, x8, x16, x8 Configuration.
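The lane arithmetic above can be sketched in a few lines of Python. This is only a rough upper bound under the assumption that every CPU lane is available to GPUs; the function name is made up for illustration, and as noted above, the Motherboard decides which configurations are actually supported.

```python
def max_gpus_at_width(cpu_lanes: int, lanes_per_gpu: int) -> int:
    """Upper bound on how many GPUs can each get `lanes_per_gpu` CPU lanes.

    Assumption: all CPU lanes are free for GPUs. Real boards often
    support fewer cards than this simple division suggests.
    """
    if cpu_lanes < 0 or lanes_per_gpu <= 0:
        raise ValueError("lane counts must be positive")
    return cpu_lanes // lanes_per_gpu


# 16-lane CPUs such as the i9 9900K or Ryzen 9 3900X
print(max_gpus_at_width(16, 16))  # one GPU in x16 mode
print(max_gpus_at_width(16, 8))   # two GPUs in x8 mode

# 64-lane Threadripper: four x16 cards on paper ...
print(max_gpus_at_width(64, 16))
# ... but the x16/x8/x16/x8 layout mentioned above only uses this many lanes:
print(sum([16, 8, 16, 8]))
```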
CPUs that have a high number of PCIe-Lanes usually fall into the HEDT (= High-End-Desk-Top) Platform range and are usually also great for CPU Rendering as they tend to have more cores and therefore higher multi-core performance.
Here’s a quick bandwidth comparison between having two Titan X GPUs run in x8/x8, x16/x8 and x16/x16 mode. The differences are within the margin of error.
Beware though, that the Titan X’s in this benchmark certainly don’t saturate an x8 PCIe 3.0 bus. With upcoming GPU generations this might change. A current gen 2080Ti, for example, already saturates an x8 PCIe 3.0 bus in terms of bandwidth.
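To put numbers on what an x8 versus an x16 link can carry, here is a small sketch of the theoretical PCIe 3.0 bandwidth. It uses the published PCIe 3.0 figures of 8 GT/s per lane and 128b/130b encoding; real-world throughput is lower because of protocol overhead, which this sketch ignores.

```python
PCIE3_GT_PER_S = 8.0        # giga-transfers per second, per lane
PCIE3_ENCODING = 128 / 130  # 128b/130b line encoding efficiency


def pcie3_bandwidth_gb_s(lanes: int) -> float:
    """Theoretical one-direction PCIe 3.0 bandwidth in GB/s."""
    gbit_per_s = PCIE3_GT_PER_S * PCIE3_ENCODING * lanes
    return gbit_per_s / 8  # bits to bytes


for lanes in (8, 16):
    print(f"x{lanes}: {pcie3_bandwidth_gb_s(lanes):.2f} GB/s")
```

This gives roughly 7.9 GB/s for x8 and 15.8 GB/s for x16 per direction.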
While actively rendering, and as long as your scene fits nicely into the GPU’s VRAM, the speed of GPU Render Engines is of course mainly dependent on GPU performance.
Some processes that happen before and during rendering, though, rely heavily on the performance of the CPU, Storage, and (possibly) network.
Examples are extracting and preparing Mesh Data to be used by the GPU, loading textures from your Storage, and preparing the scene data.
In very complex scenes, these processing stages take a lot of time and can bottleneck the overall rendering speed if a low-end CPU, Disk, and RAM are employed.
If your scene is too large to fit into your GPU’s memory, the GPU Render Engine will need to access your System’s RAM or even swap to disk, which will considerably slow down the rendering.
Best Memory (RAM) for GPU Rendering
Different kinds of RAM won’t speed up your GPU Rendering all that much. You do have to make sure that you have enough RAM though, or else your System will slow to a crawl.
I recommend keeping the following rules in mind to optimize performance as much as possible:
- To be safe, your RAM size should be at least 1.5 – 2x your combined VRAM size
- Your CPU can benefit from higher Memory Clocks which can in turn slightly speed up the GPU rendering
- Your CPU can benefit from more Memory Channels on certain Systems which in turn can slightly speed up your GPU rendering
- Look for lower Latency RAM (e.g. CL14 is better than CL16) which can benefit your CPU’s performance and can therefore also speed up your GPU rendering slightly
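The first rule in the list above is easy to turn into a quick calculation. This is just a sketch of that rule of thumb; the function name is made up, and the result is a range you would round up to the next sensible RAM kit size.

```python
def recommended_ram_gb(vram_per_gpu_gb: float, gpu_count: int) -> tuple[float, float]:
    """Return the (low, high) RAM range for 1.5x to 2x the combined VRAM."""
    if vram_per_gpu_gb <= 0 or gpu_count < 1:
        raise ValueError("need at least one GPU with a positive VRAM size")
    combined_vram = vram_per_gpu_gb * gpu_count
    return 1.5 * combined_vram, 2.0 * combined_vram


# Example: two cards with 11GB of VRAM each (22GB combined)
low, high = recommended_ram_gb(11, 2)
print(f"Aim for {low:.0f}GB to {high:.0f}GB of system RAM")
```

For the two 11GB cards in the example this comes out at 33GB to 44GB, so a 64GB kit would leave some headroom.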
Take a look at our RAM (Memory) Guide here, which should get you up to speed.
If you just need a quick recommendation, look into Corsair Vengeance Memory, as we have tested these Modules in a lot of GPU Rendering systems and can recommend them without hesitation.
Best Graphics Card for Rendering
To use Octane and Redshift you will need a GPU that has CUDA-Cores, meaning you will need an NVIDIA GPU. VRAY-RT additionally supports OpenCL meaning you could use an AMD card here. If you are using other Render Engines, be sure to check compatibility here.
The best bang-for-the-buck NVIDIA cards are:
- RTX 2060 Super (2176 CUDA Cores, 8GB VRAM)
- RTX 2070 Super (2560 CUDA Cores, 8GB VRAM)
- RTX 2080 (2944 CUDA Cores, 8GB VRAM)
- RTX 2080 Ti (4352 CUDA Cores, 11GB VRAM)
On the high-end, the currently highest possible performance is offered by the NVIDIA Titan V and the Titan RTX, the latter of which comes with 24GB of Video RAM.
These Cards have worse Performance per Dollar though, as they are targeted at a different audience, and VRAM is very expensive but not necessarily needed in such high capacities for GPU Rendering.
In my experience, 8GB – 11GB of VRAM is usually plenty for most scenes, unless you know you will be working on extremely complex projects.
Blower Style Cooler (Recommended for Multi-GPU setups)
- PRO: Better Cooling when closely stacking more than one card (heat is blown out of the case)
- CON: Louder than Open-Air Cooling
Open-Air Cooling (Recommended for single GPU Setups)
- PRO: Quieter than Blower Style, Cheaper, more models available
- CON: Bad Cooling when stacking cards (heat stays in the case)
Hybrid AiO Cooling (All-in-One Watercooling Loop with Fans)
- PRO: Best All-In-One Cooling for stacking cards
- CON: More Expensive, needs room for radiators in Case
Full Custom Watercooling
- PRO: Best temps when stacking cards, Quiet, some cards only use single slot height
- CON: Needs lots of extra room in the case for tank and radiators, Much more expensive
NVIDIA GPUs have a Boosting Technology that automatically overclocks your GPU to a certain degree, as long as it stays within predefined temperature and power limits. So making sure a GPU stays as cool as possible will allow it to boost longer and therefore improve performance.
You can see this effect especially in Laptops, where there is usually not much room for cooling, and the GPUs tend to get very hot and loud and throttle very early. So if you are thinking of Rendering on a Laptop, keep this in mind.
A quick note on Riser Cables. With PCIe- or Riser-Cables you can place your GPUs further away from the PCIe-Slot of your Motherboard, either to show off your GPU vertically in front of the Case’s tempered glass side panel, or because you have some space-constraints that you are trying to solve (e.g. the GPUs don’t fit).
If this is you, take a look at our Guide on finding the right Riser-Cables for your needs.
Be sure to get a strong enough Power Supply (PSU) for your system. Most GPUs have a Power Draw of around 180-250W.
I recommend a 550W PSU for a Single-GPU-Build. Add 250W for every additional GPU that you have in your System. Good PSU manufacturers to look out for are Corsair, beQuiet, Seasonic, and Coolermaster, but you might prefer others.
There is a Wattage-Calculator here that lets you Calculate how strong your PSU will have to be by inputting your planned components.
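If you just want the rule of thumb from above as a quick calculation (550W for one GPU, plus 250W per additional GPU), it could look like this. It is only a sketch of that rule, not a replacement for a proper Wattage-Calculator that takes your CPU and other components into account.

```python
def recommended_psu_watts(gpu_count: int,
                          base_watts: int = 550,
                          watts_per_extra_gpu: int = 250) -> int:
    """Rule-of-thumb PSU size: base for one GPU, plus a fixed amount per extra GPU."""
    if gpu_count < 1:
        raise ValueError("this rule assumes at least one GPU")
    return base_watts + watts_per_extra_gpu * (gpu_count - 1)


for gpus in (1, 2, 4):
    print(f"{gpus} GPU(s): {recommended_psu_watts(gpus)}W")
```

That works out to 550W, 800W and 1300W for one, two and four GPUs.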
Mainboard & PCIe-Lanes
Make sure the Mainboard has the desired amount of PCIe-Lanes and does not share Lanes with SATA or M.2 slots. Also, be careful what PCI-E Configurations the Motherboard supports. Some have 3 or 4 physical PCI-E Slots but only support one x16 PCI-E Card (electrical speed).
This can get quite confusing. Check the Motherboard manufacturer’s Website to be sure the Multi-GPU configuration you are aiming for is supported. Here is what you should be looking for in the Motherboard specifications:
In the above example, you would be able to use (with a 40 PCIe Lane CPU) 1 GPU in x16 mode, OR 2 GPUs both in x16 mode, OR 3 GPUs with one in x16 mode and two in x8 mode, and so on. Beware that 28-PCIe-Lane CPUs in this example would support different GPU configurations than the 40-Lane CPU.
Currently, the AMD Threadripper CPUs will give you 64 PCIe Lanes to hook your GPUs up to. If you want more, you will have to go the multi-CPU route with Intel Xeons.
To confuse things a bit more, some Mainboards do offer four x16 GPUs (needs 64 PCIe-Lanes) on CPUs with only 44 PCIe Lanes. How is this even possible?
Enter PLX Chips.
On some motherboards, these chips serve as a type of switch that manages your PCIe-Lanes and leads the CPU to believe fewer Lanes are being used. This way, you can use e.g. 32 PCIe-Lanes with a 16 PCIe-Lane CPU or 64 PCIe-Lanes on a 44-Lane CPU.
Beware though, only a few Motherboards have these PLX Chips. The Asus WS X299 Sage is one of them, allowing up to 7 GPUs to be used at 8x speed with a 44-Lane CPU, or even 4 x16 GPUs on a 44 Lanes CPU.
This screenshot of the Asus WS X299 Sage Manual clearly states what type of GPU-Configurations are supported (Always check the manual before buying expensive stuff):
For Multi-GPU Setups, having a CPU with lots of PCIe-Lanes is important, unless you have a Mainboard that comes with PLX chips. Having GPUs run in x8 Mode instead of x16, will only marginally slow down the performance on most GPUs. (Note though, the PLX Chips won’t increase your GPU bandwidth to the CPU, just make it possible to have more cards run in higher modes)
Best GPU Performance / Dollar
Ok so here it is. The Lists everyone should be looking at when choosing the right GPU to buy. The best performing GPU per Dollar!
GPU Benchmark Comparison: Octane
This List is based on OctaneBench 4.00.
|GPU Name||VRAM (GB)||OctaneBench||Price $ MSRP||Performance/Dollar|
|RTX 2060 Super||8||203||420||0.483|
|RTX 2070 Super||8||220||550||0.400|
|GTX 1070 Ti||8||153||450||0.340|
|RTX 2080 Super||8||233||720||0.323|
|GTX 1080 Ti||11||222||700||0.317|
|RTX 2080 Ti||11||304||1199||0.253|
|GTX TITAN Z||12||189||2999||0.063|
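The Performance/Dollar column in the Octane table is simply the OctaneBench score divided by the MSRP. Here is a small sketch that recomputes and ranks it from the numbers in the table, so you can plug in current street prices instead. The last digit can differ slightly from the table, which seems to cut off rather than round the values.

```python
# (name, OctaneBench score, price in USD) taken from the table above
OCTANE_RESULTS = [
    ("RTX 2060 Super", 203, 420),
    ("RTX 2070 Super", 220, 550),
    ("GTX 1070 Ti", 153, 450),
    ("RTX 2080 Super", 233, 720),
    ("GTX 1080 Ti", 222, 700),
    ("RTX 2080 Ti", 304, 1199),
    ("GTX TITAN Z", 189, 2999),
]


def perf_per_dollar(score: float, price_usd: float) -> float:
    """Benchmark points per dollar; higher is better."""
    return score / price_usd


ranked = sorted(OCTANE_RESULTS,
                key=lambda row: perf_per_dollar(row[1], row[2]),
                reverse=True)

for name, score, price in ranked:
    print(f"{name:15s} {perf_per_dollar(score, price):.3f}")
```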
GPU Benchmark Comparison: Redshift
The Redshift Render Engine has its own Benchmark, and here is a List based on RedshiftBench. Note how the cards scale (see the GTX 1080 Ti rows). RedshiftBench results are render times in minutes, so shorter is better:
|GPU Name||VRAM (GB)||RedshiftBench (min)||Price $ MSRP||Performance/Dollar|
|GTX 1080 Ti||11||11.44||700||1.248|
|4x GTX 1080 Ti||11||3.07||2800||1.163|
|2x GTX 1080 Ti||11||6.15||1400||1.161|
|8x GTX 1080 Ti||11||1.57||5600||1.137|
|RTX 2080 Ti||11||8.38||1199||0.995|
|4x RTX 2080 Ti||11||2.28||4796||0.914|
|RTX 2080 Super||8||10.15||720||1.368|
|RTX 2070 Super||8||11.17||550||1.627|
|RTX 2060 Super||8||12.17||420||1.956|
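To see how well the GTX 1080 Ti scales in the Redshift table, you can compare each multi-GPU time against the single-card time. The sketch below does that with the times from the table. It treats them as decimal minutes, which is an assumption on my part (the Performance/Dollar column appears to do the same).

```python
# Number of GTX 1080 Ti cards -> RedshiftBench time in minutes (from the table above)
GTX_1080_TI_TIMES = {1: 11.44, 2: 6.15, 4: 3.07, 8: 1.57}


def scaling(times_by_gpu_count: dict[int, float]) -> dict[int, tuple[float, float]]:
    """Return {gpu_count: (speedup, efficiency)} relative to the single-GPU time."""
    single_gpu_time = times_by_gpu_count[1]
    result = {}
    for gpu_count, minutes in sorted(times_by_gpu_count.items()):
        speedup = single_gpu_time / minutes
        result[gpu_count] = (speedup, speedup / gpu_count)
    return result


for gpu_count, (speedup, efficiency) in scaling(GTX_1080_TI_TIMES).items():
    print(f"{gpu_count}x: {speedup:.2f}x faster, {efficiency:.0%} efficiency")
```

With these numbers, two cards are about 1.86x faster, four cards about 3.73x and eight cards about 7.29x faster than a single card.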
GPU Benchmark Comparison: VRAY-RT
And here is a List based off of VRAY-RT Bench. Note how the GTX 1080 interestingly seems to perform worse than the GTX 1070 in this benchmark:
|GPU Name||VRAM (GB)||VRAY-Bench||Price $ MSRP||Performance/Dollar|
|GTX 1070||8||1:25 min||400||2.941|
|RTX 2070||8||1:05 min||550||2.797|
|GTX 1080 TI||11||1:00 min||700||2.380|
|2x GTX 1080 TI||11||0:32 min||1400||2.232|
|GTX 1080||8||1:27 min||550||2.089|
|4x GTX 1080 TI||11||0:19 min||2800||1.879|
|TITAN XP||12||0:53 min||1300||1.451|
|8x GTX 1080 TI||11||0:16 min||5600||1.116|
|TITAN V||12||0:41 min||3000||0.813|
|Quadro P6000||24||1:04 min||3849||0.405|
Source: VRAY Benchmark List
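Since the VRAY benchmark reports a render time instead of a score, Performance/Dollar has to be derived from the inverse of time multiplied by price. The values in the table are consistent with 100,000 / (seconds x price). That scale factor is inferred from the table and only there to make the numbers readable. A sketch, with made-up function names:

```python
def parse_min_sec(text: str) -> int:
    """Convert a cell such as '1:25 min' into seconds."""
    minutes, seconds = text.replace("min", "").strip().split(":")
    return int(minutes) * 60 + int(seconds)


def time_based_perf_per_dollar(time_text: str, price_usd: float,
                               scale: float = 100_000) -> float:
    """Higher is better: a shorter render time and a lower price both raise the value."""
    return scale / (parse_min_sec(time_text) * price_usd)


for name, time_text, price in [("GTX 1070", "1:25 min", 400),
                               ("GTX 1080", "1:27 min", 550),
                               ("TITAN V", "0:41 min", 3000)]:
    print(f"{name:10s} {time_based_perf_per_dollar(time_text, price):.3f}")
```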
Speed up your Multi-GPU Rendertimes
Note – This section is quite advanced. Feel free to skip it.
So, unfortunately, GPUs don’t always scale perfectly. 2 GPUs render an Image about 1.9 times faster. Having 4 GPUs will only render about 3.6x faster. This is quite a bummer, isn’t it?
Having multiple GPUs communicate with each other to render the same task costs so much performance that a large part of one GPU in a 4-GPU rig is mainly just there for managing decisions.
One solution could be the following: When final rendering image sequences, use as few GPUs as possible per task.
Let’s make an example:
What we usually do in a multi-GPU rig is have all GPUs work on the same task. A single task, in this case, would be an image in our image sequence.
4 GPUs together render one Image and then move on to the next Image in the Image sequence until the entire sequence has been rendered.
We can cut down the preparation time per GPU (when the GPUs sit idle, waiting for the CPU to finish preparing the scene) and bypass some of the multi-GPU slow-downs by having each GPU render its own task. We can do this by rendering one task per GPU.
So a machine with 4 GPUs would now render 4 tasks (4 images) at once, each on one GPU, instead of 4 GPUs working on the same image, as before.
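Here is a rough back-of-the-envelope comparison of the two approaches for an image sequence. The frame count and the 10 minutes per frame are made-up example numbers, the 3.6x factor is the 4-GPU scaling mentioned above, and the sketch ignores scene preparation time entirely.

```python
import math


def minutes_all_gpus_per_frame(frames: int, minutes_per_frame_single_gpu: float,
                               speedup: float) -> float:
    """All GPUs work together on one frame at a time, with imperfect scaling."""
    return frames * minutes_per_frame_single_gpu / speedup


def minutes_one_gpu_per_frame(frames: int, minutes_per_frame_single_gpu: float,
                              gpu_count: int) -> float:
    """Each GPU renders its own frame, so frames finish in batches of `gpu_count`."""
    batches = math.ceil(frames / gpu_count)
    return batches * minutes_per_frame_single_gpu


frames = 100              # hypothetical sequence length
minutes_per_frame = 10.0  # hypothetical single-GPU render time

shared = minutes_all_gpus_per_frame(frames, minutes_per_frame, speedup=3.6)
split = minutes_one_gpu_per_frame(frames, minutes_per_frame, gpu_count=4)
print(f"All 4 GPUs on each frame: {shared:.0f} min")
print(f"One frame per GPU:        {split:.0f} min")
```

With these example numbers, the shared approach takes roughly 278 minutes and the one-frame-per-GPU approach 250 minutes.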
Some 3D-Software might have this feature built-in, if not, it is best to use some kind of Render Manager, such as Thinkbox Deadline (Free for up to 2 Nodes/Computers).
Beware though, that you might have to increase your System RAM a bit and have a strong CPU, since every GPU-Task needs its own share of RAM and CPU performance.
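If you want to script the one-task-per-GPU idea yourself instead of using a Render Manager, the core of it is just dealing frames out to GPUs. The sketch below shows that part only. It uses the standard CUDA environment variable CUDA_VISIBLE_DEVICES to restrict a process to one card. Whether your render engine respects it or has its own device selection option is something you would have to check, and the actual render command is left out because it differs per software.

```python
import os


def assign_frames(frames, gpu_ids):
    """Deal frames out round-robin so that each frame is rendered by exactly one GPU."""
    gpu_ids = list(gpu_ids)
    if not gpu_ids:
        raise ValueError("need at least one GPU id")
    plan = {gpu_id: [] for gpu_id in gpu_ids}
    for index, frame in enumerate(frames):
        plan[gpu_ids[index % len(gpu_ids)]].append(frame)
    return plan


def env_for_gpu(gpu_id: int) -> dict:
    """Copy of the current environment that exposes only one GPU to CUDA applications."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


plan = assign_frames(range(1, 9), gpu_ids=[0, 1, 2, 3])
for gpu_id, gpu_frames in plan.items():
    # A real script would start one render process per GPU here,
    # passing env_for_gpu(gpu_id) as its environment.
    print(f"GPU {gpu_id} renders frames {gpu_frames}")
```

A Render Manager does essentially the same thing, but also takes care of failed tasks, priorities and multiple machines.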
Redshift vs. Octane
Another thing I am often asked is whether one should go with Redshift or Octane.
I have used both extensively, and in my experience, thanks to the Shader Graph Editor and the vast Multi-Pass Manager of Redshift, I like to use the Redshift Render Engine more for work that needs complex Material Setups and heavy Compositing.
Octane is great if you want results fast, as its learning curve is shallower. But this, of course, is a personal opinion and I would love to hear yours!
If you want to get the best parts within your budget you should have a look at the Web-Based PC-Builder Tool that we’ve created.
Select the main purpose that you’ll use the computer for and adjust your budget to create the perfect PC with part recommendations that will fit within your budget.
What Hardware do you want to buy? Let me know in the comments!