Well done to Nvidia for giving us the next generational leap in graphics cards. I am referring in particular to Pascal’s lower power consumption and GDDR5X RAM. But hold on, what is Pascal?
If Pascal and GDDR5X had a baby
At the heart of that power efficiency is the GP104 GPU, which is built on a small 314mm², 16nm die and an all-new architecture called Pascal. Pascal is essentially a better version of Maxwell, the previous generation that the 980 Ti and Titan X are based on. There are the same four Graphics Processing Clusters (GPCs), each of which contains a collection of Streaming Multiprocessors (SMs), bound to a total of 64 Raster Operators (ROPs) and 2MB of L2 cache. With more SMs crammed into each GPC, the 1080 has a grand total of 2,560 CUDA cores and 160 texture units, a 25% increase over the 2,048 CUDA cores and 128 texture units of the 980.
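The uplift over the 980 can be checked directly from the core and texture-unit counts quoted above. The short Python sketch below does that arithmetic. The figure of 128 CUDA cores per SM is an assumption that is not stated in this article, and the helper names are made up for illustration.

```python
# Counts quoted in the article.
GTX_1080 = {"cuda_cores": 2560, "texture_units": 160}
GTX_980 = {"cuda_cores": 2048, "texture_units": 128}

GPC_COUNT = 4        # "the same four Graphics Processing Clusters"
CORES_PER_SM = 128   # ASSUMPTION: not stated in the article


def percent_increase(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) / old * 100


def sms_per_gpc(cuda_cores):
    """SMs in each GPC, given the assumed cores-per-SM figure."""
    return cuda_cores // CORES_PER_SM // GPC_COUNT


if __name__ == "__main__":
    for key in ("cuda_cores", "texture_units"):
        print(key, percent_increase(GTX_1080[key], GTX_980[key]), "% more")
    print("SMs per GPC:", sms_per_gpc(GTX_980["cuda_cores"]),
          "->", sms_per_gpc(GTX_1080["cuda_cores"]))
```

Under that assumption, the extra cores come from one additional SM in each of the four GPCs.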
GP104 isn’t the first Pascal part. That honour goes to the GP100 chip of the Tesla P100, a professional GPU bound for servers and high-performance computing. The P100 is a far bigger chip at 610mm², and as such crams in an extra two GPCs to make up its array of 3,584 CUDA cores. A key difference is that the P100 spends a large share of its die on science-focused FP64 units, one for every two of the FP32 cores that are more useful for gaming.
Thanks to a dramatic boost in clock speed (1,607MHz base and 1,733MHz boost), the 1080 pushes almost the same level of FP32 performance as the P100 (roughly nine teraflops versus 10.6 teraflops) in a smaller, more efficient, and much cheaper chip. Maxwell topped out at just 1,000MHz at stock, and even the most talented of overclockers would struggle to get past 1,300MHz without the aid of some seriously exotic cooling.
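The teraflops figure for the 1080 follows from the core count and the boost clock. Here is a rough Python sketch of the usual back-of-envelope estimate. It assumes each CUDA core retires one fused multiply-add (counted as two floating-point operations) per clock, and the function name is purely illustrative.

```python
def peak_fp32_tflops(cuda_cores, clock_mhz):
    """Theoretical peak FP32 throughput in teraflops.

    ASSUMPTION: one fused multiply-add (2 FLOPs) per core per clock.
    Real workloads rarely sustain this peak.
    """
    flops_per_second = cuda_cores * 2 * clock_mhz * 1e6
    return flops_per_second / 1e12


if __name__ == "__main__":
    # GTX 1080: 2,560 cores at the 1,733MHz boost clock.
    print(round(peak_fp32_tflops(2560, 1733), 2), "TFLOPS")
```

At the boost clock this lands just under nine teraflops, which is where the rounded figure comes from.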
In the early days of Pascal, most rumours pointed to Nvidia using second-generation High Bandwidth Memory (HBM2), particularly as AMD used the original version of HBM to great effect in its Fury range. While the P100 does use HBM2 for its incredible 720GB/s of bandwidth, pricing and availability concerns surrounding the technology have pushed Nvidia towards using Micron’s GDDR5X memory for the 1080. While not quite as impressive as HBM or HBM2, the 8GB of GDDR5X in the 1080 is essentially a drop-in replacement for GDDR5, boasting a 10,000MHz effective memory clock (versus 7,000MHz on the 980) attached to a 256-bit bus for 320GB/s of bandwidth.
That’s another big leap over the 980 with its 224GB/s of bandwidth, but does come in slightly under the 336GB/s of the Titan X and GTX 980 Ti with their wider 384-bit buses, or the Fury X’s 4,096-bit bus. Still, GDDR5X operates at the same 1.35V as GDDR5, giving Nvidia the option to slot in standard GDDR5 across the rest of its range (as it plans to do with the GTX 1070). Plus there’s room to grow if Nvidia does implement GDDR5X on a wider 384-bit bus on future cards, which would result in an impressive 480GB/s of bandwidth.
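The bandwidth figures quoted for these cards all come from the same formula: effective memory clock multiplied by bus width, divided by eight to convert bits to bytes. A minimal Python sketch follows. The function name is illustrative, and the 7,000MHz clock used for the 384-bit Maxwell cards is an assumption inferred from their 336GB/s figure rather than something stated above.

```python
def memory_bandwidth_gbs(effective_clock_mhz, bus_width_bits):
    """Peak memory bandwidth in GB/s.

    effective_clock_mhz: effective data rate per pin, in MHz.
    bus_width_bits: width of the memory bus in bits.
    """
    bits_per_second = effective_clock_mhz * 1e6 * bus_width_bits
    return bits_per_second / 8 / 1e9


if __name__ == "__main__":
    configs = [
        ("GTX 980 (GDDR5, 256-bit)", 7000, 256),
        ("GTX 1080 (GDDR5X, 256-bit)", 10000, 256),
        # ASSUMPTION: 7,000MHz effective clock for the 384-bit Maxwell cards.
        ("Titan X / 980 Ti (GDDR5, 384-bit)", 7000, 384),
        ("Hypothetical GDDR5X on 384-bit", 10000, 384),
    ]
    for name, clock, width in configs:
        print(f"{name}: {memory_bandwidth_gbs(clock, width):.0f}GB/s")
```

The last entry shows why a wider bus on a future card would reach 480GB/s without any change to the memory itself.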
Nvidia has improved its delta colour compression technology in Pascal too. Where Maxwell featured 2:1 compression—that is, where the GPU calculates the colour difference between a range of pixels and halves the data required if the delta between them is small enough—Pascal can do 4:1 or even 8:1 colour compression with small enough deltas. The result is that Pascal significantly reduces the number of bytes that have to be fetched from memory each frame for roughly 20 percent additional effective bandwidth. Clever stuff.
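To make the idea concrete, here is a toy Python sketch of delta compression on a row of 8-bit pixel values. It is not Nvidia's actual algorithm, which is not described here. The bit widths, the function names, and the choice to ignore the cost of storing the first (anchor) pixel are all simplifying assumptions made for illustration.

```python
def delta_bits_needed(block):
    """Smallest bit width (1, 2, or 4) that can hold every neighbour delta.

    Deltas are stored as signed values, so `bits` covers the range
    -(2**(bits-1)) .. 2**(bits-1) - 1. Returns 8 if the deltas are too
    large, meaning the block is stored uncompressed.
    """
    deltas = [b - a for a, b in zip(block, block[1:])]
    for bits in (1, 2, 4):
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
        if all(lo <= d <= hi for d in deltas):
            return bits
    return 8


def compression_ratio(block):
    """Approximate ratio for 8-bit pixels, ignoring the anchor pixel."""
    return 8 // delta_bits_needed(block)


if __name__ == "__main__":
    print(compression_ratio([100] * 8))            # flat colour
    print(compression_ratio([100, 101, 102, 103])) # smooth gradient
    print(compression_ratio([100, 105, 99, 104]))  # mild variation
    print(compression_ratio([0, 200, 13, 90]))     # noisy, stored raw
```

The smoother the block, the fewer bits each delta needs, which is the same principle behind the 2:1, 4:1, and 8:1 modes mentioned above.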
What is Simultaneous Multi-Projection? Put simply, it is a brand new hardware rendering pipeline, exclusive to Pascal cards, that allows them to render 16 independent “viewpoints” in a single rendering pass. On a regular graphics card, a single viewpoint (i.e. what a user sees on a monitor) is rendered in one pass. That’s fine for most applications, but problems occur with multi-monitor setups and VR. In a triple-monitor setup where a user curves the monitors, the graphics card can only render a single viewpoint. It assumes all the monitors are arranged in a straight line, resulting in the images on the left and right monitors looking warped.
Traditionally, this problem is solved by using three separate graphics cards in supported games, but with multi-projection a single GPU can render three different viewpoints, with the two outer ones angled to correct the distortion. Nvidia uses a similar technique to speed up VR rendering, allowing a stereo image to be rendered in a single pass. This dramatically improves the frame rate, which is a particularly big problem to solve when VR needs to run at a hefty 90 FPS. Without a VR headset or multiple monitors to trial, I can’t say for sure how well this works just yet. In live demos it was impressive, and developers apparently don’t need to do a thing to see the performance benefits.
And what about the Founders Edition?
Nvidia confused many with the introduction of the Founders Edition, but in reality, Founders Edition is simply a new name for reference cards that have been graced with a higher price tag. Nvidia says the higher price goes towards an upgraded five-phase power supply and the angular vapour chamber shroud that looks like it’s been crafted out of triangular pieces of aluminium and glass.
Buyer beware, though: the Founders Edition carries a hefty price increase, which has been amplified even more for the European market. Why pay it? There are, of course, cheaper alternatives made by the usual hardware partners.
The GTX 1080 is a generational leap. Like the Titan before it, this release should redefine consumer graphics card performance, and in many ways that seems to be the case. Yes, the GTX 1080 is the new world’s fastest graphics card, and yes, it’s faster than the likes of the now-redundant GTX 980 Ti and Titan X by as much as 35 percent in real-world use, although this does not apply to professional use. Compared to the 980, in fact, it’s faster by as much as 62 percent. For those that want the very best graphics card right now, the 1080 is it.
Still, it is only perfect if you are a big gamer and a light pro user. Reverse that equation and the Titan X, with its roughly 500 extra CUDA cores and 4GB of extra VRAM, makes a lot more sense. This is, of course, disregarding the fact that the 1080 has not reached the marketplace yet and could be hiding any number of manufacturing glitches.
There are plenty of innovations here. The 1080 is the first consumer graphics card based on Nvidia’s new Pascal architecture, the first with GDDR5X memory, and the first to be manufactured on a smaller, more efficient TSMC 16nm FinFET process.
As such, the 1080 is the latest in a long line of impressive updates from Nvidia. For those still using a 680 or a 780, the performance improvements in the 1080 will be more than enough to justify a purchase. For professionals, though, it’s great, but not great enough. All those improvements have gone towards the gaming arena, and with VR looking to take the crown of large-scale media consumption, Nvidia had to bring something out to get an early stranglehold on the market before AMD. Our advice for professional use: stick with the 980 Ti and Titan X. GPU core clock doesn’t matter in terms of CUDA core speed; shader clock does. Nvidia GTX 1080: great for gaming.