When you've been a technology journalist as long as I have, you're all too familiar with the never-ending game of performance leapfrog played by Nvidia and AMD (it's still ATI to me). However, when Nvidia showed me its new GeForce GTX 680 a couple of weeks ago, I was pleased to find that raw performance wasn't its only party trick.
Much like Intel realised several years ago, Nvidia has come to the conclusion that power efficiency is as important as, if not more important than, all-out performance. That's not to say that Nvidia's new high-end graphics solution doesn't perform, because it absolutely delivers in that department, but it's the way in which it does so that's most interesting.
The GTX 680 is based on Nvidia's new Kepler architecture, which is manufactured using TSMC's new 28nm process. This new fabrication process is key to Kepler's reduced power needs, with the GTX 680's power draw peaking at 195W, making it far less power hungry than its predecessor (the GTX 580 draws 244W), despite providing significantly higher levels of performance.
The GTX 680 is made up of four Graphics Processing Clusters (GPC), each of which houses two Streaming Multiprocessors (SMX) and a dedicated raster engine, while a memory controller is assigned to each GPC. Each SMX is made up of CUDA cores, but whereas the GTX 580 employed 32 CUDA cores per SM (Fermi's equivalent unit), the GTX 680 sports 192 CUDA cores per SMX, for a total of 1,536 CUDA cores.
The GTX 680's core clock speed is 1,006MHz, but that's not a static clock. Kepler is designed to dynamically alter the core clock speed in order to increase performance within the prescribed power envelope. Basically, this means that if the GTX 680 is drawing below 195W of power at 1,006MHz, the card will automatically raise the clock frequency until that 195W limit is reached.
The beauty of this dynamic overclocking is that it just works. There's no mucking about with overclocking utilities, creeping up voltage levels while holding your breath and hoping that your system won't crash in the middle of a game. With the chip constantly monitoring power usage and temperature, GPU Boost will ensure the best possible performance while maintaining reliability. The Boost Clock is typically 1,058MHz, but can rise to speeds of 1,100MHz or higher as long as the power envelope isn't breached.
Of course the GTX 680 can dynamically clock down too, so as long as you're not pushing millions of polygons around a screen, the clock speed will drop accordingly.
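To make the idea concrete, here's a rough sketch of how a GPU Boost-style control loop behaves. This is purely illustrative: the step size and idle clock are invented numbers, and this is not Nvidia's actual algorithm, just the raise-until-the-limit, back-off-at-the-limit, drop-when-idle behaviour described above.

```python
BASE_CLOCK_MHZ = 1006   # the GTX 680's rated core clock
IDLE_CLOCK_MHZ = 324    # assumed idle clock, for illustration only
POWER_LIMIT_W = 195     # the card's peak power draw
STEP_MHZ = 13           # assumed adjustment step, for illustration only

def next_clock(current_mhz, measured_power_w, gpu_busy):
    """One iteration of an illustrative boost control loop."""
    if not gpu_busy:
        # Light load: clock down towards the idle speed to save power.
        return max(IDLE_CLOCK_MHZ, current_mhz - STEP_MHZ)
    if measured_power_w < POWER_LIMIT_W:
        # Headroom left under the 195W envelope: raise the clock.
        return current_mhz + STEP_MHZ
    # At the power limit: back off, but never below the base clock.
    return max(BASE_CLOCK_MHZ, current_mhz - STEP_MHZ)
```

Run repeatedly against live power and load readings, a loop like this climbs from 1,006MHz towards the Boost Clock while there's power headroom, and steps back down as soon as the 195W limit is reached.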
Another interesting feature is Adaptive V-Sync, which addresses one of the most annoying issues of 3D gaming on a PC. As any PC gamer knows, V-Sync will synchronise the frame rate of your graphics card with the refresh rate of your monitor, eliminating the screen tearing that can occur when your graphics card is pumping out more frames than your monitor can display. However, the problem with V-Sync is that when the frame rate drops below the refresh rate of the monitor, say 60Hz, it will essentially reduce the frame rate to 30fps, even though the card itself may only have dropped to 59fps.
Adaptive V-Sync avoids this problem by being flexible in both directions. If the graphics card is pumping out more than 60fps, it will apply V-Sync and cap output at 60fps to keep in tune with the monitor's maximum refresh rate. However, if the frame rate drops below 60fps, V-Sync is disabled, so the card delivers 59fps rather than being forced down to 30fps.
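The difference between the two approaches boils down to a simple decision, sketched below for an assumed 60Hz display. The function names are mine, not anything exposed by Nvidia's driver; the point is just the contrast between classic V-Sync's divisor steps and the adaptive on/off behaviour.

```python
REFRESH_HZ = 60  # assumed monitor refresh rate

def traditional_vsync_fps(rendered_fps):
    """Classic V-Sync: frames wait for a refresh, so anything under
    60fps effectively snaps down to 30fps (then 20, 15, ...)."""
    for divisor in (1, 2, 3, 4):
        if rendered_fps >= REFRESH_HZ // divisor:
            return REFRESH_HZ // divisor
    return rendered_fps

def adaptive_vsync_fps(rendered_fps):
    """Adaptive V-Sync: sync only when the GPU outpaces the monitor."""
    if rendered_fps >= REFRESH_HZ:
        return REFRESH_HZ      # V-Sync on: capped at 60fps, no tearing
    return rendered_fps        # V-Sync off: no forced drop to 30fps
```

At 59fps, the traditional path displays 30fps while the adaptive path displays 59fps, which is exactly the jarring stutter Adaptive V-Sync is designed to remove.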
Nvidia has also done some major work when it comes to anti-aliasing, with the implementation of FXAA and TXAA. FXAA leverages the significant power of those 1,536 CUDA cores and can produce image quality equal to, or better than, 4x MSAA with significantly higher frame rates. And because FXAA is integrated into the driver's control panel, it doesn't require games to be tailored for it.
TXAA is designed to produce the best anti-aliasing results on moving images, which is what you're looking at in games most of the time. Yes, technology journalists might zoom in on diagonal lines in static 3D images to evaluate anti-aliasing, but that's rarely what you're doing in a game!
TXAA comes in two flavours, simply called TXAA 1 and TXAA 2. TXAA 1 provides visual quality comparable with 8x MSAA, but with a performance hit on par with 2x MSAA. TXAA 2 results in a performance hit comparable with 4x MSAA, but produces superior image quality to 8x MSAA.
Performance wise, the GTX 680 is set to raise the bar once again. At GDC 2011, Epic Games showed off its Samaritan demo, built on Unreal Engine 3 and running under DX11. Back then the demo required three GTX 580 cards in SLI, but today it will run on a single GTX 680! And again, it's not just the raw performance that's impressive: those three GTX 580 cards were drawing a total of 732W, compared with the single GTX 680 and its 195W peak.
We'll be writing a full review of the GeForce GTX 680, with full benchmark results and performance conclusions soon. But from what I've seen of the card so far, it's looking like a bit of a game changer, and definitely something that I'd welcome in my own gaming rig.