A closer look at Nvidia GTC: Big data taps into GPU horsepower, and the future of the GPGPU

Traditionally, powerful graphics processors have been useful mostly to gamers looking for realistic experiences, and to engineers and creatives who need 3D modelling. After spending a few days at Nvidia's GPU Technology Conference (GTC) this year, I'm convinced that the uses for GPUs have exploded – they have become an essential element in dozens of computing domains. As one attendee suggested to me, GPUs could now be better described as application co-processors.

GPUs are a natural fit for compute-intensive applications because they can process hundreds or even thousands of pieces of data at the same time. Modern GPUs can have several thousand reduced-instruction-set cores that can operate in groups across large amounts of data in parallel. Nvidia's release of its CUDA (Compute Unified Device Architecture) SDK in 2007 helped usher in an era of explosive growth for general-purpose programming on GPUs (often referred to as GPGPU).
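
To give a flavour of what that looks like in code, here is a minimal CUDA sketch – my own illustration, not anything Nvidia showed at GTC – that launches roughly a million threads, each scaling and adding a single array element:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each GPU thread handles one element: y[i] = a * x[i] + y[i].
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;                      // one million elements
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));   // unified memory, visible to CPU and GPU
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    // Launch enough 256-thread blocks to cover all n elements in one go.
    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();

    printf("y[0] = %f\n", y[0]);                // expect 4.0
    cudaFree(x); cudaFree(y);
    return 0;
}
```

Compiled with nvcc, that single kernel launch replaces the loop a CPU would have to run a million times, one element after another.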

Imaging and vision have a lot in common with graphics

Two of the most active markets for GPGPU computing are image processing and computer vision. Much like computer graphics, they both require running algorithms over potentially millions of elements in real time – exactly what a GPU is designed to do well.
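
To make that concrete, here is a sketch of the sort of per-pixel kernel image-processing code tends to use – one GPU thread per pixel, so even a full-HD frame is handled in a single parallel pass. The kernel and dimensions are my own illustration, not code from any product mentioned here:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Convert an interleaved RGB image to greyscale, one thread per pixel.
__global__ void rgbToGrey(const unsigned char *rgb, unsigned char *grey,
                          int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    int i = y * width + x;
    float r = rgb[3 * i], g = rgb[3 * i + 1], b = rgb[3 * i + 2];
    grey[i] = static_cast<unsigned char>(0.299f * r + 0.587f * g + 0.114f * b);
}

int main() {
    const int width = 1920, height = 1080;       // a full-HD test frame
    unsigned char *rgb, *grey;
    cudaMallocManaged(&rgb, 3 * width * height);
    cudaMallocManaged(&grey, width * height);
    for (int i = 0; i < 3 * width * height; ++i) rgb[i] = 128;  // flat grey test data

    // A 2D grid of 16x16-thread blocks covers every pixel at once.
    dim3 block(16, 16);
    dim3 grid((width + 15) / 16, (height + 15) / 16);
    rgbToGrey<<<grid, block>>>(rgb, grey, width, height);
    cudaDeviceSynchronize();

    printf("first pixel: %d\n", grey[0]);        // expect 127 or 128
    cudaFree(rgb); cudaFree(grey);
    return 0;
}
```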

One of the most amazing demonstrations of the power even a mobile GPU can bring to bear on computer vision is Google's Project Tango. With only a tricked-out smartphone, Tango records over 250,000 3D measurements each second, using them to create a highly accurate map of the surrounding building – including rooms, furniture and stairwells – as the user walks around. To do that, it uses not just a state-of-the-art Nvidia mobile GPU – which project lead Johnny Lee points out has more computing horsepower than the DARPA-challenge-winning autonomous vehicle from 2005 – but two custom vision processors.

To give you a sense of how fast a GPU can accomplish tasks using parallel processing, the Mythbusters team did an amusing and instructive demonstration for Nvidia.

Big data is a lot like big graphics

It didn't take long for the big data craze to tap into GPU horsepower either. For example, startup Map-D has found a way to use GPUs and their memory to implement an ultra-high-speed, SQL-compatible database, allowing it to analyse enormous amounts of data in near real time. One of its eye-opening demos is an interactive data browser of tweets worldwide: the system supports real-time analysis of one billion tweets using eight Nvidia Tesla boards on a server. Map-D won the Emerging Company Showcase at GTC with its cool demos, but it wasn't the only startup showcasing the use of GPUs for big data hacking. Brytlyt is using Nvidia GPU cards to run, in just six minutes, queries that it says would take Google's BigQuery 30 years. Brytlyt's software aims to let large retailers run better interactive promotions and targeted marketing by reacting quickly to customer location and actual purchases.
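
The details of Map-D's and Brytlyt's engines weren't on show, but the underlying trick is easy to sketch: push the filtering down to the card and let one thread evaluate each row. The column, time window and counting scheme below are my own illustration, not either company's code:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Count tweets in a time window: one thread evaluates one row's predicate.
// This stands in for the kind of WHERE clause a GPU database pushes to the card.
__global__ void countInWindow(const long long *timestamps, int n,
                              long long start, long long end,
                              unsigned long long *matches) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && timestamps[i] >= start && timestamps[i] < end) {
        atomicAdd(matches, 1ULL);  // simple; real engines use block-level reductions
    }
}

int main() {
    const int n = 1 << 22;                       // ~4 million rows for the sketch
    long long *ts;
    unsigned long long *matches;
    cudaMallocManaged(&ts, n * sizeof(long long));
    cudaMallocManaged(&matches, sizeof(unsigned long long));
    for (int i = 0; i < n; ++i) ts[i] = i;       // fake, monotonically increasing timestamps
    *matches = 0;

    countInWindow<<<(n + 255) / 256, 256>>>(ts, n, 1000, 2000, matches);
    cudaDeviceSynchronize();

    printf("rows in window: %llu\n", *matches);  // expect 1000
    cudaFree(ts); cudaFree(matches);
    return 0;
}
```

The point the article makes about using the GPUs' own memory is the key: keeping the columns resident on the cards between queries avoids re-copying the data every time, which is where much of the speed comes from.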

GPUs are the new supercomputers

Strolling through the dozens of poster presentations at GTC made it clear that researchers in just about every branch of science have harnessed the GPU to help them. Research in areas including computational fluid dynamics and machine learning would have required a supercomputer – or a high-performance compute cluster – until the advent of the modern GPU. In some cases, like neural networks, GPUs have helped bring what was a dormant branch of computer science back to life.

Clarifai, for example, has built on neural network technology that originated in the eighties to implement a machine-learning-based image recognition system that it claims can tackle the job of automatically categorising not just all of your images, but also those on the web. Clarifai has been setting some image recognition speed records, although a quick hands-on with its demo version shows it has a long way to go before its auto-tagging features are generally useful.

Tools and hardware for GPGPU development

Unless you have been trained as a graphics programmer, developing software for a GPU is a mind-bending task. Because GPUs were originally designed strictly for graphics, you not only need to think in terms of mapping kernels of code onto streams of data, but also have to wade through documentation full of discussions of shader cores, vertices, and fragments.

Fortunately, now that GPGPU is a huge industry, many tools have been built to help out the rest of us. One of the first of these was Nvidia's CUDA programming language and environment, which is now complemented by the industry-standard OpenCL (Open Computing Language); both use C-like interfaces to let you specify your code and data. Nvidia has also extended its Nsight visual development environment to support CUDA programming. The industry hasn't stopped there, though. Companies like ArrayFire have built entire high-level programming environments for the GPU, allowing application developers of all kinds to write code that is independent of the underlying CPU and GPU implementations.
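
ArrayFire's own API isn't covered here, but to show the flavour of these higher-level environments, here is a short sketch using Thrust, the STL-like C++ template library Nvidia bundles with CUDA: no kernels, no thread indices, just familiar algorithms that happen to run on the GPU.

```cuda
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/reduce.h>
#include <thrust/functional.h>

int main() {
    const int n = 1 << 20;

    // Device-resident vectors; allocation and host-to-device transfer are handled for you.
    thrust::device_vector<float> x(n, 1.0f);
    thrust::device_vector<float> y(n, 2.0f);

    // y = x + y, expressed as an STL-style algorithm rather than a hand-written kernel.
    thrust::transform(x.begin(), x.end(), y.begin(), y.begin(), thrust::plus<float>());

    // Parallel reduction on the GPU.
    float total = thrust::reduce(y.begin(), y.end(), 0.0f, thrust::plus<float>());
    printf("sum = %f\n", total);  // expect 3.0 * n
    return 0;
}
```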

Nvidia has even been catering to the GPGPU market with custom hardware. Its Tesla brand of server cards contains massively powerful GPUs – with as many as several thousand cores and several teraflops of throughput – ideal for massive computing tasks. AMD hasn't been idle either, with its HSAIL initiative helping its users take advantage of what it calls Heterogeneous System Architecture (HSA).
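
If you want to see what your own card offers, the CUDA runtime will tell you directly; this minimal sketch uses only the standard cudaGetDeviceProperties call to print each installed GPU's multiprocessor count, clock and memory:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);

    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, d);
        printf("GPU %d: %s\n", d, prop.name);
        printf("  compute capability: %d.%d\n", prop.major, prop.minor);
        printf("  multiprocessors:    %d\n", prop.multiProcessorCount);
        printf("  GPU clock:          %.0f MHz\n", prop.clockRate / 1000.0);
        printf("  global memory:      %.1f GB\n",
               prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
    }
    return 0;
}
```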

The future of GPGPU

The same forces that drove the creation of the GPU itself – as a specialised processor for graphics – are now creating new, special-purpose hardware that can supersede the GPU for particularly popular applications. Examples include the image signal processors now found in most smartphones, the Movidius Myriad 1 vision chips in Tango, and even full-custom chips for Bitcoin mining. So there will continue to be a race between the high-volume, low-cost GPU products and specialised chips for specific applications. The good news is that this means a lot more computing horsepower for a lot less money in the future.