Why Geospatial Users and Developers Should Know Their GPU from Their CPU

For a few years now I've been saying that we in geospatial technologies need to spend more time with gamers, an idea I believe was first prompted by Gary Smith at Green Mountain Geographics. Smith focuses his work on 3D modeling and visualization and sees a direct connection between the two fields. I got a taste of what's possible at last year's New York State Geospatial Summit when Adam Szofran of Microsoft explained how cleverly his team packages the geodata for Flight Simulator (Directions Magazine coverage). That discussion led me to believe, naively, that software was where game developers could bring the most to the geospatial table. After speaking to Sumit Gupta, senior marketing manager for Tesla High Performance Computing products at Nvidia, I better understand the role of hardware, specifically the graphics processing unit, or GPU.

Nvidia has been around for 17 years and is best known as the company that makes the processors, the GPUs, for graphics cards used in gaming machines. The early chips were aimed at making computer game environments more realistic - making the streets look like streets and the characters' hair look like real hair. Though we didn't use the term at the time, that hardware relied on the same computing principles as supercomputers (more on that later).

Graphics like these are made using Nvidia chips.

In 2000, Nvidia began to develop chips that were aimed at more than just gaming; they were general purpose chips with software for easy programmability. One of the first groups of heavy users was involved in medical imaging. They wanted to take graphic image slices of the body and reconstruct them into 3D structures. They used Nvidia's programming tools to write software that took advantage of these general purpose chips, sometimes called General Purpose Graphics Processing Units (GPGPUs).

The architecture behind these general purpose chips, rolled out in 2006 and 2007, is referred to as CUDA. That name also graces the enhanced version of the C language used to program the chips: "CUDA C."

What special properties are in these chips? The GPU is a massively parallel processor. That is, it can do many operations at the same time, like a supercomputer. Gupta offers the following very effective image to distinguish parallel processing from sequential processing. Let's say you have 20 students in a classroom, each with a cup of water on their desk. You ask each one to come up to the front of the room and pour the water into a bucket - one by one. It'd take a while, right? But what if you asked them all to come up at once and pour the water in the bucket? The process would go much faster, since all the actions would be done at the same time. Further, in the computing version, there would be no pushing and shoving because the CUDA parallel programming model makes sure that all the "parallel" actions are well coordinated and run smoothly.

When a programmer uses CUDA C to write code that runs on the GPU, the aim is to have all the students pouring water at once. Or, to translate that into a familiar geospatial process, to manipulate every pixel in an image at the same time. That's a bit of an oversimplification. Most likely, one "chunk" of the image, that is, one group of pixels, would be processed on the GPU, then a second chunk, then a third. But within each chunk, all the pixels would be processed at the same time.
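To make the chunk idea concrete, here is a minimal sketch in Python rather than CUDA C. Everything in it is my own illustration, not Nvidia's code: the `brighten` operation, the `process_in_chunks` helper and the chunk size are hypothetical, and the thread pool only stands in for the GPU (Python threads will not actually run this arithmetic in parallel). The point is the shape of the work: chunks are taken one after another, and within a chunk every pixel gets the same independent operation.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten(pixel, amount=40):
    """Per-pixel operation: add brightness, clamped to the 8-bit maximum."""
    return min(255, pixel + amount)


def process_in_chunks(pixels, chunk_size, workers=4):
    """Process a flat list of pixel values one chunk at a time.

    Within a chunk no pixel depends on any other, so the work can be
    handed to a pool of workers. On a GPU, each pixel in the chunk
    would typically be given its own thread.
    """
    result = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for start in range(0, len(pixels), chunk_size):
            chunk = pixels[start:start + chunk_size]
            # pool.map returns results in the same order as the input.
            result.extend(pool.map(brighten, chunk))
    return result


image = [0, 100, 200, 250, 30, 220]   # a tiny made-up "image"
brighter = process_in_chunks(image, chunk_size=4)
print(brighter)
```

Because each pixel's result depends only on that pixel, the order in which the workers finish does not matter, which is what makes this kind of job such a natural fit for a massively parallel processor.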

Now, it turns out that the manipulation of pixels, as in the orthorectification of aerials and the pan sharpening of images, is very much like the graphics problems in gaming. Thus, these tasks are a perfect fit to be taken out of the queue of the central processing unit, the CPU, which does things largely sequentially, and sent over to the GPU, to be handled in parallel.

So, what does that mean in practice? Gupta pointed me to a great example: a slope calculation based on a digital elevation model (DEM). Basically, the GIS software is asked to find the slope at each point and make a new surface with those values. If you do this in Manifold 8 without taking advantage of a GPU installed in your machine, it takes just about a minute. When you take advantage of the GPU? Two seconds. You can watch the video if you don't believe me.
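For readers who want to see what "find the slope at each point" looks like, here is a rough sketch in plain Python. It is not Manifold's algorithm, which I have not seen; it uses simple finite differences between neighboring cells, where many GIS packages use a 3x3 window instead. The function names, the sample DEM and the cell size are all assumptions made up for illustration, and the grid must have at least two rows and two columns.

```python
import math


def slope_at(dem, row, col, cell_size):
    """Slope in degrees at one cell, from differences with its neighbors.

    Neighbors are clamped at the grid edges, so edge cells fall back to
    a one-sided difference. Requires at least 2 rows and 2 columns.
    """
    rows, cols = len(dem), len(dem[0])
    r0, r1 = max(row - 1, 0), min(row + 1, rows - 1)
    c0, c1 = max(col - 1, 0), min(col + 1, cols - 1)
    dz_dx = (dem[row][c1] - dem[row][c0]) / ((c1 - c0) * cell_size)
    dz_dy = (dem[r1][col] - dem[r0][col]) / ((r1 - r0) * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


def slope_surface(dem, cell_size):
    """Build a new grid holding the slope of every cell in the DEM."""
    return [
        [slope_at(dem, r, c, cell_size) for c in range(len(dem[0]))]
        for r in range(len(dem))
    ]


# A made-up DEM: a plane rising 10 m per 10 m cell from west to east.
dem = [
    [0.0, 10.0, 20.0],
    [0.0, 10.0, 20.0],
    [0.0, 10.0, 20.0],
]
slopes = slope_surface(dem, cell_size=10.0)
```

Notice that `slope_at` reads only the input DEM and never another cell's result. That independence is why the calculation can be spread across a GPU's many cores instead of being looped over cell by cell as it is here.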

To be clear, best practices in coding suggest that software programs look for, and then use, a GPU if it is available. The user typically does not need to "enable use" of a GPU. Manifold, by the way, coded about three dozen functions in version 8 to take advantage of the GPU.

Manifold is not the only GIS company taking advantage of Nvidia's GPUs; PCI Geomatics and DigitalGlobe (press release) do as well. For a GIS project, the difference between a day of processing and an hour is huge. For satellite data, freshest is best, so shortening the time from capture to customer makes more money. There are defense players exploring using GPUs for geospatial work, too.

While most of these examples focus on raster processing, GPUs are good at floating point operations, too, including modeling. Consider "ray tracing" - that's the term graphic designers use to describe how to make light in an image act just the way it does in "real life." If you have all the rules about how much light reflects and is absorbed on each different kind of surface, you can make very realistic images. But, in essence, you are sending out each ray from the sun (or the light source) and seeing what happens. What if you send them all out to be processed at once on a GPU? It's much faster. Once you consider that similar types of "wave modeling" are common in GIS work (working with radar, cellular antennas, sound dissipation, line of sight), it's clear there's a lot of spatial processing that can be offloaded to a GPU to speed things up.
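Line of sight is perhaps the easiest of these to sketch. The Python below is my own simplified illustration, not code from any GIS package: it checks visibility along a single elevation profile running out from an observer, and it ignores earth curvature and atmospheric refraction. The function name, the sample elevations and the observer height are assumptions.

```python
def visible_along_profile(elevations, spacing, observer_height=1.7):
    """For each sample on a profile, report whether the observer can see it.

    The observer stands at sample 0. A sample is visible when the
    gradient of the sight line to it is at least as steep as the
    sight line to every sample in between.
    """
    eye = elevations[0] + observer_height
    visible = [True]                 # the observer's own position
    max_gradient = float("-inf")     # steepest sight line seen so far
    for i in range(1, len(elevations)):
        gradient = (elevations[i] - eye) / (i * spacing)
        visible.append(gradient >= max_gradient)
        max_gradient = max(max_gradient, gradient)
    return visible


# A made-up profile sampled every 10 m: a ridge at sample 2 hides sample 3.
profile = [100.0, 100.0, 120.0, 110.0, 150.0]
result = visible_along_profile(profile, spacing=10.0, observer_height=2.0)
print(result)
```

A full viewshed repeats this along many profiles radiating out from the observer. Each profile is independent of the others, so, as with the rays in ray tracing, they are candidates for being sent out all at once on a GPU.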

If these ideas are new to you (as they were to me), you might want to see if the hardware you use for your geospatial work has a GPU. It might be from Nvidia or AMD (whose graphics products carry the ATI brand). You'll also want to take a look at your GIS software to see if it's written to take advantage of that hardware. In the case of Nvidia, there will be a statement that the software supports CUDA or the OpenCL standard. Software written for ATI hardware uses Stream Technology to take advantage of the GPU, though as of now, there does not appear to be any GIS software written for that technology.