GPU Accelerated Databases

February 27, 2008

I was reading about General Purpose GPU programming on http://www.gpgpu.org and I started to think about how GPUs could be leveraged in database technology. One use that came to mind immediately was geospatial and geometric data. I’m far from being an expert in that matter, but I would think that one could offload most of the calculations to a GPU. Both raster and vector maps can benefit from the use of a GPU, since it handles both bitmaps and vector data like a real champion.
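To make the "offload the calculations" idea a bit more concrete, here is a minimal sketch in plain Python of the kind of geospatial work that parallelizes well: a radius search in which every point is tested independently of the others. It runs on the CPU and only illustrates the shape of the computation; the function names and the choice of the haversine formula are my own assumptions, not something taken from a particular GPU or database library.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def points_within(points, center, radius_km):
    """Return the points lying within radius_km of center.

    Each point is tested on its own, with no dependency on any other point.
    On a GPU, each test could be handed to a separate thread (assumption:
    this loop is only a CPU stand-in for such a kernel).
    """
    clat, clon = center
    return [p for p in points
            if haversine_km(p[0], p[1], clat, clon) <= radius_km]
```

For example, `points_within([(0, 0), (0, 1), (10, 10)], (0, 0), 200)` keeps the first two points, since one degree of longitude at the equator is roughly 111 km.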

Another use that I think might work is indexing data. If you represent data using geometric patterns, it is conceivable to use a GPU to perform pattern matching very efficiently, due to the highly parallel nature of those processing units. If you combine those patterns with set theory, you could define patterns that encompass the actual data. By combining geometric patterns, those indices would be able to determine which data is to be included in queries involving a wide range of aggregations and computations.
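Here is one possible reading of that idea, sketched in plain Python. I assume each data page is summarized by a rectangle (a bounding box over two numeric columns) that encompasses its rows, that query predicates are combined with a set intersection of rectangles, and that the index is matched by testing every rectangle against the query independently. The `Box` class and the helper names are hypothetical, and the code runs on the CPU; it only shows that each overlap test is independent and could therefore be spread across GPU threads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle over two numeric columns (hypothetical index entry)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def overlaps(self, other):
        """True if the two rectangles share at least one point."""
        return (self.x_min <= other.x_max and other.x_min <= self.x_max
                and self.y_min <= other.y_max and other.y_min <= self.y_max)

    def intersect(self, other):
        """Set intersection of two rectangles, or None if it is empty."""
        if not self.overlaps(other):
            return None
        return Box(max(self.x_min, other.x_min), min(self.x_max, other.x_max),
                   max(self.y_min, other.y_min), min(self.y_max, other.y_max))


def bounding_box(rows):
    """Smallest rectangle that encompasses every (x, y) row of a page."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    return Box(min(xs), max(xs), min(ys), max(ys))


def candidate_pages(index, query):
    """Ids of the pages whose rectangle overlaps the query rectangle.

    Every overlap test is independent of the others, which is the part
    that could run in parallel.
    """
    return sorted(pid for pid, box in index.items() if box.overlaps(query))
```

With pages `{"p1": [(1, 10), (3, 20)], "p2": [(5, 50), (9, 80)], "p3": [(2, 60), (4, 90)]}` indexed through `bounding_box`, the query `Box(0, 4, 0, 100).intersect(Box(0, 10, 0, 30))` only touches page `p1`; the other two pages are skipped without reading a single row.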

I’d be curious to see whether, with the CLR integration in SQL Server, one could call DirectX libraries to offload some work to a GPU.

Feel free to comment on the post; I’m learning and thinking out loud here! I’ll probably be posting more thoughts about this in the coming posts. I’m already thinking about applications for data mining and OLAP data…

What would happen if you were to combine Boinc and Facebook? Probably the most powerful data mining application ever created.

The pure computing power available through the workload distribution of Boinc and the wealth of information present on Facebook would produce one scary application. The kind of application Big Brother would like to have… Maybe he already has! One of the most time-consuming tasks in gathering intelligence is identifying people and their relations with each other. That is one of the main features of Facebook. If you combine this with the fact that people add to this knowledge by tagging people in pictures, and with the rest of the information that is available, things get very interesting.

Boinc offers 1.06 PetaFLOP/sec, which is twice what BlueGene/L, the fastest supercomputer on the planet, delivers. The more users get involved with Boinc, the more its capacity increases. Why would the government pay to build such a large and resilient supercomputer? I would give a tax credit to people who contribute to the national computing resource, because it has value. I’m surprised there isn’t a market for that yet! Just as a nation has oil and gold reserves, computing resources are now a way to rank a nation’s power.

If I were a hacker, the first thing I would hack is Boinc! I would use it to break Facebook and then use its crunching power to process the data to find Usama Bin Laden and get the 5 million dollars! :-p