
In my post here I suggested that database computing was becoming a special case of high-performance computing. This trend will bump up against the trend towards cloud computing and the bump will be noisy.
In the case of general commercial computing, customers running cloudy virtualized servers paid a 5%-20% performance penalty… but the economics still worked for the cloud side.
For high-performance database computing it is unclear how large the penalty will be. If a virtualized, cloudy database gives up performance because SIMD becomes problematic, priming the cache becomes hard, CPU stalls become more common, and there is a move from a shared-nothing architecture to SANs or SAN-like shared data devices, then the penalty may be 300%-500%, and the cloud databases will likely lose.
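To make the arithmetic concrete, here is a back-of-envelope sketch in Python. Every number in it is an assumption of mine, not a measurement: I price a cloud server-hour at 70% of bare metal, and I read an N% penalty as needing roughly (1 + N/100) times the server-hours to do the same work. The point is only that cheaper, better-utilized servers can buy back a 20% penalty… but not a 300%-500% one.

```python
# Back-of-envelope sketch; all numbers are assumptions, chosen only to show
# why a small virtualization penalty can pencil out while a huge one cannot.

def relative_cost(slowdown: float, cloud_discount: float) -> float:
    """Cloud cost relative to bare metal for the same amount of database work.

    slowdown       -- cloud server-hours needed per bare-metal server-hour
                      (assumption: a 20% penalty ~ 1.2x, a 300% penalty ~ 4x)
    cloud_discount -- assumed fraction of the bare-metal price paid per
                      cloud server-hour (utilization, provisioning, ops savings)
    """
    return slowdown * cloud_discount

CLOUD_DISCOUNT = 0.7  # assumption: a cloud server-hour costs 70% of bare metal

for label, slowdown in [("5% penalty", 1.05), ("20% penalty", 1.20),
                        ("300% penalty", 4.0), ("500% penalty", 6.0)]:
    cost = relative_cost(slowdown, CLOUD_DISCOUNT)
    verdict = "cloud wins" if cost < 1.0 else "cloud loses"
    print(f"{label:>12}: relative cost {cost:.2f}x -> {verdict}")
```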
As I noted in the series starting here, there are lots of issues around high-performance database computing in the cloud. It will be interesting to see how the database vendors manage the bump and the noise. So keep an eye out. If your database of choice starts to look cloudy… if it becomes virtualized and starts moving from a shared-nothing cluster to a SAN… then you will know which side of the bump your vendor is betting on. And if they pick the cloudy side, then you need to ask how they plan to architect the system to hold the penalty to under 20%…
I also mentioned in that series that in-memory databases have an advantage over peripheral-based databases, as they do not have to pay a penalty for de-coupling the I/O bandwidth that is part of a shared-nothing cluster. But even those vendors have to manage the fact that the database is abstracted… virtualized… away from the hardware.
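To see why that de-coupling penalty matters, here is a rough sketch with assumed, illustrative numbers… nothing below is a benchmark. Shared-nothing scan bandwidth grows with every node's local disks, a SAN-like shared device caps it at whatever the fabric delivers, and an in-memory database scans RAM and barely notices the storage layer.

```python
# Rough sketch with assumed figures: aggregate scan bandwidth under three
# arrangements of the same 16 nodes.

NODES = 16
LOCAL_DISK_BW_GBS = 2.0   # assumption: local disk bandwidth per node, GB/s
SAN_FABRIC_BW_GBS = 8.0   # assumption: total bandwidth of the shared storage fabric, GB/s
MEMORY_BW_GBS = 50.0      # assumption: memory scan bandwidth per node, GB/s

shared_nothing = NODES * LOCAL_DISK_BW_GBS   # aggregate grows with every node added
shared_storage = SAN_FABRIC_BW_GBS           # capped by the shared device, no matter the node count
in_memory = NODES * MEMORY_BW_GBS            # indifferent to how the storage layer is wired

print(f"shared-nothing, local disks: {shared_nothing:6.0f} GB/s aggregate scan bandwidth")
print(f"SAN-like shared storage:     {shared_storage:6.0f} GB/s aggregate scan bandwidth")
print(f"in-memory, per-node RAM:     {in_memory:6.0f} GB/s aggregate scan bandwidth")
```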
If I were King, I would develop a high-performance database that implemented the features of a cloud database (elasticity, easy provisioning, multi-tenancy) over bare metal. Then you might get the best of both worlds.
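To show that these are software features rather than hypervisor features, here is a toy sketch of elasticity on a shared-nothing, bare-metal cluster: add a node and move just enough partitions to rebalance. Everything in it… node names, partition counts, the rebalance routine… is hypothetical.

```python
# Toy sketch only: elasticity as a software feature of a shared-nothing,
# bare-metal database. Adding a node moves a minimal set of partitions;
# no hypervisor is involved.

from collections import defaultdict

def rebalance(assignment: dict[int, str], nodes: list[str]) -> dict[int, str]:
    """Move the minimum number of partitions so each node holds ~an equal share."""
    target = len(assignment) // len(nodes)
    by_node = defaultdict(list)
    for part, node in assignment.items():
        by_node[node].append(part)
    # Partitions beyond a node's fair share (or on retired nodes) become movable.
    movable = []
    for node, parts in by_node.items():
        keep = target if node in nodes else 0
        movable.extend(parts[keep:])
        by_node[node] = parts[:keep]
    new_assignment = {p: n for n, parts in by_node.items() for p in parts}
    # Hand the movable partitions to whichever nodes are under target.
    for node in nodes:
        while len(by_node[node]) < target and movable:
            part = movable.pop()
            by_node[node].append(part)
            new_assignment[part] = node
    # Any remainder (uneven division) goes to the least-loaded nodes.
    for part in movable:
        node = min(nodes, key=lambda n: len(by_node[n]))
        by_node[node].append(part)
        new_assignment[part] = node
    return new_assignment

# 12 partitions on 3 bare-metal nodes, then elastically add a 4th.
nodes = ["node-a", "node-b", "node-c"]
assignment = {p: nodes[p % len(nodes)] for p in range(12)}
grown = rebalance(assignment, nodes + ["node-d"])
moved = sum(1 for p in assignment if assignment[p] != grown[p])
print(f"partitions moved to absorb the new node: {moved} of {len(assignment)}")
```

In this sketch, provisioning and multi-tenancy would be handled the same way: as metadata and workload management in software, with the hardware left alone to run fast.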