MPP, IMDB and Moore’s Law

In the post here I listed the units of parallelism (UoP) applied by various products on a single node. Those findings are summarized in the table below.

Product

Version/HW

Cores per Node

UoP per Node

Notes

Teradata EDW 6700H

16

32

Uses hyper-threads.
Greenplum DCA UAP Edition

16

8

Recommends 1 Segment for each 2 cores. Maybe some multi-threading per query so it could be greater than 8 on the average… and could be 16 with hyper-threads… but not more than 32 for sure.
Exadata X3

12

12-24

Maybe only 12… cannot find if they use hyper-threads.
Netezza Striper

16

16

May use hyper-threads but limited by 16 FPGAs.
HANA Any Xeon E7-4800

40

80

Uses hyper-threads.

A UoP is defined as the maximum number of  instructions that can execute in parallel on a single node for a single query. Note that in the comments there was a lively debate where some readers wanted to count threads or processes or slices that were “active” but in a wait state. Since any program can start threads that wait I do not count these as UoP (later we might devise a new measure named units of waiting that would gauge the inefficiency in any given design by measuring the amount of waiting around required to keep the CPUs fed… maybe the measure would be valuable in measuring the inefficiency of the queue at your doctor’s office or at any government agency).

On some CPUs vendors such as Intel allow two threads to execute instructions in-parallel in a core. This is called hyper-threading and, if implemented, it allows for two UoP on a single core. Rather than constantly qualify the statements for the rest of this blog when I refer to cores I mean to imply hyper-threads.

The lively comments in the blog included some discussion of the sort of techniques used by vendors to try and keep the cores in the CPU on each node fed. It is these techniques that lead to more active I/O streams than cores and more threads than cores.

For several years now Intel and the other CPU manufacturers have been building ever more cores into their products. This has allowed them to continue the trend known as Moore’s Law. Multi-core is now a fact of life and even phones, tablets, and personal computers have multi-core chips.

But if you look at the table  you can see that the database products above, even the newly announced products from Teradata and Netezza, are using CPUs with relatively few cores. The high-end Intel processors have 40 cores and the databases, with the exception of HANA, use Intel products with at most 16 cores. Further, Intel will deliver Ivy Bridge processors to the market this year with 120 cores. These vendors know this… yet they have chosen to deliver appliances with the previous generation CPUs. You might ask why?

I believe that there is an architectural reason for this (also a marketing reason covered here).

It is very hard to keep 80 cores fed with data when you have to perform block I/O. It will be nearly impossible to keep the 240 cores coming with Ivy Bridge fed. One solution is to deploy more nodes in a shared-nothing configuration with fewer cores per node… but this will be expensive requiring more power, floorspace, administration, etc. This is the solution taken by most of the vendors above. Another solution is to solve the problem without I/O with an in-memory database (IMDB) architecture. This is the solution taken by SAP with HANA.

Intel, IBM, and the rest will continue to build out using the multi-core approach for the foreseeable future. IMDB products will be able to fully utilize this product. Other products will struggle to take full advantage as we can see already… they will adapt and adjust and do what they can… but ultimately IMDB will win, I think… because there is just no other way to keep up as Moore’s Law continues to drive technology… no other way to feed the CPU engines with data fast enough.

If I am right then you will see more IMDB offerings from more vendors, including from the major vendors in the near future (note that this does not include the announcements of “database in memory” from Oracle which is not by any measure an in-memory database).

This is the underlying reason why Donald Feinberg (and Timo Elliott) are right on here. Every organization will be running in-memory… and soon.

5 thoughts on “MPP, IMDB and Moore’s Law

    • I read Curt Monash’s blog, Sangram. He is not very specific about what is behind this guess. But I would look at the comments here (http://wp.me/p1a7GL-ft) where Dan Graham… the Teradata General Manager in charge of HW says “Teradata won’t need 120 AMPs/node anytime soon.” in response to my prompting about Ivy Bridge.

      Like

  1. It’s hard to tell if it’s that “traditional” DB don’t scale up (use all the cores that can be put in a single SMP box) or it’s the opposite case and the IMDB don’t play nicely with scale out.
    Optimizing inter-node communication is a big deal for multinode systems and the difference in performance when accessing non-local data plays a big role in query planning strategies.

    Like

    • Marco,

      I do not see thattThere is no architectural link between in-memory and shared-nothing. HANA, for example, offers both techniques.

      My point is that if you optimally use all of the power within a single node, including the use of MPP techniques across the cores within that node then you can deploy fewer nodes with more cores. The current architectures promoted by Teradata, Greenplum, Netezza, and Exadata scale out first using nodes instead of using cores and this is a problem.

      Further, to your point, scale out requires optimal inter-node communication. I would add that the further you scale out the more important this becomes. With Ivy-bridge HANA will deploy 120 cores per node. Teradata, the next closest player will require 4x more nodes to deploy that many cores. This is significant.

      Thanks for the comment…

      Rob

      Like

      • Hi Rob,
        while I worked at HP several years ago boxes with 64 and 128 cores were not uncommon in the Superdome range and were often used to run traditional databases (Oracle usually) so scaling up using a lot of cores is a technical challenge that was overcome a number of years ago.
        Dealing with a high number of cores is relatively new challenge only in the x86 world: IBM power-based system and Sun Sparc-based system offered over 100 cores per node a number of years ago.

        Sorry for the very concise statement I made about internode communication.
        What I wanted to covey is that the greater the performance discrepancy between the different storage used (local memory and remote memory in the case of HANA) the greater the challenge in dealing efficiently with the need for non local data.
        While some data models do lend intrinsically to techniques that limit or nullify the performance impact when scaling out (star and snowflake schemas where you can duplicate the dimensions in all the nodes avoiding the need of data movement at runtime as long as you don’t join large non co-located fact tables) other data models and workloads don’t play nicely unless a lot of smartness is built into the plan optimizer to limit as much as possible the data movement.

        The real big deal with large SMP boxes is the cost per core: in a 8-socket system is way higher compared to the 2-sockets configurations both because of the more complex mainboard layout and the higher cost of the CPU with more cores per socket and supporting chipsets.

        Oracle X3-8 today offers 80-cores nodes at the database level as per the datasheet: http://www.oracle.com/us/products/database/exadata-db-machine-x3-8-1851252.pdf
        The X3-8 platform in my understanding is usually targeted at workloads that don’t play too nicely with a lot of inter-node communication.

        Having support for multi-node configurations and getting good value from the infrastructure for a wide set of workloads are not always the same thing.

        Like

Comments are closed.