Some Database Performance Concepts

I’m working on a new idea… it may or may not pan out… but here are some concepts for your consideration… with some thoughts on their performance implications.

First a reminder… a reality check. In my experience if you POC two databases at about the same price point…and one is 30% faster than the other, 1.3X, then 50% of the time the faster DBMS will win the business. If one DBMS is 2X faster… then it will win the business 90% of the time. The 10% where the faster product loses will be because of internal politics or, for an existing application, due to the migration costs. Note that IMO it is silly to do a POC if you know up front that you will not pick the winner.

Now to the concepts… Note that these are ballpark numbers to help you think about trade-offs…

102312_2108_HopFestP1Ex1.png

The latency to start fetching data from DRAM is 100 ns… from disk it is 10M ns. If we assume that a smart RDBMS pre-fetches 80% of the data into DRAM then we can assume that an in-memory DBMS has a 200,000X performance advantage over a disk-based system.

The latency to a Flash/SSD device is 100K-200K ns. With the same 80% pre-fetch assumption an in-memory DBMS will be 20,000X faster.

Note that neither of these models include data transfer times which will favor in-memory databases even more.

If we have a hybrid system with both disk and SSD and we assume that 90% of the reads hit the SSD and that both layers in the storage hierarchy achieve 80% pre-fetch then then the in-memory system will be 38,000X faster. If fewer than 90% of the reads hit the SSD, then the latency goes up quickly.

These numbers form the basis for selecting in-memory caches like Teradata’s Intelligent Memory option as well as in-memory offerings from IBM, Microsoft, Oracle and SAP.

For typical data warehouse workloads column compression will provide around a 2.5X performance boost over row compression. This has two implications: you will get 2.5X better performance using column storage and you will get 2.5X more data into the faster levels of your storage hierarchy… more in SSD and more in-memory.

If we assume that a typical query only touches 10% of the columns in the tables addressed… then column projection provides a 9X performance boost over a row store. Exadata does not support column projection in the storage layer… and other hybrid row-or-column systems provide it only for columnar tables.

If we assume that the average latency from the processor caches is 10ns (.5ns L1, 7ns, L2, 15ns L3) and the latency to DRAM is 100ns then an in-memory system which pre-fetches data effectively into the processor caches will be 10X faster than one which goes to DRAM. If we assume that a standard RDBMS which processes uncompressed standard data types (no vector processing) gets a 20% cache hit ratio then the advantage to a cache aware RDBMS which loads full cache lines is around 8X. HANA, BLU, and the Oracle in-memory products are cache aware and get this boost with some caveats.

BLU and the Oracle in-memory option are hybrid systems that often convert data to a row form for processing (see here for some data on Oracle). If we assume that they use the full columnar in-memory vector-based structures 50% of the time then these products will see a 4X performance boost. HANA recommends that all data be stored in a columnar form so it would often see the full 8x boost.

These vector-based processes also avoid the cost of decompression… and since they process compressed vector data they can fit more information into each cache line. There is another 20%-200% (1.2X-2X) boost here but I cannot estimate it closer than that.

Finally, the vector based processes use the high performance computing instruction sets (AVX2) offered on modern CPUs… and this provides another 10X+ boost. Again, BLU and Oracle will utilize the vector form less often than HANA so they will see a boost over products like Teradata… but not see as large a boost as HANA.

There are other features at play here… some products, like HANA, shard data in-memory to get all of the cores busy on each query. I have not been able to determine if BLU or Oracle in-memory are there yet? Note that this powerful feature will allow a single query to run as fast as possible… but the benefit is mitigated when there is a workload of multiple concurrent queries (if I have 4 cores and 4 queries running concurrently, one query per core, then the 4 queries will take only a little more time than if I run the 4 queries serially with each query using all 4 cores).

It is important to note that the Oracle In-memory option does not run in the storage component of an Exadata cluster… only on the RAC layer. It is unclear how this works with the in-memory option.

The bottom line is that in-memory systems are not all alike. You can sort of add up the multipliers to get in the ballpark on how much faster Teradata will be with the Intelligent Memory option… how much faster than that a hybrid row and vector-column system like BLU or Oracle In-Memory… and how much faster a pure in-memory system might be. One thing is for sure… these in-memory options will always make the difference in a POC… in every case including them or not will blow away the 2X rule I started with… and in every case the performance benefit will outweigh the extra cost… the price/performance is very likely to be there.

I know that is skipped my usual referencing so that you can see where I pulled these numbers from… but most of this information is buried in posts here and there on my blog… and as I stated up front… these are ballpark numbers. Hopefully you can see the sense behind them… but if you think I’m off please comment back and I’ll try to adjust…

6 thoughts on “Some Database Performance Concepts

  1. Thank you for walking us through that. As you’ve highlighted for me in the past, we need to be aware of whether the purchase decision is being based on a proof-of-concept or a benchmark. Success of a POC might be more related to things like functionality, ease-of-use, and integration into an existing environment – as well as potentially some form of performance hurdle. This discussion would seem to be more related to a benchmark where we are comparing a specific database workload – which may or may not be part of a POC. Bottom line, it is important for everyone involved to understand the specifics of the decision criteria, deliverables, and statement-of-work to “win” the engagement.

    Like

  2. Hi. Great blog! A clarification on IBM’s BLU Acceleration technology. BLU Acceleration is a pure parallel vector processing column store that is in-memory optimized. It is not a hybrid columnar-row store as implied above. It does not require that data be stored in both row and columnar organizations as other vendors do. Wasteful! Most customers use it as a pure columnar store (though it can be used in combinations with row-organized tables if desired). You may find the following links helpful:
    http://softwaretradecraft.com/when-ram-is-too-slow-how-dynamic-in-memory-processing-changes-the-game-for-data-analytics/
    http://ibmbluhub

    Like

    • Thanks, Sam. Let’s be clear though… A column store can be a hybrid row/column… Or a hybrid in-memory/disk-based… Or a combination row/columnar/in-memory/disk-based.

      It is not likely that most customers use the BLU accelerator to DB2 in a pure sense… They join rows and columns, in-memory and/or out.

      And a hybrid is not necessarily wasteful… Each architecture: row, columnar, in-memory, and out has advantages and disadvantages…

      BLU is a nice piece of work, I think. Underrated so far…

      Rob

      Like

      • Just to leave my 0.02 cents:
        1) BLU still missing table partitioning features, which many businesses rely on. It forces you into massive INSERT/DELETE operations.
        2) Missing also is the DPF feature, which could span a query beyond two or more nodes.
        3) Issue #2 may be taken seriously into account since some papers I read point out that BLU is a core-eater, with numbers > 70 cores to deal with a 100 TB data warehouse. This goes beyond most P Series systems may offer on a single hardware unit. This must be taken into account for the TCO simulation, since it pushes PVU licensing up and probably will ask for a HW upgrade.
        4) Summing up, I think it is not mature yet and its adoption may require some fine grained comparison with vertical appliances on the market due licensing and HW upgrade costs.

        Like

      • Hi Henrique,

        Your points are spot on. BLU is new… it will mature.

        Jeremy Rickard commented that BLU must shard in-memory since DB2 has always sharded across nodes… but BLU does not yet shard across nodes as I see it. But it will come.

        Readers of this blog should note that every in-memory database will be a core-eater… even the basic in-memory add-ons like the in-memory cache from Teradata will require hardware: cores and memory. These product will consume compute because they will not have to wait for I/O. Each in-memory query will be able to consume 100% of the CPU. This is a very good thing… not a bad thing. But it will require new thinking about workload management.

        I think that BLU is a fantastic product. As it matures it will compete very well against SAP HANA as a DBMS (it will not provide the Platform features HANA includes…) and it will eventually force SAP to price HANA competitively.

        Rob

        Like

Comments are closed.