BLU Meanies: Data In-memory

IBM is presenting a DB2 Tech Talk that compares the BLU Accelerator to HANA. There are several mistakes and some odd thinking in the pitch so let me address the issues as a way to explain some things about HANA and about BLU. This blog will consider what data needs to be in-memory.

IBM, like several others, continues to repeat a talking point along the lines of: “We believe that you should not have to fit all of your active data in memory…”. Let’s think about this…

Note that in the current release HANA has a constraint that all of the data in a single column, the entire vector that represents the data in that column, must be in-memory before it can be operated on. If the table is partitioned and partition-elimination is applied then the data in the partition for the column must be loaded in-memory. This is a real constraint that will be removed in a subsequent release… but it is not a very severe constraint if you think about it.

But let’s be clear… HANA does not require all data to be in-memory… it will read data from peripheral devices in and out as required just as BLU does.

Now what does this mean? Let’s walk through some scenarios.

First, let’s imagine a customer with 10TB of user data, per the scenario IBM discusses. Let’s not get into a whose-product-compresses-better discussion and assume that both BLU and HANA will get 4X compression… so there is 2.5TB of compressed user data to be processed.

Now let’s imagine a system with only a very little memory available for data. In other words, let’s configure both BLU and HANA so that they are full columnar databases, but not in-memory databases. In this case BLU would operate by doing constant I/O without constraint and HANA would fail whenever it could not fit a required column in memory. Note that HANA might not fail at all… it would depend on whether there was a large single un-partitioned column that was required.
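The single-column constraint in this scenario can be sketched as a simple check. This is an illustrative model only, not HANA's actual loader: the function name, the column names, and the sizes are all hypothetical.

```python
def columns_that_cannot_load(column_sizes_gb, memory_for_data_gb):
    """Return the columns (or column partitions) whose compressed vector
    is larger than the memory available for data.

    Assumption: a column is usable only if its whole vector fits in memory.
    """
    return sorted(
        name
        for name, size_gb in column_sizes_gb.items()
        if size_gb > memory_for_data_gb
    )


# Hypothetical compressed sizes, in GB, of the vectors one query needs.
needed = {"sales.amount": 40, "sales.customer_id": 250, "sales.date": 12}

print(columns_that_cannot_load(needed, memory_for_data_gb=200))
print(columns_that_cannot_load(needed, memory_for_data_gb=400))
```

With 200GB for data only the 250GB column is a problem; partitioning that column, or adding memory, makes the list empty.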

This scenario is really silly though… HANA is an in-memory database, designed to keep data in-memory from the start… so SAP would not support this imaginary configuration. The fact that you could make BLU work out of memory is not really relevant as nowhere does IBM position, or reference, BLU as a disk-based column store add-on… you would just use DB2.

Now let’s configure a system to IBM’s specification with 400GB of memory. IBM does not really say how much of this memory is available to BLU for data… but for the sake of argument let’s ignore the system requirements and assume that BLU uses one-half, 200GB, as work space to process queries so that 200GB is available to store data in-memory. As you will see it does not really matter in this argument whether I am spot on here or not. So using IBM’s recommendation there is now a 200GB cache, enough for 8% of the 2.5TB of compressed data, that can be used as data is paged in and out. Anyone who has ever used a data warehouse knows that caching does not work well for BI queries as each query touches large enough volumes of the data to flush the cache… so BLU will effectively be performing I/O for most queries and is back to being an out-of-memory columnar database. Note that this flushing issue is why the in-memory capabilities from Oracle and Teradata pin certain tables into memory. In this scenario HANA will operate exactly as BLU does, with the constraint that any single column that exceeds 200GB in compressed form cannot be processed.
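The cache-flushing argument can be illustrated with a toy simulation. It assumes a plain LRU replacement policy and a query that scans its pages in order. Both are simplifying assumptions rather than a description of how BLU or HANA actually manage memory, and the page counts are hypothetical.

```python
from collections import OrderedDict


def lru_hit_rate(cache_pages, scan_pages, passes):
    """Hit rate of a plain LRU cache when the same sequential scan is repeated."""
    cache = OrderedDict()  # page id -> True, ordered oldest to newest
    hits = accesses = 0
    for _ in range(passes):
        for page in range(scan_pages):
            accesses += 1
            if page in cache:
                hits += 1
                cache.move_to_end(page)  # mark as most recently used
            else:
                cache[page] = True
                if len(cache) > cache_pages:
                    cache.popitem(last=False)  # evict least recently used
    return hits / accesses


# A scan that fits in the cache is served from memory after the first pass.
print(lru_hit_rate(cache_pages=200, scan_pages=150, passes=10))
# A scan larger than the cache evicts each page before it is needed again.
print(lru_hit_rate(cache_pages=200, scan_pages=300, passes=10))
```

In this model a scan only slightly larger than the cache gets no benefit from it at all, which is the reason for pinning data or for using a scan-aware replacement policy.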

Finally let’s configure a system with 5TB of memory per SAP’s recommendation for HANA. In this case BLU and HANA both fit all of the data in-memory… with 2.5TB of compressed user data in and 2.5TB of work space… and there is no I/O. This is an in-memory DBMS.
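The sizing arithmetic behind the 400GB and 5TB scenarios can be written out as a short sketch. The half-for-work-space split is the assumption made above for the sake of argument, and the function name is hypothetical.

```python
def resident_fraction(raw_tb, compression, memory_tb, workspace_share=0.5):
    """Fraction of the compressed data that fits in the memory left for data.

    Assumption: workspace_share of total memory is reserved for query work space.
    """
    compressed_tb = raw_tb / compression
    data_memory_tb = memory_tb * (1 - workspace_share)
    return min(1.0, data_memory_tb / compressed_tb)


for memory_tb in (0.4, 5.0):
    share = resident_fraction(raw_tb=10, compression=4, memory_tb=memory_tb)
    print(f"{memory_tb}TB of memory: {share:.0%} of the data is resident")
```

With 400GB only 8% of the compressed data is resident; with 5TB all of it is.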

But according to the IBM Power 770 spec (here) there is no way to get 5TB of memory on a single p770 node… so to match HANA and eliminate all I/O they would require two nodes… but BLU cannot be deployed on a cluster… so they would have to deploy on a single node and perform I/O on 20% of the data. The latency for SSD I/O is 200,000ns and for disk it is 10,000,000ns… for DRAM it is 100ns and HANA loads full cache lines so that the average latency is under 20ns… so the penalty paid by BLU is severe and it will never keep up with HANA.
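A back-of-the-envelope sketch shows how much that 20% costs. It uses the latency figures quoted above and assumes, simplistically, that 20% of accesses pay the full device latency while the rest pay DRAM latency. It ignores prefetching and sequential throughput, so treat it as an illustration rather than a benchmark.

```python
# Latencies in nanoseconds, as quoted in the text above.
LATENCY_NS = {"dram": 100, "ssd": 200_000, "disk": 10_000_000}


def blended_latency_ns(io_fraction, device):
    """Average access latency when io_fraction of accesses go to the device."""
    in_memory = (1 - io_fraction) * LATENCY_NS["dram"]
    on_device = io_fraction * LATENCY_NS[device]
    return in_memory + on_device


for device in ("ssd", "disk"):
    average = blended_latency_ns(0.2, device)
    print(f"20% of accesses on {device}: {average:,.0f}ns average")
```

Under these assumptions the SSD case averages roughly 40,000ns per access, about 2,000 times the sub-20ns figure quoted for HANA, because the slow 20% dominates the average.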

There is more bunk around recommendations for the number of cores but I can make no sense of it at all so I do not know where to begin to debunk it. SAP recommends high-end Intel servers to run HANA. In the scenario above we would recommend multiple servers… soon enough there will be Haswell servers with 6TB of DRAM and this case will run on one node.

I have stated repeatedly that anytime a vendor presents a slide comparing their product to their competitors’ you should immediately throw them out… it will always be twisted. Don’t trust them. And don’t trust me as I work for SAP. But hopefully you can see some logic in my case. If you need an IMDB then you need memory. If you are short of memory then the IMDB operates like a columnar RDBMS with a memory cache. If you are running a BI query workload then you need to pin data in the cache or the system will thrash. Because of this SAP recommends that you get enough memory to get all of the data in… we recommend that you operate our in-memory database product in-memory…

This is really the point of the post. The Five Minute Rule informs us about what data should be in-memory (see here). An in-memory database is designed from the bottom up to manage hot data in-memory. The in-memory add-ons being offered over legacy systems are very capable and should not be ignored… and as the price of memory drops the Five Minute Rule will suggest that data in-memory will account for an ever larger percentage of your EDW. But to offer an in-memory capability and recommend that you should keep the bulk of the data on disk is silly… and to state that your product has a competitive advantage because you do not recommend that all of the data managed by your in-memory feature be kept in-memory is silliness squared.
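For readers who have not seen it, the Five Minute Rule reduces to a break-even formula: a page is worth keeping in memory if it is re-referenced more often than the break-even interval. The sketch below uses the classic form of the formula; the input values are illustrative 1990s-style figures, not current prices, and the function name is hypothetical.

```python
def break_even_seconds(pages_per_mb_ram, accesses_per_sec_per_disk,
                       price_per_disk, price_per_mb_ram):
    """Re-reference interval below which caching a page in RAM is cheaper
    than re-reading it from disk."""
    technology_ratio = pages_per_mb_ram / accesses_per_sec_per_disk
    economic_ratio = price_per_disk / price_per_mb_ram
    return technology_ratio * economic_ratio


# Illustrative inputs: 8KB pages (128 per MB), 64 random I/Os per second
# per drive, a $2,000 drive, and RAM at $15 per MB.
interval = break_even_seconds(128, 64, 2000, 15)
print(f"break-even interval: {interval:.0f} seconds")
```

As the price per MB of memory falls, the economic ratio grows and the break-even interval lengthens, so colder and colder data qualifies for memory.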

10 thoughts on “BLU Meanies: Data In-memory”

  1. Hi Rob, I have mentioned in the past that I’m very approachable and if there are issues with any of the material I have presented I’m glad to have folks reach out to me directly. The presentation you refer to I think was fair and balanced (yes I have a bias and I’m always clear about who I work for). I called out what is the same and what’s different between BLU and HANA. http://www.idug-db2.com/?commid=84925

    Now for your comments, I was very clear in the presentation to call out that only the “active” data in HANA has to fit in memory. It’s not however, to my understanding, just one column that is being worked on as you outline here…it’s all the columns that are currently active in all currently running queries. Correct me if that’s not true.

    With respect to keeping the data in memory you say “This is a real constraint that will be removed in a subsequent release”….but then you go on to say why databases that don’t have this constraint are not, in your opinion, real in-memory databases. Does that mean that in your opinion HANA will no longer be an in-memory database after they remove this “real constraint”?

    I think we have a difference of opinion on what in-memory means. In the DB2 space we mean that we are focused on in-memory analytics (not just keeping data in memory). By that I mean that we take special care to load the L1/L2 and L3 caches with data and pack the registers full of multiple values so that we can work on an entire vector of data at once. You clearly can’t fit a full column into a register or even into L3 cache so you stage it back in main memory. But in HANA it has to fit in main memory while in DB2 we will prefetch portions of the column into main memory while we simultaneously prefetch data from RAM into the cache in advance of it being needed in the cores’ registers. This keeps the CPU from stalling even when the data can’t fit in memory (if possible). By the way your assumptions about how scans flush the cache are incorrect about BLU. We have some patented techniques on scan friendly caching for columnar data. A new video about it will be on this site soon if you want to see how we avoid flooding main memory when a scan occurs http://bit.ly/1a5jvJt

    As for the sizing “bunk”, I provided the reference on the slide to the page on the SAP website that shows how to size HANA along with sizing cores which is in the OSS Note on that same referenced site. And I wasn’t specifically comparing to a 770 (there are models that hold more RAM by the way but I’m sure you know that). And I don’t want to diverge here too much but you mention BLU is not clustered for BW and yet you don’t mention that HANA is not cluster capable for ECC (but I digress 🙂

    If others listen to the webcast, I clearly said many times that it is in clients’ best interest to run them both side by side to see which solution is actually best for their given workloads. If in some cases all data fits in memory on both sides, that would be great…if not then you have the choice to buy a bigger machine on one side or choose the price/performance point on the curve that fits your budget on the other side. And I think as you point out HANA will get there when they remove the constraint of having to have all “active” data fit in memory.

    • First, my apologies, Chris. I was not aware that you developed the material or I would have contacted you first.

      In HANA the column vectors required for every running query must be in-memory… but not the entire table. As I understand it this constraint will be lifted in an upcoming release.

      Once lifted I would expect HANA will be able to operate as an out-of-memory column store and data will stream into memory as required. I am not trying to quibble over a static label as to whether BLU or HANA is or is-not an IMDB. But if you implement BLU or HANA with the majority of the data on peripheral devices then it would not be all that accurate to say that the implementation, not the product, was acting as an IMDB.

      HANA uses all of the same cache management techniques you describe for BLU. But the performance improvements from eliminating I/O are several orders of magnitude greater than the improvements from effective use of cache… so we recommend keeping all of the data in-memory to achieve these gains. It is a recommendation that makes sense and to suggest that BLU has an advantage based on recommending more I/O does not seem sensible. It is, after all, a recommendation not an architectural advantage.

      I’ll look forward to seeing the vid, Chris. My assumption is that if memory is filled with data from table X and a query is started that needs data from table Y then there will be pressure on memory and I/O will result. If you have found a way to put 10 lbs. of data in a 5 lb. bag I will be very impressed.

      The bottom line I think is as I suggested… if you have a 10TB problem and deploy 400GB of main memory then you have deployed a columnar DB with a large cache. This is not inherently a bad thing. If you deploy enough memory to eliminate I/O and run in-memory the performance benefits will outweigh the extra price… these are the economics that are causing both IBM and SAP to build in-memory capabilities.

      I promise to send you a draft of any future posts on BLU… that way you can correct mistakes and suggest tweaks to the messaging. But I do not see any mistakes in my post… so I’ll stand by the content even if I owe you a bit of an apology for the tone.

    • “By that I mean that we take special care to load the L1/L2 and L3 caches with data and pack the registers full of multiple values so that we can work on an entire vector of data at once.”

      Sounds like a lot of marketing to me. Do you have any technical docs explaining these claims?

      • Hiya,

        The concept of representing columns as vectors and using super-computing instructions (vector processing and SIMD) to perform relational algebra is not a dark technological secret. Everyone is doing it: SAP in HANA, Microsoft in xVelocity, IBM in BLU, and Oracle with their to-be-released in-memory option. Here are some details in a public paper: http://www.vldb.org/pvldb/2/vldb09-327.pdf So even I, a professional doubter, do not doubt these claims by IBM and others…

        Rob

      • Rob we should create an organization for professional doubters…I know a lot of folks that would love to join 🙂

        Konalndy, Rob is right, there is lots of research literature on this stuff that can make most folks’ heads spin with the level of detail (especially those IEEE and Sigmod papers). Here is one on Blink, which was research started in 2007 on this topic and was the predecessor to DB2 BLU, from an IEEE conference http://sites.computer.org/debull/A12mar/blink.pdf
        There are many others…just search for “IBM Blink” in google and you will find lots of technical papers on the topic (and the references to prior work at the end of any of these technical papers are always a good place to look, as they even reference works from other vendors).

        Professional Doubters Unite !

      • Sorry, I guess I was referring more to the notion of loading “data” into L1/L2/L3 caches – a claim which I have also seen SAP make regarding HANA – a claim which I believe no more with HANA than I do with DB2. Unless I see some technical meat behind it, anyway.

        I would love to be proven wrong here, but I’ve been put over by too many marketing/sales guys to even bother to keep up pretenses any more. No offence intended, you understand. 🙂

        Professional Doubter – I may have to borrow that title. 😉

      • I’ve been trying to dig further back, Chris… I think that both the IBM and the SAP papers are based on some previous papers.

        I guess I’ll just leave this and hope that the readers get that all of the major players: SAP, IBM, Microsoft, and Oracle will have cache-aware SIMD/Vector columnar capabilities in the next year. This is what I called L3 columnar support in my “Who is how columnar?” posts. The players who do not seem on track here are Greenplum, Teradata, and Netezza. These L3 capabilities should provide a 10X-30X performance improvement over those without the functionality: .75X due to reduced CPU stalls, 15X due to cache utilization, 8X from SIMD, 10X from AVX.

        Bottom line: it is not a deep dark secret how to do this…

  2. No need to apologize Rob, just want you to know I’m happy to chat about the technology any time.

    “If you have found a way to put 10 lbs. of data in a 5 lb. bag I will be very impressed.” I think I may have a good analogy here but first some of the technology background. As you are well aware, columnar changes a lot of things. One is that some columns (and therefore some data pages) are more important than others. That is, columns that are part of predicates (local or join doesn’t matter) are usually more important than those columns used in the query but just for things like select lists. That is because if it’s my join predicate or WHERE clause predicate there is a high propensity that it will also be in the query of someone else. Now normal LRU algorithms may catch this because everyone is hitting them but we take it further in that for full column scans we don’t flush the pool with this column if it doesn’t fit and we don’t do what row based buffer algorithms do which is recycle those pages quickly.

    So in the analogy it’s much like when someone goes hiking. I may have 10 lbs of “stuff” but I usually keep my map in my hands (because I’m a bad hiker and don’t know where I’m going 🙂 And I may keep other key items on my belt like a water bottle…things I’m using often but don’t fit in my hand. And then at the top of my bag is stuff I need to get to quickly like my raincoat (don’t ask about my hiking experience) and at the bottom of the bag is the stuff I won’t use often. And in the worst case I leave some stuff in my car and don’t even bother to bring it with me (which like HANA is columns not currently active). So by having the “stuff” that I use often in memory and the stuff I refer to but may not be as critical if I cycle through it because I’m only looking for a few of them I can be efficient with my use of memory.

    Or said another way, I really only want 5lbs of stuff in the bag and I may carry 3 pounds in my hands for very fast access and leave 2 pounds back at camp…saves me from having to buy a more expensive 10lb bag and carry it all, all the time.
