CPUs and HW for HANA, BLU, Hekaton, and Oracle 12c

This short post is intended to provide a quick warning regarding in-memory columnar and CPU requirements… with a longer post to follow.

When a row is inserted or bulk-loaded into a DBMS, if there are no indexes, the amount of CPU required is very small. The majority of the time spent committing a transaction is the time to write a log record to persist the data.

When the same record is reformatted into a column, the amount of processing required is significantly higher. The data must be parsed into columns, the values must be compressed, dictionaries may be updated, and the breadcrumbs that let the columnar data be regenerated into rows must be laid. Further, if the columnar structures are to be optimized then the data must be ordered… with a sort or some kind of index structure. I have seen academic papers that suggest that for an insert, columnar processing may cost 100X more than row processing… and you can see why this could be true (I apologize for not finding the reference… I’ll dig it up… as I recall I read it in a post some time back by Daniel Abadi).
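
To make the extra work concrete, here is a minimal Python sketch of the two insert paths. It is an illustration only, not how HANA, BLU, Hekaton, or Oracle 12c implement anything: the class names `RowStore` and `ColumnStore` are hypothetical, the "compression" is plain dictionary encoding, the "breadcrumb" is simply the shared row position across the column vectors, and the ordering step (sort or index) is left out entirely.

```python
class RowStore:
    """Row-oriented insert: append the tuple as-is."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(tuple(row))
        return len(self.rows) - 1


class ColumnStore:
    """Column-oriented insert with per-column dictionary encoding (hypothetical)."""

    def __init__(self, column_names):
        self.names = list(column_names)
        self.encode = {n: {} for n in self.names}  # value -> code
        self.decode = {n: [] for n in self.names}  # code -> value
        self.codes = {n: [] for n in self.names}   # one code vector per column
        self.row_count = 0

    def insert(self, row):
        row = tuple(row)
        if len(row) != len(self.names):
            raise ValueError("row width does not match column count")
        # Parse the row into columns and dictionary-encode each value.
        for name, value in zip(self.names, row):
            dictionary = self.encode[name]
            code = dictionary.get(value)
            if code is None:
                # New value: the dictionary itself must be updated.
                code = len(dictionary)
                dictionary[value] = code
                self.decode[name].append(value)
            self.codes[name].append(code)
        # The shared position is the breadcrumb that ties the columns together.
        position = self.row_count
        self.row_count += 1
        return position

    def reconstruct(self, position):
        """Regenerate the original row from the per-column code vectors."""
        return tuple(self.decode[n][self.codes[n][position]] for n in self.names)


store = ColumnStore(["id", "city"])
store.insert((1, "Oslo"))
second = store.insert((2, "Oslo"))
print(store.codes["city"])
print(store.reconstruct(second))
```

Even in this toy version the row insert is a single append, while the columnar insert does a hash lookup, a possible dictionary update, and an append for every column… before any sorting.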

Now let’s think about this… several vendors are suggesting that you can deploy their columnar features with no changes required… no new hardware… in-place. But this does not ring true if the new columnar feature requires 100X extra CPU cycles per row… or 50X… or 10X… unless you are running your database on an empty server.
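
A back-of-the-envelope sketch shows how quickly the headroom disappears. The function name and every input below are assumptions picked for illustration, not measurements: a server at 30% CPU utilization that spends 5% of that CPU on inserts, with insert cost scaling linearly by the multiplier and everything else held constant.

```python
def projected_utilization(baseline, insert_share, multiplier):
    """Estimate CPU utilization after insert cost grows by `multiplier`.

    baseline     -- current CPU utilization as a fraction (assumed)
    insert_share -- fraction of the current CPU spent on inserts (assumed)
    multiplier   -- how many times more CPU each insert costs in columnar form
    """
    other_work = baseline * (1.0 - insert_share)
    insert_work = baseline * insert_share * multiplier
    return other_work + insert_work


for multiplier in (10, 50, 100):
    utilization = projected_utilization(0.30, 0.05, multiplier)
    print(f"{multiplier}X -> {utilization:.1%} of available CPU")
```

Under these assumed inputs the 10X case still fits, but 50X and 100X both come out above 100% of the available CPU. The sketch ignores any CPU saved on the query side, so treat it as a way to frame the question rather than an answer.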

This claim is a shot at SAP who, more honestly, suggests new hardware with high-end processors for their in-memory columnar product… but methinks it is marketing, not architecture, from these other folks.

3 thoughts on “CPUs and HW for HANA, BLU, Hekaton, and Oracle 12c”

  1. Hi Rob,

    I respectfully point out that the *majority* of CPU cost in bulk loading data is in fact converting ASCII tuples to internal data type representation.

    All that aside, I think you are correctly pointing out that Oracle’s future 12c “columnar” approach will indeed take extra CPU, because the material clearly states that the column transformation happens on the fly, as it is an in-memory-only representation. The big question is how that cost will amortize against whatever improvement there is to be had from the columnar acceleration.

    At this point Oracle has not taken this clearly generic database feature and tied it to their engineered systems. I’ve seen the keynote and Mr. Ellison suggests all ports will have this in-memory feature. Unless he changes his mind (and shouldn’t that make maintenance-paying customers upset across the board?), I’ll test this in-memory columnar transformation so I better understand it. I think it could actually work… but not as well as a true in-memory columnar database.

  2. It’s certainly my experience that production SQL Server machines often have a lot of spare CPU cycles since the rise of multicore processors.

    Whether Hekaton can put them to work effectively remains to be seen.
