A Short DW DBMS Market History: HANA, Oracle, DB2, Netezza, Teradata, & Greenplum

Here is a quick review of ten years of data warehouse database competition… and a peek ahead…

Maybe ten years ago Netezza shook up the DW DBMS market with a parallel database machine that could compete with Teradata.

About six years ago Greenplum entered the market with a commodity-based product that was competitive… and then added column store to make it a price/performance winner.

A couple of years later Oracle entered with Exadata… a product competitive enough to keep the Oracle faithful on an Oracle product… but nothing really special otherwise.

Teradata eventually added a columnar feature that matched Greenplum… and Greenplum focussed away from the data warehouse space. Netezza could not match the power of columnar and could not get there so they fell away.

At this point Teradata was more-or-less back on top… although Greenplum and the others chipped away based on price. In addition, Hadoop entered the market and ate away at Teradata’s dominance in the Big Data space. The impact of Hadoop is well documented in this blog.

Three-to-four years ago SAP introduced HANA and the whole market gasped. HANA was delivering 1000X performance using columnar formats, memory to eliminate I/O, and bare-metal techniques that effectively loaded data into the processor in full cache lines.

Unfortunately, SAP did not take advantage of their significant lead in the general database markets. They focussed on their large installed base of customers… pricing HANA in a way that generated revenue but did not allow for much growth in market share. Maybe this was smart… maybe not… I was not privy to the debate.

Now Oracle has responded with in-memory columnar capability and IBM has introduced BLU. We might argue over which implementation is best… but clearly whatever lead SAP HANA held is greatly diminished. Further, HANA pricing makes it a very tough sell outside of its implementation inside the SAP Business Suite.

Teradata has provided a memory-based cache under its columnar capabilities… but this is not at the same level of sophistication as the HANA, 12c, BLU technologies which compute directly against compressed columnar data.
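
To make “compute directly against compressed columnar data” a little more concrete, here is a toy sketch. It assumes run-length encoding as the compression scheme, which is only one of the encodings a columnar engine might use, and the function names are made up for illustration… they are not taken from HANA, 12c, or BLU.

```python
from itertools import groupby


def rle_encode(column):
    """Compress a column into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_sum(runs):
    """SUM over the column without decompressing: one multiply per run."""
    return sum(value * count for value, count in runs)


def rle_count_where(runs, predicate):
    """COUNT(*) with a filter: the predicate is tested once per run, not per row."""
    return sum(count for value, count in runs if predicate(value))


# Hypothetical column of order quantities, sorted so that runs are long.
quantities = [1] * 4 + [2] * 3 + [5] * 2
runs = rle_encode(quantities)

total = rle_sum(runs)
matching = rle_count_where(runs, lambda q: q >= 2)
```

The point of the sketch is that the work scales with the number of runs rather than the number of rows… a cache that simply holds decompressed data in memory does not get that benefit.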

Hadoop is catching up slowly and we should expect that, barring some giant advance from the commercial space, they will reach parity in the next 5 years or so (they will claim parity sooner… but if we require all of the capabilities offered to be present there is just no way to produce mature software any faster than 5 years).

Interestingly there is one player who seems to be advancing the state of the art. Greenplum has rolled out a best-in-class optimizer with Orca… and now has acquired Quickstep which may provide the state-of-the-art in bare metal columnar computing. When these come together Greenplum could once again bounce to the top of the performance, and the price/performance, stack. In addition, Greenplum has skinnied down and is running on an open source business model. They are very Hadoop-friendly.

It will be interesting to see if this open-source business model provides the revenue to drive advanced development… there is not really a “community” behind Greenplum development. It will also be interesting to see if the skinny business model will allow for the deployment of an enterprise-level sales force… but it just might. If Pivotal combines this new technology with a focus on the large EDW market… they may become a bigger player.

Note that it was sort of dumb luck that I posted about how Hadoop might impact revenues of big database players like Teradata right before Teradata posted a loss… but do not overthink this and jump to the conclusion that Teradata is dying. They are the leader in their large space. They have great technology and they more-or-less keep up with the competition. But skinnier companies can afford to charge less and Teradata, who grew up in the days of big enterprise software, will have to skinny down like Greenplum. It will be much harder for Teradata than it was for Greenplum… and both companies will struggle with profitability for a while. But it is these technology and market dynamics that give us all something to think about, blog about, and talk about over beers…

13 thoughts on “A Short DW DBMS Market History: HANA, Oracle, DB2, Netezza, Teradata, & Greenplum”

  1. Just two thoughts – First – are you making any distinction between DW performance and actual market penetration? I think you are talking market penetration, in which case the incumbents have an edge (if they know how to use it) in spite of the fact that other products have a technical edge.
    Second – In Silicon Valley right now much of the attention is on Spark. I suspect when you reference Hadoop, you mean the Hadoop ecosystem, including Spark. There are Spark pure plays – by that I mean companies that are only focused on Spark as a product or service (e.g. Databricks), but they aren’t publicly traded (although IBM has announced a commitment).
    Database and DW are becoming almost a commodity thanks to open source and cloud hosting. I’m not sure how any database vendor, including the incumbents like Teradata will be able to retain the margins they once had. I don’t think it was dumb-luck that you interpreted the market dynamics that way, just sound logic 🙂 .

    • Incumbents have an edge by definition… I’m talking about the edges that change due to features here.

      I should be clear that when I say Hadoop I mean the ecosystem. Spark is just part of it and not necessarily the most interesting part… Just one of the very interesting parts.

      Finally… I do not believe that databases are commoditized. “Commodity” means that they are all the same and price is the only differentiator. This is just not the case for databases as the little history points out. There is, and will continue to be, differentiation IMO.

  2. I’ve never subscribed to the concept of X being ‘better’ than Y because X has a certain feature that Y lacks. “Different strokes for different folks” as they say!

    More interesting, in my opinion, is the impact of the new players (Hadoop, Spark et al) on the established MPP vendors, especially Teradata. Netezza found a home at IBM, Greenplum found a home at EMC/Pivotal and SAP acquired/developed HANA in-house, which leaves Teradata out there on its own.

    “…Hadoop might impact revenues of big database players like Teradata right before Teradata posted a loss… but do not over think this and jump to the conclusion that Teradata is dying”.

    The market has priced the TDC stock at close to the 2007 IPO price. It has underperformed cash in the 8 years since the IPO, with a 33% drop in the last 6 months alone.

    As TDC tries to diversify away from the core enterprise EDW offering, it’ll be interesting to see how a proprietary software company makes a sustained profit out of re-selling someone else’s distribution of open source software.

    Interesting times indeed…

    • I agree, Paul. Teradata is a sound company with a strong and loyal install base. Hadoop has taken away their high end market… And this impacts revenues and profitability.

      But a loss has nothing to do with stock prices. A loss is a loss.

      And while I agree, and have argued for, finding the database product that solves your problem set for the least coin… I think that general architectural components like column store provide advantage for so many data warehouse workloads that you cannot pick a row store these days without missing out.

      There are some basic economics to reselling open source software. The obvious one is that R&D costs are reduced. But this is only the case if there is a community of developers doing the work. This is the case for Hadoop. Less so for Spark. Not much the case at all for Greenplum.

      The database world was pretty boring six years ago with DB2, SQL Server, Oracle, and Teradata dominating. We have moved to an interesting place in a very short period of time haven’t we?

  3. Rob,
    I also see cloud solutions upsetting traditional models e.g. AWS combined with Redshift and Azure and now PowerBI.

    Tom

  4. Exasol is by far the fastest analytics db system, see the TPC benchmarks. It easily outperforms Teradata or Netezza by a factor of 10 or more in many scenarios. It has the most attractive licensing model. For new OLAP projects I would only go for Exasol these days.
    I can also tell from my current and past project experience: Teradata is extremely expensive, and it is no longer good value for the money. There are other vendors that progressed much faster than Teradata (see e.g. Microsoft). Netezza is also expensive and a bit old school technology wise, but still offers relatively good performance similar to Teradata. SAP HANA, like you wrote, is totally overpriced and cannot even compete with the latest Netezza appliances performance-wise. DB2 even with BLU cannot compete with Netezza – queries run many times slower, especially complex queries involving many joins.

    • I appreciate your opinion here, Timm.

      I cannot find an architectural reason for your claims re: EXASOL… But the TPC benchmarks are solid for sure.

      I cannot agree with your suggestion that Netezza outperforms either BLU or Teradata… Columnar wins in my experience. In-memory wins. These are architectural advantages that Netezza cannot overcome.

      Rob

  5. Not sure I agree with your comment – “Netezza could not match the power of columnar and could not get there so they fell away”, since Netezza (now called IBM PureData System for Analytics) is still going strong.

    In fact, as you stated in your 1Q2013 blog, “The IBM acquisition has opened up a market of Blue shops to Netezza… so they are selling… and as a result Netezza is here to stay.” That statement is still proving true today, even with the later announcement from IBM of DB2 BLU, and does not reflect your later prediction in your July 2013 blog predicting “the end of Netezza”.

    • Harry,

      Netezza no longer can win much in an open competition against any of the major data warehouse database players. There is a very small niche where zone maps can win against columnar… A very small niche headed towards non-existent.

      But it could be that the worst possible platform for a data mart or an EDW is DB2 on the mainframe.

      IBM recognized both of these facts and positioned Netezza as a bolt-on to z/OS. This provides Blue shops with a better alternative and makes Netezza a player only when folks are so closed to the technology that only an IBM solution is conceivable. Even then I think that BLU will turn out to be a better solution as it will offer columnar and in-memory technology.

      So while there is a market for Netezza in closed shops…. This blog is about architecture, not marketing. I stand by my previous comments and apologize if my previous statement about Netezza living on suggested that they were competitive.

      – Rob
