I was recently surprised to hear from a prospect that Teradata’s memory management was considered a differentiator over Greenplum. I was not surprised because the information was bad… Teradata probably does have better memory management… but they have better memory management because they use memory less efficiently than Greenplum. Let me explain…
First let’s be clear… the utilization of memory is not measured by the amount of memory required… but by the amount required times the length of time it is held. Think about it… if you have a query that requires 16MB of memory and holds it for 1 second… and another query that requires 4MB of memory and holds it for 4 seconds… the effective memory utilization is the same: 16 MB-seconds in both cases.
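To make the arithmetic concrete, here is a minimal sketch of that measure. The function name `memory_seconds` is mine, purely for illustration… it is not a metric that either product reports.

```python
def memory_seconds(memory_mb, held_seconds):
    """Memory-time product: megabytes held multiplied by seconds held."""
    return memory_mb * held_seconds


# The two queries from the example above.
query_a = memory_seconds(16, 1)  # 16MB held for 1 second
query_b = memory_seconds(4, 4)   # 4MB held for 4 seconds

# Both queries tie up the same amount of memory over time.
print(query_a == query_b)
```

The query with the smaller footprint only comes out ahead if it also releases its memory quickly.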
Greenplum uses pipelining to flow data from step to step in the query plan. Teradata writes the results from each step in the query plan to disk, to a spool file. This architectural difference allows Greenplum to complete any single query in a small fraction of the time that Teradata requires… as the costs Teradata pays to write results after each step and to read those results into the subsequent step add up. The result is that Teradata uses a smaller memory footprint for each query… but holds the memory significantly longer… resulting in relatively poor memory utilization.
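The difference between the two styles can be sketched with a toy three-step plan. This is only an illustration under my own assumptions… the step names (`scan`, `keep_even`, `double_amount`) and the `spool` helper are hypothetical and do not model the internals of either product. The pipelined version hands each row straight to the next step; the spooled version writes the full output of every step to a temporary file and reads it back before the next step starts.

```python
import json
import tempfile


def scan(n):
    """Step 1: produce rows one at a time."""
    for i in range(n):
        yield {"id": i, "amount": i * 10}


def keep_even(rows):
    """Step 2: filter rows as they arrive."""
    return (r for r in rows if r["id"] % 2 == 0)


def double_amount(rows):
    """Step 3: transform rows as they arrive."""
    return ({**r, "amount": r["amount"] * 2} for r in rows)


def spool(rows):
    """Write a step's entire output to a temp file, then read it back."""
    with tempfile.TemporaryFile(mode="w+") as f:
        for r in rows:
            f.write(json.dumps(r) + "\n")
        f.seek(0)
        for line in f:
            yield json.loads(line)


# Pipelined: rows flow from step to step with no intermediate files.
pipelined = list(double_amount(keep_even(scan(6))))

# Spooled: every step's output makes a round trip through a file.
spooled = list(spool(double_amount(spool(keep_even(spool(scan(6)))))))

print(pipelined == spooled)
```

Both plans return the same rows… the spooled plan simply pays for a write and a read after every step, and on a real system that extra elapsed time is time the query spends holding its memory.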
Note that this is not really bad design by Teradata… it is just old design. Once upon a time the servers Teradata ran on had only a little memory… as little as 32MB… so they had to spool data to disk to make it all fit. Greenplum is designed for modern servers with 100X more memory… and we use that memory effectively to get queries in and out as fast as possible.
By the way, as a side effect Greenplum does not require management of spool space… so this sysadmin task is eliminated.
So… Teradata does tightly manage memory… but this is not an advantage… they manage it tightly because they have to.