An Elastic Shared-Nothing Architecture

In this post we will revisit the implications of implementing a shared-nothing architecture in the cloud. That is, we will consider how a static shared-nothing cluster might be extended into an elastic hardware environment.


This is the first of three posts inspired by a series of conversations with the folks at BitYota (bityota.com). After seeing the topics they asked if they could use the content in their marketing… so, to be transparent, this is sort of a commercial post… but as you will see there is no promotional foam in the narrative.

– Rob


There is an architectural mismatch between Cloud Computing and a shared-nothing architecture.

In the Cloud, compute (processors and memory) scales independently of storage (disk and I/O bandwidth). This independence allows for elasticity: more compute can be dynamically added with full access to data on a shared disk subsystem. Figure 1 shows this relationship and depicts the elasticity that makes the Cloud so compelling.

Figure 1. Elastic Compute

In a shared-nothing architecture, compute and storage scale together as shown in Figure 2. This tight coupling ensures that I/O bandwidth, the key to read performance, is abundant. But in the end, scalability is more about scaling I/O than about scaling compute, a consequence of the imbalance Moore’s Law injects into computer architecture: compute performance has far outstripped I/O performance over the years.

Figure 2. Shared-nothing Bundles
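
To put a rough magnitude on this imbalance, here is a back-of-envelope sketch in Python; the annual improvement rates are illustrative assumptions, not measurements of any particular hardware:

```python
# Illustrative, assumed annual improvement rates over a 20-year span;
# these are not measured figures for any particular hardware.
YEARS = 20
compute_growth = 1.40 ** YEARS   # assume ~40%/yr compute improvement
disk_bw_growth = 1.10 ** YEARS   # assume ~10%/yr disk bandwidth improvement

print(f"compute:        ~{compute_growth:,.0f}x")
print(f"disk bandwidth: ~{disk_bw_growth:,.0f}x")
# Even with generous assumptions for disk, compute pulls ahead by roughly
# two orders of magnitude over the span.
```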

To address this imbalance database engineers have worked very hard to avoid I/O. They invented indexing, partitioning, compression, and column stores, all with the desire to avoid I/O. When they could not avoid I/O they worked hard to minimize its cost by pre-fetching data into memory and, once fetched, by keeping data in memory as long as possible.
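
To get a sense of the payoff, consider a query that sums one column of a wide fact table. Here is a minimal sketch in Python, with assumed row width, column width, compression ratio, and row count (none of these are measurements):

```python
# Bytes read for: SELECT sum(amount) FROM sales
# All figures below are assumptions for illustration.
NUM_ROWS = 1_000_000_000
ROW_WIDTH_BYTES = 200        # assumed width of a full sales row
AMOUNT_WIDTH_BYTES = 8       # the one column the query actually needs
COMPRESSION_RATIO = 4        # assumed column-level compression

row_store_io = NUM_ROWS * ROW_WIDTH_BYTES
column_store_io = NUM_ROWS * AMOUNT_WIDTH_BYTES / COMPRESSION_RATIO

print(f"row store scan:    {row_store_io / 1e9:,.0f} GB")
print(f"column store scan: {column_store_io / 1e9:,.0f} GB")
# ~200 GB vs ~2 GB: a 100x reduction in I/O before indexing or
# partition elimination even comes into play.
```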

For example, one powerful and little-understood technique is called the data flow architecture. Simply put, data flow moves rows through each step of a query execution plan and out the other end without requiring intermediate I/O. The original developers of Postgres, Sybase, SQL Server, Teradata, DB2, and Oracle did not have enough memory available to flow rows through, so their engines spill data to the storage layer between each step in the plan. Figure 3 shows how classic databases spill and Figure 4 shows how a more modern data flow architecture operates.

Figure 3. Classic Query Plan
Figure 4. Data Flow Query Plan
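
To make the distinction concrete, here is a minimal sketch using Python generators; the table, predicate, and spill mechanism are hypothetical stand-ins (a temp file plays the role of the storage layer), not any vendor's implementation:

```python
import pickle
import tempfile

rows = [{"region": r % 4, "amount": r} for r in range(1_000)]

def classic(rows):
    """Classic style: each step materializes its output before the next runs."""
    # Step 1: filter, spilled to "storage" (a temp file stands in for disk).
    with tempfile.TemporaryFile() as spill:
        pickle.dump([r for r in rows if r["amount"] > 500], spill)
        spill.seek(0)
        filtered = pickle.load(spill)   # re-read the intermediate from storage
    # Step 2: aggregate the re-read intermediate.
    return sum(r["amount"] for r in filtered)

def data_flow(rows):
    """Data flow style: rows stream through every step, no intermediate I/O."""
    scanned = (r for r in rows)                           # scan
    filtered = (r for r in scanned if r["amount"] > 500)  # filter
    return sum(r["amount"] for r in filtered)             # aggregate

assert classic(rows) == data_flow(rows)
```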

Why is this relevant? In a classic RDBMS the amount of I/O bandwidth available per GB of data is static: you cannot add storage without redistributing the data. So even though your workload has peaks and valleys, your database is bottlenecked by I/O that cannot flex. In a modern RDBMS most of the work is performed in memory without intermediate I/O… and as we discussed, compute and memory can flex elastically in a Cloud.
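
A back-of-envelope sketch of that static ratio; the node count, per-node scan bandwidth, and data size below are assumptions, not vendor figures:

```python
NODES = 10
SCAN_BW_PER_NODE = 1.0    # GB/s of sequential scan per node (assumed)
DATA_TB = 50              # total data under management (assumed)

total_bw = NODES * SCAN_BW_PER_NODE   # fixed when the cluster is built
bw_per_tb = total_bw / DATA_TB        # the static ratio in question

print(f"cluster scan bandwidth: {total_bw:.0f} GB/s "
      f"({bw_per_tb:.2f} GB/s per TB of data)")
# A peak workload that wants 3x this bandwidth is out of luck: in a static
# shared-nothing cluster the only remedy is to add whole nodes and then
# redistribute every row across the new layout.
```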

Imagine an implementation as depicted in Figure 5. This architecture provides classic static shared-nothing I/O scalability to read data from disk. However, once the read is complete and a modern data flow takes over, compute and memory are managed by a scalable elastic layer. The result is an elastic shared-nothing architecture that is well suited to the cloud.

Figure 5. Flowing to a Separate Compute Node
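
To make Figure 5 concrete, here is a minimal sketch of the split; every name is hypothetical, and Python thread pools stand in for the two tiers of machines:

```python
from concurrent.futures import ThreadPoolExecutor

STORAGE_NODES = 4
COMPUTE_WORKERS = 8   # sized to the workload, independent of storage

# Each storage node owns one partition of the data (hypothetical values).
PARTITIONS = [list(range(i, 1_000, STORAGE_NODES)) for i in range(STORAGE_NODES)]

def storage_scan(partition):
    # Shared-nothing read: each storage node scans only its local partition.
    return [{"amount": v} for v in partition]

def compute_step(rows):
    # Data-flow work (filter plus partial aggregate) done entirely in memory.
    return sum(r["amount"] for r in rows if r["amount"] > 500)

with ThreadPoolExecutor(STORAGE_NODES) as storage, \
     ThreadPoolExecutor(COMPUTE_WORKERS) as compute:
    scans = storage.map(storage_scan, PARTITIONS)   # static, I/O-bound tier
    partials = compute.map(compute_step, scans)     # elastic, in-memory tier
    total = sum(partials)

print("total:", total)
```

In the more mature deployment of Figure 6, the number of compute workers could be resized with the workload, per query or per hour, while the storage tier stays put.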

In fact you can imagine how this architecture might mature over time. In early releases a deployment might look like Figure 5, where the advantage of the cloud is in devising a cost-effective, flexible configuration. As the architecture matures you could imagine a cloud deployment such as that in Figure 6, where the 1:1 connection between storage nodes and compute nodes is broken and compute can scale dynamically with the workload.

Figure 6. Elastic Compute on a Shared-nothing Architecture

Cloud changes everything, and it will significantly change database systems architecture.

It is strange to say… but the torch that fires innovation has been passed from the major database vendors to a series of small start-ups. Innovation seems to occur almost exclusively in these small firms… the only recent exception being the work done at SAP on HANA.

8 thoughts on “An Elastic Shared-Nothing Architecture”

  1. Interesting stuff and my current client is starting to think about this kind of thing for the next cycle of their DW platform.

    Do you know if Snowflake is similar to this offering from BitYota?

    1. We agree with Jon and are glad to see another company follow the architectural principles as set out above. BitYota was designed from the ground up for the Cloud; we have overcome the Cloud’s constraints and developed new algorithms for delivering high-performance analytics. BitYota’s data warehouse is available at various price/performance configurations. Happy to chat more – please contact sales@bityota.com or keep in touch via our blog.

  2. I don’t see the issue with I/O. You can always add more I/O throughput to each shared-nothing instance. The issue that has required hard-wired connections in the past is the I/O between shared-nothing instances to facilitate joins and aggregations. The question I have is: how does one scale up the interconnect throughput between shared-nothing instances in a Cloud implementation?

    1. Hi John,

      I’m suggesting that I/O is an issue only because in a classic shared-nothing configuration you have to add I/O and storage in order to add compute. If you don’t need I/O or storage (which you don’t as memory sizes increase) then this is inefficient.

      I love the question… and I do not have a good answer. Clearly a cloud provider could offer 10GigE or InfiniBand interconnects… but I don’t believe that is commonly available. Until the network infrastructure upgrades, shared-nothing in the cloud will be problematic.

      Rob

  3. oramoss,

    (Full disclosure: I work for Snowflake.) Snowflake has the same architectural insight: that the traditional DW architecture isn’t optimal for the cloud. Snowflake’s solution also separates compute from storage, but goes one step further in that it separates metadata and coordination as well, which has implications for scaling concurrency. Would love to discuss in more detail to see how you’d compare the two.

    1. Thanks Jon

      I have already connected with your colleague Thomas Molloy.

      BitYota seemed like a similar proposition to Snowflake and I was just trying to understand if that was a reasonable view, or whether I was misunderstanding and they were radically different.

      Full disclosure…I’m an Oracle guy, previously an employee, now an independent consultant. 🙂
