There are so many things in the Internet of Things (IoT) that might record data into your data fabric that a new approach may be required. Let’s think about this: define some terms, see how they fit into current data fabric thinking and into a more modern logical data warehouse architecture, and consider whether the IoT might push us to a different approach.
I’m not going to go overboard on terms here… But we do need to distinguish between a sensor and a processor.
To my way of thinking a sensor is a thing. It creates, but does not necessarily process, data. A sensor has some means to communicate with a processor… but if there is no significant processing on the sensor other than communications then we will suggest that there is no “processor” in a meaningful sense. Let me give you four examples:
- The first is courtesy of Ray Carnes, a chief architect at Boeing. Imagine a brake-pad in your car with 100,000 dust-sized RFID sensors randomly scattered as part of the pad. These sensors do nothing but signal on an interval that they are present. This allows a processor elsewhere to record the signals and determine how much of the brake pad has worn. If only 80,000 sensors report we can assume that 20% of the pad has worn away.
- A Nest thermostat senses movement and temperature. It uses a network connection to send the results of this sensing to the Nest mother-ship and performs little or no processing on site.
- Sensors in my Audi detect rotation of the wheels. There is a network that sends the results to a small embedded anti-lock braking processor that monitors all four wheels as well as the pressure on the brake-pedal and sends signals to all five components to allow the car to brake evenly.
- There is a sensor in the screen on the ATM I used yesterday that detects that I want to request service. This user interface communicates with a powerful general processor which then communicates with the Bank mother-ship to create and process banking transactions.
This last bullet is important… any device that takes user input is a sensor with an embedded processor. It is a “thing” just like the Nest thing. Today we tend to blur the line between sensor and processor as every thing has a powerful processor onboard. The IoT will change this assumption.
A processor, then, is a computer that performs some analysis on the data generated by one or more sensors. A processor may also store data… a sensor will not.
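To make the split concrete, here is a minimal sketch of the brake-pad example from the list above. The sensors do nothing but announce an ID; all the arithmetic lives on the processor. The function name `estimate_pad_wear` and the idea of collecting signals as a list of integer IDs are my own assumptions for illustration, not anything a real brake system exposes.

```python
def estimate_pad_wear(total_sensors: int, signals: list[int]) -> float:
    """Estimate the fraction of the pad worn away.

    total_sensors: how many sensors were embedded in the new pad (assumed known).
    signals: sensor IDs heard during one reporting interval (hypothetical format);
             a sensor may be heard more than once, so duplicates are ignored.
    """
    if total_sensors <= 0:
        raise ValueError("total_sensors must be positive")
    reporting = len(set(signals))  # count each sensor once
    if reporting > total_sensors:
        raise ValueError("more sensors reported than were embedded")
    # Sensors that have gone silent are assumed to have worn away with the pad.
    return (total_sensors - reporting) / total_sensors


# 80,000 of 100,000 sensors still reporting means about 20% of the pad is gone.
wear = estimate_pad_wear(100_000, list(range(80_000)))
print(f"Estimated wear: {wear:.0%}")
```

Note that the sensor side has no code at all here… that is the point.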
Now let’s think about how we might combine sensors and processors in an architecture. To start, let’s consider the context of the data the processor can use for analysis:
- If the processor has only the last data sensed we would say that the context is immediate and local to one sensor. The processor can see streamed data but can only operate on the last event. We would say that this sensor-processor configuration can provide a simple reflexive response. When you press the lock button in your car a sensor detects this event and signals to all four doors and the boot to lock it up.
- Another configuration might allow the local processor to store more context from a single sensor over a longer period of time… so the context is historical and local. Widen this to several sensors and the context becomes grouped: in the case of the anti-lock brakes the processor receives signals from a group of sensors and stores a very short historical context. This grouped historical context is very powerful…
- Another configuration might store the group context and then forward the event details to a bigger server that stores and analyzes a universal context of all things to look for patterns. Further, there could be a hierarchy of groups leading to a universal context.
- Finally, a server with some group context could summarize the details for that group and pass on only a summary over time up to another group server or to a universal server.
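The configurations above can be sketched in a few lines. This is only an illustration under my own assumptions: the class names, the event strings, and the use of a plain list to stand in for the upstream (group or universal) server are all hypothetical.

```python
from collections import deque
from statistics import mean


class ReflexiveProcessor:
    """Immediate, local context: reacts to the last event only and stores nothing."""

    def on_event(self, event):
        return "lock_all_doors" if event == "lock_button_pressed" else None


class HistoricalProcessor:
    """Historical, local context: keeps a short window of recent readings."""

    def __init__(self, window=3):
        self.history = deque(maxlen=window)  # oldest readings fall off automatically

    def on_event(self, reading):
        self.history.append(reading)
        return list(self.history)  # the context available for analysis


class GroupServer:
    """Group context that forwards either every event or only a summary upstream."""

    def __init__(self, upstream, summarize=False):
        self.upstream = upstream  # a list standing in for the next server up
        self.summarize = summarize
        self.events = []  # the stored group context

    def on_event(self, reading):
        self.events.append(reading)
        if not self.summarize:
            self.upstream.append(reading)  # forward full event detail

    def flush(self):
        """In summary mode, send one summary of the readings seen since the last flush."""
        if self.summarize and self.events:
            self.upstream.append({"count": len(self.events), "mean": mean(self.events)})
            self.events.clear()
```

Stacking `GroupServer` instances, each one using the next as its upstream, would give the hierarchy of groups leading to a universal context… with the `summarize` flag deciding how much data moves at each level.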
I suspect that you can see where I’m going. There is a trade-off in this picture between the advantages of pushing analytic processing close to the sensor and the associated requirement for more analytic processors, the advantages of intermediate analysis requiring more data movement but fewer analytic processors, and the advantage of a central analytic mother-ship where all data is stored and analyzed. In the next version of this thread I’ll try to tease apart the trade-offs.