In today’s digitally driven world, processing streaming data in real time is a requirement for business success. To gain a competitive advantage, organisations must enable app developers to combine the power of complex event processing (CEP) with real-time analytics on streaming data. The result? Real-time intelligent decisions, powered by machine learning, that give organisations faster insights and the tools to succeed and thrive in the real-time economy.
To deliver applications that meet ever-evolving user experience demands, businesses are moving from post-event reconciliatory processing to in-event and in-transaction decisioning on streaming data. This forces technology teams to choose between streaming technologies that each involve compromises, typically the need to stitch together multiple layers and manage resiliency separately for each of them. While each layer performs fast on its own, the overall architecture makes it difficult to meet stringent responsiveness SLAs for the business use case. As real-time use cases increasingly become the norm in verticals such as telecommunications, financial services, IoT, gaming, media, eCommerce and more, developers need to adopt new approaches. This will allow them to make the low-latency complex decisions that drive business actions without compromising the performance and scale that are critical in the modern enterprise.
The introduction of 5G networks will only increase the data volume and speed requirements that are already putting pressure on traditional data architectures. Organisations need to ingest this unprecedented increase in data traffic, while also driving actions by making intelligent, dynamic decisions across multiple data streams. Though current data streaming architectures are usually sufficient to act as processing pipelines, they do not meet the needs of mission-critical applications, which depend on low latency and responsive multi-step decisions. In addition, with a projected density of connected things of 1 million per sq. km, and a prescribed latency in single-digit milliseconds, data and processing are going to be decentralised across several edge data centres, as opposed to the traditional few central hub data centres.
A confluence of incomplete information is coming into play, and it is where traditional, and many contemporary, choices for processing streaming data are going to fail. For interactive low-latency applications and streaming pipelines to coexist, they must use the same data to drive cross-functional consistency.
The top four pieces of incomplete information are:
1. Microservices architecture mandates separation of state and logic. What’s missing is an understanding of the types of business logic and where each type should live. While the application flow control logic can stay in the application layer, thus making the compute containers truly stateless, the data-driven business logic must exist with the data.
2. Network bandwidth usage efficiency. When the state is stored in a NoSQL data store and the container instance has to move 10 to 25 kilobytes of data payload per interaction both ways (i.e. read the object from the store, modify it and send it back to the data store), the application quickly starts to consume high amounts of network bandwidth. In a virtualised or containerised world, network resources are like gold. One should not squander them on frivolous data movement.
3. Fundamental premise of stream processing. Stream processing today is based on one of two time windowing concepts: the event time window or the processing time window. This is not truly representative of reality. Organisations need continuous processing of events as they arrive, either individually or contextually. This approach avoids problems such as events being missed because they arrived late, without having to bloat the database while waiting for the last known late-arriving event.
4. Multiple streams of data get cross-polled to build complex events that drive decisions. An event-driven architecture is a stream of messages, each tied to an event driving some action. The challenge organisations face is building complex events from multiple streams of data, or having a single stream of data drive changes to multiple state machines based on complex business logic.
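The late-arrival problem in point 3 can be illustrated with a minimal sketch. Everything here is hypothetical: the event tuples, the 10-second tumbling window and the rule that a window closes as soon as an event for a later window is seen (no allowed lateness) are assumptions made for illustration, not a description of any particular streaming engine.

```python
from collections import defaultdict

# Hypothetical events in arrival order: (entity_id, event_time_seconds, amount).
# The third event arrives late: its event time (4) falls in the first window.
EVENTS = [
    ("acct-1", 1, 50),
    ("acct-1", 12, 20),
    ("acct-1", 4, 70),  # late arrival
]

WINDOW_SECONDS = 10  # assumed tumbling window length


def windowed_totals(events, window=WINDOW_SECONDS):
    """Sum amounts per tumbling event-time window, with no allowed lateness.

    A window is treated as closed once an event for a later window has been
    seen, so anything arriving afterwards for the older window is dropped.
    """
    totals = defaultdict(int)
    dropped = []
    newest_window = -1
    for entity, event_time, amount in events:
        index = event_time // window
        if index < newest_window:
            dropped.append((entity, event_time, amount))
            continue
        newest_window = max(newest_window, index)
        totals[(entity, index)] += amount
    return dict(totals), dropped


def continuous_totals(events):
    """Update per-entity state on every event as it arrives."""
    state = defaultdict(int)
    for entity, _event_time, amount in events:
        state[entity] += amount
    return dict(state)


window_result, missed = windowed_totals(EVENTS)
continuous_result = continuous_totals(EVENTS)
```

In this sketch the windowed version drops the late event and under-counts the first window, while the continuous version folds every event into the entity's running state regardless of arrival order.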
A smart streaming architecture allows one to:
- Ingest incoming event data into a state machine
- Build a contextual entity state from multiple streams of ingestion
- Apply a ruleset of business rules to drive decisions
- Enhance and enrich these rules by incorporating new learnings derived from machine learning initiatives iteratively
- Let the decisions propagate to drive actions
- Migrate the contextually completed/processed data to an archival store once it is no longer needed for real-time processing
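The steps above can be sketched in a few lines of code. This is an illustrative toy under stated assumptions, not the architecture's actual implementation: the `SmartStream` class, the stream names (`usage`, `billing`, `session_end`), the `EntityState` fields and the rule thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EntityState:
    """Contextual state for one entity, built from several input streams."""
    usage_mb: float = 0.0
    balance: float = 0.0


# Hypothetical ruleset: (decision name, predicate over the entity state).
# The thresholds are made up; in practice they could be refreshed from the
# output of a machine learning model.
RULES = [
    ("offer_top_up", lambda s: s.balance < 5.0 and s.usage_mb > 100.0),
    ("block_usage", lambda s: s.balance <= 0.0),
]


class SmartStream:
    def __init__(self, rules):
        self.rules = list(rules)
        self.live = {}     # entity_id -> EntityState still being processed
        self.archive = {}  # entity_id -> EntityState no longer needed live
        self.actions = []  # (entity_id, decision) pairs to propagate

    def ingest(self, stream, entity_id, value=0.0):
        """Fold one event into the entity state and return any decisions."""
        state = self.live.setdefault(entity_id, EntityState())
        if stream == "session_end":
            # Entity is contextually complete: move it to the archive.
            self.archive[entity_id] = self.live.pop(entity_id)
            return []
        if stream == "usage":
            state.usage_mb += value
        elif stream == "billing":
            state.balance += value
        # Apply the ruleset to the updated state, next to the data.
        decisions = [name for name, rule in self.rules if rule(state)]
        self.actions.extend((entity_id, d) for d in decisions)
        return decisions


engine = SmartStream(RULES)
engine.ingest("billing", "sub-1", 10.0)
engine.ingest("usage", "sub-1", 150.0)
decisions = engine.ingest("billing", "sub-1", -6.0)  # balance drops to 4.0
engine.ingest("session_end", "sub-1")
```

Here the third event yields the `offer_top_up` decision because the state combines usage and billing events for the same subscriber. Replacing `engine.rules` with an updated list is one simple way to model rules being enriched iteratively by new learnings.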
The Smart Stream Processing Architecture consists of one unified environment for ingestion, processing, and storage. This integrated approach with built-in intelligence performs the analysis right where the data resides. It utilises a fast in-memory relational data processing platform (IMRDPP) not only to make streaming “smart”, but also to provide linear scale, predictable low latency, strict ACID guarantees, and a much lower hardware footprint that can easily be deployed at the edge. With built-in analytical capabilities such as aggregations, filtering, sampling and correlation, along with stored procedures and embedded supervised and unsupervised machine learning, all the essentials of real-time decision-oriented stream processing are available in one integrated platform.
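A back-of-the-envelope calculation shows why performing the analysis where the data resides matters for network usage. It uses the 10 to 25 kilobyte payload mentioned earlier. The interaction rate of 100,000 per second, the 200-byte request and 100-byte response for in-place processing, and the use of 1 kilobyte = 1,000 bytes are assumptions chosen purely for illustration.

```python
def round_trip_gbps(payload_kb, interactions_per_sec):
    """Bandwidth when each interaction reads the object and writes it back."""
    bytes_per_sec = payload_kb * 1000 * 2 * interactions_per_sec
    return bytes_per_sec * 8 / 1e9


def in_place_gbps(request_bytes, response_bytes, interactions_per_sec):
    """Bandwidth when only a small request and a decision cross the network."""
    bytes_per_sec = (request_bytes + response_bytes) * interactions_per_sec
    return bytes_per_sec * 8 / 1e9


RATE = 100_000  # assumed interactions per second

low = round_trip_gbps(10, RATE)           # 10 KB payload, both ways
high = round_trip_gbps(25, RATE)          # 25 KB payload, both ways
in_place = in_place_gbps(200, 100, RATE)  # assumed request/response sizes

print(f"round trip: {low:.1f} to {high:.1f} Gbit/s")
print(f"in place:   {in_place:.2f} Gbit/s")
```

Under these assumptions the read-modify-write pattern needs between 16 and 40 Gbit/s, while sending only the event and receiving the decision needs about 0.24 Gbit/s.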
Dheeraj Remella, Chief Technologist, VoltDB
Image Credit: Supparsorn / Shutterstock