DevOps and DataOps have been the transforming force behind the big data boom but, like shoddy plumbing hidden behind a wall, a poorly designed pipeline can block progress as it struggles to deliver what it was built for.
Here’s a guide to the technologies and troubles behind the big data and analytics boom - with a focus on demystifying the topic so that the board can make the right decisions while understanding the key risks and benefits. This understanding is particularly important as many organisations turn to cloud provision to run their business programmes - often with little grasp of the use cases or the real cost of the services they will consume.
Why DevOps became so crucial
The flow of data can be perilous. Any number of problems can develop during the transport of data from one system to another. Data flows can hit bottlenecks, resulting in latency; flows can become corrupted; and datasets may conflict or duplicate unnecessarily. The more complex the environment and the more intricate the requirements, the greater the potential for these problems.
Volume also increases the potential for problems. Transporting data between systems often requires several steps. With good reason, data teams are focusing on the end-to-end performance and reliability of their data pipelines. The rapid increase in data volume and variety has driven organisations to rethink enterprise infrastructures and focus on longer-term data growth, flexibility, and cost savings to manage their big data and analytics programmes. Through this process, enterprises are discovering that current, on-premise solutions are too complicated, inflexible, and are not delivering on expected value. In short: Data is not living up to its promise.
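To make the end-to-end framing above concrete, here is a minimal sketch - with invented stage names, not any specific tool from the article - of timing each step of a multi-stage pipeline so the slowest stage (the bottleneck) is visible:

```python
import time

# Hypothetical three-stage pipeline; the stage bodies are stand-ins
# for real extract/transform/load work.
def extract(batch):
    return [r.strip() for r in batch]

def transform(batch):
    return [r.upper() for r in batch]

def load(batch):
    return len(batch)  # stand-in for a write to a target system

def run_pipeline(batch):
    """Run each stage and record its wall-clock latency, so the
    end-to-end bottleneck can be identified."""
    timings = {}
    result = batch
    for name, stage in [("extract", extract),
                        ("transform", transform),
                        ("load", load)]:
        start = time.perf_counter()
        result = stage(result)
        timings[name] = time.perf_counter() - start
    return result, timings

written, timings = run_pipeline(["  a  ", "  b  "])
print(written, sorted(timings))  # 2 ['extract', 'load', 'transform']
```

In practice an APM or pipeline-monitoring tool gathers these per-stage timings automatically, but the principle - measure every hop, not just the total - is the same.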
Managing big data on-premise is complex and requires expert technical talent to troubleshoot most problems. This expertise is costly, specialised, and the task at hand is time-consuming, complex, and painstaking. That’s why more enterprises are moving their data workloads to the cloud, but the migration process isn’t easy, as there’s little visibility into costs and configurations.
It’s clearly going to be hard for the board to understand and steer the process appropriately when it’s hard for the delivery team to fully understand what they are getting into; given complex SLAs and long migrations from on-premise data pipelines to those of cloud providers.
Data in the cloud
As an alternative to on-premise complexity and sprawl, organisations are looking to cloud services like Azure, AWS, and Google Cloud for the flexibility to accommodate modern capacity requirements and elasticity. Unfortunately, they are often challenged by unexpected costs and a lack of data and insights to ensure a successful migration. If left unaddressed, these challenges mean organisations struggle with complex projects that fail to fulfil expectations and frequently result in significant cost overruns.
The board needs a top-line view of cloud migration assessments for the business.
The underlying details - such as the source environment and the applications running on it, which workloads are suitable for the cloud, and the anticipated hourly costs - are too fine-grained for a strategic audience. To set the scene for migration success, the board needs a higher-level set of business metrics, as well as broader insights that indicate positive progress for these DataOps projects.
Setting the right mindset for data delivery teams involves setting appropriate goals - and expectations on how to deliver against them. The board should ensure there is funding and authorisation to use the latest right-sized solutions. As an example, Application Performance Management (APM) can uplift a frazzled data delivery team from under-fire and outgunned to strategically managing the complexity of live data pipelines with precision and speed.
The modern applications that deliver on the promise of data - or help a company make sense of it - run on a complex stack that is critical to that promise. That means you need software to keep these systems reliable, to troubleshoot them, and to ensure their performance. APMs with machine learning make a huge difference here, stripping out intensive human labour.
Directions from a well-connected board
Be well informed about current clusters and usage, so you can make an effective decision about what is needed from a new cloud provider, or plan resources to manage what stays on-premise.
Understand the critical application workloads that will benefit most from cloud-native capabilities, such as elastic scaling and decoupled storage - these will drive your future business success.
Ensure your delivery teams can define the optimal cloud topology that matches specific goals and your business strategy, minimising risks and costs.
Make clear that your delivery team must understand the hourly costs they expect to incur when moving to the cloud, and that they should be able to provide a clear recommendation after comparing costs across different cloud providers and services, for the different goals set by the board.
Accept the need for optimised cloud storage tiering choices for hot, warm, and cold data, and know how these choices might impact strategic business operations.
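The hourly-cost comparison the directions above call for can be sketched very simply. The rates and provider names below are invented for illustration - they are not real pricing from any cloud vendor - but the shape of the calculation (compute per vCPU-hour plus storage per GB-hour, split across hot, warm, and cold tiers) is what a delivery team would present:

```python
# Illustrative-only hourly rates (invented numbers, not real provider
# pricing). Each provider charges for compute per vCPU-hour and for
# storage per GB-hour, with different rates per tier.
RATES = {
    "provider_a": {
        "compute_per_vcpu_hr": 0.045,
        "storage_per_gb_hr": {"hot": 0.00012, "warm": 0.00006, "cold": 0.00002},
    },
    "provider_b": {
        "compute_per_vcpu_hr": 0.050,
        "storage_per_gb_hr": {"hot": 0.00010, "warm": 0.00005, "cold": 0.00003},
    },
}

def hourly_cost(provider, vcpus, storage_gb_by_tier):
    """Estimated hourly cost of a workload on a given provider:
    compute plus tiered storage."""
    r = RATES[provider]
    cost = vcpus * r["compute_per_vcpu_hr"]
    for tier, gb in storage_gb_by_tier.items():
        cost += gb * r["storage_per_gb_hr"][tier]
    return round(cost, 4)

# A hypothetical workload: 64 vCPUs, with most data in cold storage.
workload = {"hot": 500, "warm": 2000, "cold": 10000}
for provider in RATES:
    print(provider, hourly_cost(provider, vcpus=64, storage_gb_by_tier=workload))
```

Note how the tiering choice changes the answer: provider_a is cheaper per cold GB here, so pushing more data to cold storage widens its advantage - exactly the kind of trade-off the board should expect to see quantified.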
Kunal Agarwal, co-founder and CEO, Unravel Data