Every successful company collects data – it’s what creates real value for a business once it is transformed into information that can be used to drive decisions. The value of this data is dependent on the software used to process it. This is what can truly elevate the data a business collects.
As the digital transformation of companies across all sectors continues, software is becoming even more important. More enterprises are beginning to ask whether their software and analytics teams can use data rapidly, whether business intelligence can be derived from it, and whether they can deliver more value to customers as a result.
For example, even a tractor manufacturer has a huge software organisation, because it needs to analyse data coming from millions of machines, find patterns, deliver proactive service, create a competitive edge and gain market share. Processing data efficiently has risen up the agenda of businesses, even those where software development isn’t a focus. As a ‘one size fits all’ solution doesn’t exist, companies are increasingly practising software development in-house as a competitive advantage.
Let me paint the picture. Until recently, software development and data management were seen as IT issues. Today, they are business-critical tasks, mainly due to the digitisation of most companies. If software and data are so critical, the right software and data management processes need to be in place across the organisation so that the company remains competitive.
An increasing number of companies are beginning to realise that the development and integration of new applications has to become more efficient. It’s not just the technology side that matters, though. A cultural shift is required too, moving away from the “as we have always done it” mind-set. A practical approach that drives exactly this rethinking is DevOps.
Challenges with conventional practices
So, what exactly is DevOps? DevOps forges a co-operation between the development team (Dev) and IT operations (Ops). The DevOps approach strengthens communication, collaboration, integration and automation.
Let’s go back to the example of the tractor manufacturer mentioned earlier. Assume that all the data coming in from millions of machines is stored in a 50 TB database. Typically, multiple developers and analysts need access to that data, and they may even need to modify it for what-if analysis. However, modifications made by one user may not be desirable for everyone else.
So the developers need separate copies of the 50 TB dataset. However, creating 10 to 15 physical copies of a 50 TB dataset is very time-consuming and needs a lot of storage, which makes it very costly. This is the first friction point between developers, who want instant access to data, and operations, who simply can’t provide so many copies of large datasets so quickly. As a result, developers and analysts get sequential access to the data, which slows down release cycles for the next feature and for business analytics.
In some organisations, the Dev and Ops teams reach a compromise in which Ops subsets the data, say from 50 TB to 1 TB, and provides multiple copies of the 1 TB dataset to save time and storage. The challenge with this approach is that developers don’t find all the defects early in the development cycle; some surface only very late in the release process. This leaves the team with two options: ship the feature with defects or delay the release, both of which are undesirable.
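To put rough numbers on this friction point, the arithmetic can be sketched in a few lines of Python. The 50 TB size and the 10 to 15 copies come from the example; the copy throughput of 1 GB/s is purely an assumed figure for illustration, not a measurement.

```python
# Back-of-the-envelope cost of making full physical copies of a dataset.
DATASET_TB = 50             # dataset size taken from the example
ASSUMED_COPY_GB_PER_S = 1   # hypothetical sustained copy throughput


def physical_copy_cost(copies, dataset_tb=DATASET_TB, gb_per_s=ASSUMED_COPY_GB_PER_S):
    """Return (total storage in TB, hours needed to make the copies one after another)."""
    total_tb = copies * dataset_tb
    seconds = total_tb * 1000 / gb_per_s   # 1 TB = 1000 GB (decimal units)
    return total_tb, seconds / 3600


for n in (10, 15):
    tb, hours = physical_copy_cost(n)
    print(f"{n} copies: {tb} TB of storage, about {hours:.0f} hours of copying")
```

Under that assumed throughput, 15 copies come to 750 TB of storage and roughly 208 hours of copying, which is why Ops tends to hand out access sequentially instead.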
Most organisations want to control which developer or analyst should get access to copies of production data, and also want to mask sensitive data. There is no easy way to automate all of this with conventional tools and manual processes. These challenges are amplified when the culture of DevOps does not exist in the organisation.
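The two requirements are simple to state in code; the difficulty lies in applying them consistently to every copy that is handed out. The sketch below is only an illustration: the allow-list, the user names and the field names are all hypothetical, and the unsalted hash stands in for whatever masking policy an organisation actually uses.

```python
import hashlib

ALLOWED_USERS = {"dev_alice", "analyst_bob"}        # hypothetical allow-list
SENSITIVE_FIELDS = {"owner_name", "owner_email"}    # hypothetical field names


def mask_value(value):
    """Replace a string with a short, stable token.

    An unsalted hash is used only for illustration; it is not a
    recommendation for real masking of production data.
    """
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def masked_copy_for(user, records):
    """Return a masked copy of the records, or refuse if the user is not allowed."""
    if user not in ALLOWED_USERS:
        raise PermissionError(f"{user} may not receive production data")
    return [
        {key: mask_value(val) if key in SENSITIVE_FIELDS else val
         for key, val in record.items()}
        for record in records
    ]


sample = [{"machine_id": 7, "owner_name": "Jo Farmer", "engine_hours": 1520}]
print(masked_copy_for("dev_alice", sample))
```

The original records are left untouched; each authorised user receives a new list in which only the sensitive fields have been replaced.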
A contemporary approach is required
What is needed is a contemporary data management solution designed to modernise processes and increase efficiency: one that speeds up the provisioning of data and reduces the Dev team’s dependence on the Ops team. This solution is Copy Data Virtualisation.
Copy Data Virtualisation is based on a single "golden" physical master copy of production data, from which an unlimited number of virtual copies can be provisioned immediately. In the example above, 20 virtual copies of the 50 TB dataset can be provisioned within a few minutes in a self-service manner. Because they are virtual copies, they need no extra storage up front. The virtual copies are rewritable, so extra storage is needed only for the incremental writes made to each copy.
These virtual data copies are then available for a variety of use cases: analytics, what-if analysis, unit testing, integration testing, QA testing, UAT, production support testing and so on. It’s important to note that provisioning multiple virtual copies parallelises testing and hence helps reduce release cycles and data analytics time. Data virtualisation technology is also useful for further use cases such as backup and disaster recovery. A single platform that manages copies of data for all these use cases simplifies overall data management for enterprises, whether in a private, public or hybrid cloud.
DevOps teams can easily integrate data virtualisation into their existing mix of DevOps tools. Consider this end-to-end flow in a Continuous Integration process. When software developers write code, they need to do unit testing. They can provision virtual data for testing using APIs. Once they have tested their latest software against the virtual data, they check in the code. While the new build is being created, an orchestration tool like Ansible, Chef or Puppet can provision compute for integration testing. A tool like Jenkins can invoke data virtualisation APIs to provision virtual datasets on those test machines. Jenkins can then deploy the latest build on those test machines. Finally, the QA automation test framework can be invoked to run thousands of test cases in parallel on multiple test machines against virtual copies of production data.
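The storage behaviour described here can be illustrated with a toy copy-on-write model. This is a minimal sketch under simplifying assumptions: the class names and the block layout are invented for illustration and do not describe how any particular product is implemented.

```python
class GoldenCopy:
    """Read-only master copy, modelled as a mapping of block id -> data."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def read(self, block_id):
        return self._blocks[block_id]


class VirtualCopy:
    """Writable view of a golden copy that stores only its own changes."""

    def __init__(self, golden):
        self._golden = golden
        self._delta = {}   # blocks overwritten in this copy only

    def read(self, block_id):
        # Prefer this copy's own version; otherwise fall through to the master.
        if block_id in self._delta:
            return self._delta[block_id]
        return self._golden.read(block_id)

    def write(self, block_id, data):
        self._delta[block_id] = data   # the golden copy is never modified

    def extra_blocks(self):
        return len(self._delta)


golden = GoldenCopy({0: "engine hours", 1: "fuel use", 2: "gps trace"})
copies = [VirtualCopy(golden) for _ in range(20)]   # no data is duplicated here

copies[0].write(1, "fuel use (what-if: +10%)")
print(copies[0].read(1))                        # modified block, visible only in copy 0
print(copies[1].read(1))                        # other copies still see the master data
print(sum(c.extra_blocks() for c in copies))    # extra blocks stored across all copies
```

Creating the 20 copies stores nothing new; only the single block written to the first copy consumes additional space, and the master remains unchanged.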
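The integration-testing part of that flow might be orchestrated along the following lines. Every function in this sketch is a hypothetical stand-in that returns placeholder values; none of them is a real Jenkins, Ansible, Chef, Puppet or data virtualisation API, and a real pipeline would replace each one with a call to the corresponding tool.

```python
from concurrent.futures import ThreadPoolExecutor

# All functions below are hypothetical placeholders for calls to real tools.


def provision_compute(count):
    """Stand-in for the orchestration tool that creates test machines."""
    return [f"test-machine-{i}" for i in range(count)]


def provision_virtual_dataset(machine):
    """Stand-in for a data virtualisation API call that mounts a virtual copy."""
    return f"virtual-copy-on-{machine}"


def deploy_build(machine, build_id):
    """Stand-in for the CI server deploying the latest build."""
    return f"{build_id}@{machine}"


def run_test_suite(deployment, dataset):
    """Stand-in for the QA framework; a real one would execute test cases here."""
    return {"deployment": deployment, "dataset": dataset, "passed": True}


def integration_stage(build_id, machine_count=4):
    machines = provision_compute(machine_count)
    datasets = [provision_virtual_dataset(m) for m in machines]
    deployments = [deploy_build(m, build_id) for m in machines]
    # Run the suites in parallel, one per machine and virtual copy.
    with ThreadPoolExecutor(max_workers=machine_count) as pool:
        return list(pool.map(run_test_suite, deployments, datasets))


results = integration_stage("build-123")
print(len(results), all(r["passed"] for r in results))
```

Because each test machine gets its own virtual copy, the suites do not have to wait for one another or share a single writable dataset.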
Copy Data Virtualisation makes it easy for DevOps to provision copies of production data into their existing DevOps tools ecosystem by invoking just a few APIs.
Data and software = key
According to the DevOps model, integrating application development with IT operations requires rethinking. Silos and separate processes are removed and application development is accelerated. Sleek data management, thanks to Copy Data Virtualisation, fits perfectly into this concept, as information silos are eliminated in favour of simple and rapid data access.
By adopting Copy Data Virtualisation, organisations can realise the full potential of DevOps: business-driven rapid application development combined with high quality and fast, smooth business processes. The key element here is intelligently managed virtual data. This will be even more crucial in the future, because data and software are becoming increasingly important as the underlying infrastructure becomes commoditised.
DevOps is becoming a key part of IT planning for organisations globally, and if it hasn’t been employed yet, it will certainly be considered. What's clear is that when combined with Copy Data Virtualisation, it can be an incredibly powerful step towards changing the world of business when gaining a competitive advantage is critical.
Ash Ashutosh, CEO, Actifio