Data lakes are becoming an essential big data storage tool as enterprises amass mountains of data from M2M connections, social networks, and remote workforces.
When demystifying data lakes, it is important to understand the hurdles, separate the hype from the reality and consider the opportunities, especially when data lakes are combined with the agility, capacity and flexibility of cloud computing.
As discussed here, learning how best to mine a growing lake of data to drive business innovation and growth is a priority that every enterprise must address sooner rather than later.
What is a Data Lake?
A data lake is a large repository that holds data in native form until queried. The implementation of a data lake can facilitate real-time business intelligence systems, improve customer experiences, and accelerate growth and business performance.
According to Gartner, “By 2020, information will be used to reinvent, digitalise or eliminate 80 per cent of business processes and products from a decade earlier.” As opposed to legacy storage solutions where data lives in silos, data lakes allow raw, unvarnished bytes of information to live in one place where they can be integrated and analysed for patterns.
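To make "native form until queried" concrete, here is a minimal sketch in Python. It is an illustration, not how any particular data lake product works. The sample records, the `RAW_LAKE` list and the `query` helper are all hypothetical. Records are kept exactly as they arrived, and structure is applied only when a question is asked.

```python
import json

# Hypothetical raw events, stored exactly as they arrived (native form).
RAW_LAKE = [
    '{"source": "m2m", "device": "pump-7", "temp_c": 81.5}',
    '{"source": "social", "user": "ana", "text": "love the new app"}',
    '{"source": "m2m", "device": "pump-9", "temp_c": 64.0}',
]


def query(lake, predicate):
    """Parse each raw record only at query time and keep those that match."""
    results = []
    for line in lake:
        record = json.loads(line)  # structure is applied on read, not on write
        if predicate(record):
            results.append(record)
    return results


# Example query: machine readings above an (assumed) temperature threshold.
hot_devices = query(
    RAW_LAKE,
    lambda r: r.get("source") == "m2m" and r.get("temp_c", 0) > 70,
)
print([r["device"] for r in hot_devices])  # prints ['pump-7']
```

Because nothing is filtered or reshaped on the way in, the same store can later answer a question about the social records without being reloaded. The trade-off is that every query pays the cost of interpreting the raw data.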
Data Lakes Can Become Data Graveyards
Some of the main impediments associated with enterprise data strategies are failing to collect data, using data in silos, and collecting massive amounts of data without the technology and expertise to use it effectively.
The result: a data graveyard. Gartner also warns IT leaders that a data lake can easily become a data graveyard if it is not managed correctly.
Data presents a great opportunity for enterprises to get the right answers by making the right queries at the outset. Thus, it is essential that an enterprise’s data strategy is sophisticated, multi-faceted and built on a secure foundation that can positively impact business performance.
Simply put, data has the power to transform business results. As outlined in Verizon’s “State of the Market: Internet of Things 2015,” data helps enterprises:
- Control and React by allowing for greater automation, remote control, analysis and reporting.
- Transform and Explore by supporting entirely new business models, products and services, and data economies.
- Connect and Monitor by connecting small amounts of data to enable manual monitoring as part of a single organisational process.
- Predict and Adapt by using complex predictive analysis for pre-emptive action to remain agile.
Data Lakes and Cloud Computing
The most successful data lake strategy will include cloud computing due to its unrestricted, flexible, and cost-effective storage and compute capabilities.
Cloud-based solutions can easily house massive amounts of raw data, and they can also be scaled down when enterprises produce less data, such as immediately after the holiday rush for retailers. They also nimbly deliver bursts of computing power to crunch data when, and only when, it is needed. Cloud-based data lakes provide:
- Increased Agility: The cloud supports complex predictive analysis and is adept at spinning resources up and down to run queries.
- Built-in Security: Robust security should be built into a big data strategy from the start of the initiative to keep infrastructure and customers’ information safe.
- Massive Scalability: As data increases in size and complexity, the infrastructure must be able to keep pace, including flexible networking that can securely move data when it needs to be moved.
It is clear that big data can yield big results. Enterprises can use data lakes to derive useful information to understand the world in which they operate, and apply that knowledge to gain a competitive advantage.
However, without the right data storage, analysis and execution strategy, data is really just information. It needs to be put into action before it becomes business intelligence that can positively impact the company.
Gavan Egan, Managing Director Cloud, Verizon
Image source: Shutterstock/Bruce Rolff