Multi-cloud strategies are growing in popularity. IBM reported that 98 percent of companies planned to use multiple hybrid clouds by 2021. When it comes to data warehousing, sometimes two (or more) clouds are better than one, and many organizations agree. Multi-cloud could mean a mix of public and private cloud infrastructures. It could also mean using different cloud data warehouse (CDW) providers, such as Amazon Redshift and Snowflake. It might mean hosting operational data stores in AWS, but transferring that data to Azure and performing analytics on it there. Or the same CDW might run on different underlying cloud platforms, such as two Snowflake instances, one on Google Cloud Platform and a second on Microsoft Azure. These different CDWs could even be hosted in different regions. Sometimes multi-cloud is all of these at once.
It might still be early days for multi-cloud strategies, especially for enterprises still coming to grips with moving on-premises workloads to the cloud. Companies will likely show growing resistance to services that lock them into a single cloud vendor. Expect a lot more movement in this area as more companies demand that clouds work better together.
Why an enterprise would employ a multi-cloud infrastructure
There are several reasons that a company would go multi-cloud, including cost savings, the adoption of different grassroots technology in different departments, increased use of “lakehouses” (data lakes that include cloud data warehouse technologies), and preferred cloud partners. These are common use cases that lead enterprises to design and implement multi-cloud infrastructures:
- Technology consolidation: As new cloud data warehouses spin up on different platforms, companies have more choices. A preferred platform often brings its own warehouse preferences, so companies may spin up new environments and use them alongside existing CDWs, or run multi-cloud for a period of time to foster a smooth transition from one cloud environment to another.
- Data and disaster recovery: Organizations are taking advantage of multiple cloud platforms, data lakes and cloud data warehouses to back up their data for peace of mind. Having a separate system with a copy of data is great protection against cloud outages, disasters, or any other unexpected downtime.
- Regional requirements: Cloud providers offer a number of regional data centers that can be leveraged to meet regional compliance and sovereignty requirements when it comes to business data. There are also benefits in choosing a cloud provider based on regional strength and ability to minimize latency.
- Varying teams and data needs: Some companies will choose to invest in different platforms because teams' affinities for underlying technologies vary. This allows users to take advantage of a service only available on a particular platform, for example using SageMaker in AWS but Snowflake on Azure, or Google ML with Snowflake on GCP. By equipping each division with technology that it is comfortable with, experienced in, and that supports its needs, companies can gain efficiencies.
- Diversification and avoiding vendor lock-in: Organizations may want to avoid vendor lock-in. For example, with platform diversification, organizations have a greater degree of flexibility in case pricing, storage or compute offerings change.
Challenges of a multi-cloud environment
Multi-cloud infrastructure, like any technology strategy, comes with significant benefits, but also has its risks and challenges. These include:
- Data silos: By its nature, a multi-cloud design creates data silos, because data is stored in different warehouses across different platforms in different locations. While these data silos are unintentional, they can become massive blockers to creating a single source of truth. As individuals apply their own business rules, inconsistencies arise in how solutions are applied, so outputs can differ. This prevents organizations from gaining the knowledge necessary to make data-driven decisions that deliver a competitive advantage.
- Data portability: Data silos are hard to break down because organizations can’t readily move data that is in different formats and resides in different technologies. Current portability solutions are expensive to obtain and maintain, and the resulting lack of portability can be a risk to a multi-cloud strategy.
- Data security: Data silos and lack of portability endure because moving data from one platform to another – or from one region to another – can also pose a data security risk without proper governance and security controls. Companies need a way to make the most of multi-cloud offerings within an optimal structure that also allows for the secure global movement of data.
How to solve these challenges
Rest assured, there are ways to safeguard against these risks. Different multi-cloud strategies present different options and opportunities for data accessibility, portability, and security. One solution is ‘cross-cloud’ data sharing. This method uses a unified data management layer: the same type of cloud data warehouse running on various cloud platforms. For example, Snowflake customers can launch the Snowflake CDW on Amazon Web Services, Google Cloud Platform, and Microsoft Azure.
The main benefit for enterprises is choice: being able to take advantage of the platform whose features best match a use case. For example, Google BigQuery's on-demand pricing charges for the data each query reads, whereas Snowflake bills for the time its compute warehouses run. So if teams repeatedly run large reads over the same data for aggregation, Snowflake may work out as the more cost-effective option in that scenario.
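The read-versus-compute trade-off described above can be made concrete with some simple arithmetic. The following is a minimal sketch, not a pricing calculator: the `Workload` fields, the function names, and both rates are hypothetical placeholders chosen for illustration, and real vendor prices vary by region, edition, and contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """A recurring aggregation workload (all figures are illustrative)."""
    tb_scanned_per_query: float     # data read by each query, in TB
    queries_per_month: int          # how often the aggregation is re-run
    compute_hours_per_month: float  # warehouse run time needed for the same work


def scan_priced_cost(w: Workload, price_per_tb: float) -> float:
    """Monthly cost when the platform charges for data read per query."""
    return w.tb_scanned_per_query * w.queries_per_month * price_per_tb


def time_priced_cost(w: Workload, price_per_hour: float) -> float:
    """Monthly cost when the platform charges for compute run time."""
    return w.compute_hours_per_month * price_per_hour


def cheaper_model(w: Workload, price_per_tb: float, price_per_hour: float) -> str:
    """Return which pricing model is cheaper for this workload."""
    scan = scan_priced_cost(w, price_per_tb)
    timed = time_priced_cost(w, price_per_hour)
    return "scan-priced" if scan < timed else "time-priced"


# Hypothetical rates: placeholders only, NOT real vendor prices.
PRICE_PER_TB = 5.0
PRICE_PER_HOUR = 8.0

# A team re-running a large aggregation many times a month.
heavy_reads = Workload(tb_scanned_per_query=2.0,
                       queries_per_month=300,
                       compute_hours_per_month=50.0)

if __name__ == "__main__":
    print("scan-priced:", scan_priced_cost(heavy_reads, PRICE_PER_TB))
    print("time-priced:", time_priced_cost(heavy_reads, PRICE_PER_HOUR))
    print("cheaper:", cheaper_model(heavy_reads, PRICE_PER_TB, PRICE_PER_HOUR))
```

Under these assumed numbers, repeated large reads make the scan-priced model the more expensive one. With a small number of queries, the comparison can flip, which is why the choice depends on the use case.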
How to work with multi-cloud environments
A multi-cloud approach provides all the advantages of cloud without many of the pitfalls. There is danger in being limited to a single cloud vendor and its ecosystem, particularly for companies that want to lead by innovation, because the pace of technical improvement keeps accelerating across all the major cloud vendors. Maintaining the flexibility to work on the best cloud platform to solve a particular business problem or process gives companies a competitive edge.
Here are some ways that enterprises can control multi-cloud environments:
- Implement tooling to track usage across clouds for budgeting and resource allocation, and to track latency so that architectural pain points needing remediation can be identified.
- Choose tools that are built specifically for - and also enhance - the major cloud platforms to ensure you have the right strategies aligned to the right platform.
- Understand which tools work best in which cloud environment, and find solutions that are purpose-built for cloud data warehouses to maximize ROI. Different cloud providers do certain things better.
- Select a solution that extends the cloud object store to multiple clouds, as a default multi-cloud deployment tier, to allow for the greatest degree of flexibility.
- Evaluate independent software vendors’ offerings that extend the capabilities and scope of what is available in a native cloud service provider offering.
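As an illustration of the first point in the list above (tracking usage and latency across clouds), here is a minimal sketch of the kind of roll-up such tooling might produce. Everything in it is an assumption made for the example: the `UsageRecord` shape, the sample figures, and the latency threshold are hypothetical, and a real setup would pull these numbers from each provider's billing and monitoring exports.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class UsageRecord:
    """One usage sample (hypothetical shape, for illustration only)."""
    cloud: str         # e.g. "aws", "azure", "gcp"
    team: str          # team or department that incurred the usage
    cost_usd: float    # spend attributed to this sample
    latency_ms: float  # observed query or transfer latency


def summarize(records, latency_threshold_ms):
    """Roll usage up per cloud and flag clouds whose average latency is high."""
    by_cloud = defaultdict(list)
    for record in records:
        by_cloud[record.cloud].append(record)

    summary = {}
    for cloud, cloud_records in by_cloud.items():
        avg_latency = mean(r.latency_ms for r in cloud_records)
        summary[cloud] = {
            "total_cost_usd": sum(r.cost_usd for r in cloud_records),
            "avg_latency_ms": avg_latency,
            # Candidate architectural pain point if latency exceeds the threshold.
            "needs_review": avg_latency > latency_threshold_ms,
        }
    return summary


# Made-up sample data, not real measurements.
SAMPLE_RECORDS = [
    UsageRecord("aws", "analytics", 120.0, 40.0),
    UsageRecord("aws", "ml", 80.0, 60.0),
    UsageRecord("azure", "analytics", 200.0, 250.0),
]

if __name__ == "__main__":
    for cloud, stats in summarize(SAMPLE_RECORDS, latency_threshold_ms=100.0).items():
        print(cloud, stats)
```

Keeping cost and latency in one summary means budgeting and remediation decisions can be made from the same view, whichever cloud the workload runs on.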
While a multi-cloud strategy may seem overwhelming and challenging to manage at first, it ultimately offers the best possible scenario for business continuity. Having the right tools in place ensures manageability while empowering an enterprise to lead in innovation - and realize the best possible return on investment.
Ed Thompson, CTO and co-founder, Matillion