Q&A: The data warehouse

What is a data warehouse, and why should you care? 

The basic idea behind a data warehouse is to provide a single place to store all of your data so you can actually analyse all of it. No matter what size your business is or how much data you have, a data warehouse should consolidate data so that it can be loaded, transformed and turned into actionable insights that drive business decisions.

Businesses collect data about everything: customer behaviour, business transactions, website and mobile application activity, and social media. That data holds valuable insights that must be uncovered to make data-driven business decisions. However, most companies struggle to extract value from their data. According to Forrester, approximately three-quarters of companies aspire to be data-driven, but only 29 per cent turn their data into action. That raises the question: what’s so hard about accessing and analysing your own data?

The biggest challenges to becoming a truly data-driven organisation are the complicated analytics tools that organisations must depend on, and the reality that their data sits in silos that are difficult to combine in a single location. Unfortunately, many organisations rely on data solutions designed before the cloud, which lack the scalability and performance required for the job. Many big data tools also require specific skill sets that are in scarce supply.

How has the cloud redefined the face of data warehousing?

These days, having a data warehouse isn’t enough. Anything built before the cloud pales in comparison to what’s possible. Legacy, on-premises systems are too complex, too limiting and too expensive to maintain. And “cloud-washed” versions of these solutions suffer the same fate.

Fundamentally, a data warehouse for the cloud must be built from the ground up. Only then can it offer the flexibility, scalability and efficiency needed to quickly and easily get all the insight from all your data.

What are the opportunities provided through the cloud data warehouse?

Data warehouses built for the cloud deliver a single source of truth without limits. Ideally, they provide a complete SQL database so there’s built-in support for the tools that business users already work with. IT groups are able to focus on strategic initiatives because there’s no infrastructure to tweak and no tuning required. In addition, a cloud data warehouse gives you immediate access to effectively unlimited resources, whenever you need them, to scale data, processing and concurrency.

It’s pretty transformative when you recognise what that means: any and all users can access data at the same time without performance degradation. You can even load data and perform dev/test operations at the same time without impacting the query performance of your business users. This idea of an effectively unlimited data warehouse can reshape a business, and it is only possible because of the cloud.

Organisations also benefit from one of the best features the cloud offers: you only pay for what you use. With transparent and automatic scaling, data warehouses built for the cloud are fully elastic and cost-effective, so you can scale compute and storage up and down independently of each other, based purely on need.

What are some of the key aspects/criteria when selecting a modern data warehouse?

If you want a modern solution, the only option is a data warehouse that is built specifically for the cloud. There are five criteria to consider when selecting a cloud data warehouse:

1. Full relational database with SQL support. SQL is ideally suited for data analysis. The industry has decades of experience with SQL, millions of users are SQL-trained, and SQL provides compatibility with existing BI and ETL solutions.

2. Zero maintenance. A modern data warehouse does not have complex knobs that require tweaking for best performance. It just works. This allows database analysts to focus on data modelling, reporting and business processes.

3. Support for all of your data. Today’s business systems create vast amounts of structured, machine-generated and semi-structured data, often in formats such as JSON. There is huge value in that semi-structured data, and modern cloud data warehouses should fully support it. You should be able to cost-effectively store all of your data in a cloud data warehouse, regardless of size.

4. All of your users. Data warehouses built for the cloud can support an effectively unlimited number of concurrent users. Data loading should not impact queries, so you can load data in near real time.

5. Pay only for what you use. A data warehouse that is built for the cloud works the way the cloud should: fully elastic. You only pay for the storage and compute that you use, not a fixed monthly fee based on a cluster size.
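As a rough illustration of the fifth criterion, the difference between usage-based pricing and a fixed cluster fee can be worked through in a few lines of Python. This is only a minimal sketch: the rates, the function names and the workload are all hypothetical, chosen to show the arithmetic and not taken from any vendor’s price list.

```python
# Hypothetical rates, chosen only for illustration; not any vendor's pricing.
COMPUTE_RATE_PER_HOUR = 2.00        # cost of one compute unit for one hour
STORAGE_RATE_PER_TB_MONTH = 25.00   # cost of storing one terabyte for a month
HOURS_PER_MONTH = 730               # approximate hours in a calendar month


def usage_based_cost(compute_hours, storage_tb):
    """Monthly bill when compute and storage are metered separately."""
    compute = compute_hours * COMPUTE_RATE_PER_HOUR
    storage = storage_tb * STORAGE_RATE_PER_TB_MONTH
    return compute + storage


def fixed_cluster_cost(compute_units, storage_tb):
    """Monthly bill for a cluster that is paid for every hour, busy or idle."""
    always_on_hours = compute_units * HOURS_PER_MONTH
    compute = always_on_hours * COMPUTE_RATE_PER_HOUR
    storage = storage_tb * STORAGE_RATE_PER_TB_MONTH
    return compute + storage


if __name__ == "__main__":
    # Assumed workload: 4 compute units busy 8 hours a day for
    # 22 working days, with 10 TB of data stored all month.
    busy_hours = 4 * 8 * 22
    print(f"usage-based:   {usage_based_cost(busy_hours, 10):.2f}")
    print(f"fixed cluster: {fixed_cluster_cost(4, 10):.2f}")
```

Under these assumed numbers the storage charge is identical in both cases; the gap comes entirely from compute hours that the fixed cluster bills while sitting idle.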

The bottom line is that a data warehouse built for the cloud can leverage your existing skill sets, deliver huge business value and lower your costs. It removes the challenges of working with data and allows you to focus on your business.

Data is increasingly seen as a valuable commodity. What could this mean for a business with large volumes of data?

Lots of data means lots of informed decisions. Businesses are faced with questions every day that, without data, are often answered using a combination of experience, opinion and gut reaction. When you have large volumes of data that can be queried and used in the right way, business decisions can be based on data-driven insight. This includes traditional business intelligence as well as predictive and prescriptive analytics. The whole focus of business conversations changes for the better.

Businesses that adopt data warehouses built for the cloud have a competitive advantage because they can be more nimble and responsive in serving their customers. Adoption also fosters streamlined operations, setting the stage for businesses to begin leading their industries.

Security is always a concern when data is stored off-premises. Are there any risks in a cloud warehouse?

Well-designed cloud systems are secure by design. In fact, they have proven to be more secure than on-premises storage. Think about it. What’s more secure? An organisation with hundreds of applications (many designed before security was a big concern) or a cloud vendor that specialises in protecting the data of thousands of customers?

A cloud data warehouse should have all-encompassing security measures, including end-to-end encryption for data in transit and at rest, controlled by customer-managed encryption keys. It should have multi-factor authentication and role-based access control with granular privileges on all objects and actions. And lastly, all of a technology vendor’s security measures and processes should be independently verified against compliance standards such as HIPAA and PCI.

What changes do you expect to see in the industry over the next five years?

A new class of SaaS applications will be developed that use data warehousing built for the cloud. Because businesses will be able to access and use all of their historical data, they will start building solutions that are truly informed by that data. Applications will become smarter and more closely aligned with consumer needs. On top of this, it becomes possible to reinvent a whole variety of systems, such as ERP and billing systems, that never had the benefit of historical data.

It’s an exciting time for data warehousing built for the cloud. But it’s an even more exciting time for businesses, which can finally use their data the way they’ve always wanted to.

Bob Muglia, CEO, Snowflake Computing
Image Credit: Flickr / janneke staaks