AWS Lake Formation makes it easier to build data lakes

(Image credit: Flickr / janneke staaks)

Amazon has announced that its data lake management tool, AWS Lake Formation, is now live. The fully managed service helps businesses build, secure and manage data lakes. Initially, it is generally available in the US East (Ohio), US East (Northern Virginia), US West (Oregon), Asia Pacific (Tokyo) and Europe (Ireland) regions.

Lake Formation was first announced late last year at Amazon’s AWS re:Invent conference in Las Vegas. Its goal is to automate a handful of the tasks involved in building a data lake, including collecting, cleaning, deduplicating and cataloguing data. It is also designed to make that data available to analytics software.
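As a rough illustration of two of those tasks, deduplicating and cataloguing, here is a minimal Python sketch that works on a small in-memory list. It does not use Lake Formation or any AWS API. The record layout and the function names `deduplicate` and `build_catalog` are assumptions invented for this example, and the service's own approach may differ.

```python
from collections import defaultdict


def deduplicate(records):
    """Drop exact duplicate records, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for record in records:
        # Assumed notion of "duplicate": identical field names and values.
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def build_catalog(records):
    """Map each field name to the sorted names of the value types seen for it."""
    types_by_field = defaultdict(set)
    for record in records:
        for field, value in record.items():
            types_by_field[field].add(type(value).__name__)
    return {field: sorted(names) for field, names in types_by_field.items()}


# Made-up sample data, not taken from any real data lake.
raw_records = [
    {"id": 1, "country": "IE"},
    {"id": 1, "country": "IE"},  # exact duplicate of the first record
    {"id": 2, "country": "JP"},
]

clean_records = deduplicate(raw_records)
catalog = build_catalog(clean_records)
```

Here the catalogue is simply a dictionary of field names and the value types observed for each, which is enough to show the idea of recording what a data set contains once duplicates have been removed.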

The service also acts as a centralised dashboard, which administrators can use to manage data access, governance and auditing. Going forward, Amazon plans to let admins analyse data within data sets using their preferred AWS analytics and machine learning services.

Support is expected to begin with Redshift, Athena, Glue, EMR, QuickSight and SageMaker.

“Our customers tell us that Amazon S3 is the ideal place to house their data lakes, which is why AWS hosts more data lakes than anyone else, with tens of thousands and growing every day. They’ve also told us that they want it to be easier and faster to set up and manage their data lakes,” said Raju Gulabani, vice president of databases, analytics, and machine learning at AWS.

“That’s why we built AWS Lake Formation, so customers can spend more time learning from their data and innovating, rather than wrestling that data into functioning data lakes. We’re excited to see how customers use it as one of the building blocks for growing and transforming their businesses and customer experiences.”

The global data lakes market is forecast to reach $12.01 billion within the next five years.