Apache Hadoop has become the go-to solution for enterprises looking to effectively store, manage and process massive volumes of both structured and unstructured data. As data continues to accumulate at staggering rates, enterprise IT can turn to cloud-based solutions to meet increasing data storage needs.
Hadoop has joined forces with the cloud, with it being available on the virtual machines of Google Compute Engine and Amazon Web Services, among others. As powerful as Hadoop already is, the synergistic union between Hadoop and cloud-based platforms has taken Hadoop to new heights in the following ways.
Hadoop's scalability has always been a benefit to enterprises with large data storage, management and analytics needs. However, the number of commodity servers at hand limits scalability in a physical environment. Adding virtual servers to the equation makes scalability elastic and unlimited. With the ability to rapidly scale resources up or down according to data demands, enterprises have all the resources they need while paying only for the resources they use.
In a cloud environment, deployment is faster, owing to the fact that hundreds and even thousands of virtual servers can be spun up in mere minutes.
Although Hadoop is already designed to run on low cost commodity servers, the costs of procuring, installing, and maintaining large numbers of servers in an appropriately-sized physical data centre can be considerable. Additional costs are incurred as physical servers are switched out for newer versions – something that needs to occur every few years to ensure top performance. Virtualised Hadoop means that the cloud vendors shoulder the costs of upgrading and maintaining servers. Thus, enterprises reap the benefits of having access to a highly distributed, full-featured analytics platform without incurring any of the costs associated with traditional infrastructure.
Partnered with a cloud-based platform, Hadoop can leverage high availability architecture to mitigate downtime and optimise user service levels. Additionally, a virtualised environment allows for the pooling and sharing of resources and the balanced distribution of workload across multiple applications - the net benefit of which is greater efficiency.
As data continues to grow at exponential rates, the challenge for enterprises is to more effectively mine mountainous volumes of information to obtain valuable insights essential for gaining competitive advantage and maximising ROI. Hadoop in the cloud is a powerful and affordable solution destined to take enterprises to new heights of performance and success.
Michele Nemschoff is vice president of corporate marketing at MapR Technologies.