
Microsoft plans to heavily support Apache Spark

Microsoft is preparing to increase its commitment to the open-source Apache Spark big-data processing engine this week at the Spark Summit in San Francisco.

At the summit, Microsoft officials will offer further insight into how the company supports Spark across HDInsight, Cortana Intelligence Suite, Power BI and Microsoft R Server.

Later this summer, R Server for HDInsight will leave public preview and become generally available. It will include Spark integration for both the cloud and on-premises versions of HDInsight.

Last April, Microsoft acquired Revolution Analytics for its commercial distribution of R, the open-source programming language used for statistical computing and predictive analytics. Now Microsoft is able to put R to use in a number of its products. The company had previously announced that the commercial R distribution would be integrated into SQL Server 2016 under the name SQL Server R Services. On June 1, Microsoft made SQL Server 2016 generally available.

Also in June, Microsoft will release an on-premises version of R Server for Hadoop that supports Microsoft R as well as Spark's native execution frameworks. In a blog post on its site, Microsoft said: "Combining R Server with Spark gives users the ability to run R functions over thousands of Spark nodes letting you train your models on data 1000x larger and 100x faster than was possible with open source R and nearly 2x faster than Spark's own MLLib."

Power BI will also gain new support for Spark Streaming scenarios. Even though Spark is often seen as a rival to Hadoop, Microsoft has decided to position the two as complementary in a number of cases.

Microsoft Research has also undertaken a new project called Prajna/OneNet, which aims to build a distributed functional-programming platform for developers who want to build cloud services that use big-data analytics in ways that Spark already supports.

We will likely hear more details about how Microsoft will support Apache Spark during Spark Summit 2016, which runs June 6-8.

Photo credit: StockStudio / Shutterstock

Anthony Spadafora
After living and working in South Korea for seven years, Anthony now resides in Houston, Texas, where he writes about a variety of technology topics for ITProPortal.