Team of rivals: Hortonworks, Pivotal join up for Hadoop project

Apache Ambari integrates third-party components for Hadoop installation, provisioning, and monitoring

When it comes to the Hadoop data platform, Hortonworks and Pivotal could scarcely have more dissimilar approaches. The former prides itself on being a non-proprietary, pure open source product; the latter touts its utility and power as an enterprise data system.

But the two can agree on one point: the value of the underlying open source projects that make up Hadoop. To that end, Hortonworks and Pivotal are planning to collaborate on Apache Ambari, the open source project that handles provisioning, monitoring, and management within Hadoop.

Ambari gets little press compared to some of Hadoop's other components, such as Apache Hive or Apache Spark. But with orchestration, devops, and management looming ever larger in the minds of IT managers and enterprise data wranglers, a project like Ambari will likely become pivotal (please pardon the pun).

When I spoke to Shaun Connolly, vice president of corporate strategy at Hortonworks, he provided some context for why his company and Pivotal are collaborating. "Ambari has been building momentum for more than 18 months now," he explained, "to the point where the folks at Pivotal, who have made their own operations tools, have been taking a good hard look at it and wondering if they can adopt it and invest in it."

Out of that came a mutual decision to collaborate, with the two companies developing a joint road map for how to contribute to Ambari.

Much of what motivated the choice of Ambari, rather than other Hadoop projects, was what Connolly described as the shape of the broader enterprise Hadoop stack. "It's not just the data management and data access pillars like HDFS and YARN, and access engines like Hive, but also pillars like security and governance," he said. "If you look first and foremost at the area of ops, though, a lot of investment needs to be made there to really hit the mark for mainstream enterprise deployment. A lot of the bang for the buck from Pivotal was to choose that as a first area for collaboration."

Connolly didn't believe, however, that an increased focus on Ambari would mean a de-emphasis for Pivotal on its proprietary enterprise-grade offerings. "We're looking for clean areas where we can both invest in enterprise Hadoop and jointly market to mutual customers to deploy at scale," he said. When it came to which project seemed to afford the best value for both companies, "Ambari fit the bill."

What makes Ambari special, especially as of its most recent releases, is how the management framework it provides for Hadoop can be extended and expanded, said Connolly. Ambari "enabled a pluggable infrastructure where you can define external components in what Ambari calls stacks," he explained. "It not only deploys the standard enterprise Hadoop services, but integrates any third-party component into the installation, provisioning, and monitoring process."
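Connolly's description of pluggable stacks can be illustrated with a minimal sketch of the control script a custom service supplies inside a stack definition. The service name and actions below are hypothetical; in a real Ambari stack this file would live under the service's `package/scripts/` directory and subclass Ambari's `resource_management.Script`, for which a stand-in is defined here so the sketch runs on its own:

```python
class Script:
    """Stand-in for resource_management.Script, which is only available
    inside an Ambari agent environment."""
    pass


class MyThirdPartyService(Script):
    """Lifecycle hooks Ambari invokes when it installs, provisions,
    and monitors a service defined in a custom stack."""

    def install(self, env):
        # In a real script: install packages, lay down config files.
        return "installed"

    def start(self, env):
        return "started"

    def stop(self, env):
        return "stopped"

    def status(self, env):
        # Ambari polls this hook to monitor the service.
        return "running"


if __name__ == "__main__":
    svc = MyThirdPartyService()
    print(svc.install(env=None))
```

Alongside the script, a stack service would declare its name, version, and components in a `metainfo.xml` descriptor; the point of the sketch is simply that a third-party component plugs into the same install/start/stop/status lifecycle as the standard Hadoop services.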

This means it could serve as a connector back to products like Chef, Puppet, or Salt. "We have a lot of customers who use Ambari's REST APIs along with Chef, Puppet, or other devops tools to provision with them." For example, Ambari's blueprints feature allows the layout of a Hadoop cluster to be defined, exported, and reused. "You can lay down your machines using Chef or Puppet, and make five or six REST API calls into Ambari to provision the Hadoop infrastructure with the right order of services. It fits pretty cleanly."
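The handful of REST calls Connolly mentions can be sketched as the request payloads a devops tool would send. The Ambari host, cluster layout, and stack version below are hypothetical placeholders; the endpoint paths (register a blueprint, then create a cluster from it) follow Ambari's blueprints API:

```python
AMBARI = "http://ambari.example.com:8080/api/v1"  # hypothetical Ambari server


def blueprint_requests(cluster, blueprint, hosts):
    """Build the two REST calls that provision a cluster from a blueprint.

    (An existing cluster's layout can first be exported for reuse with
    GET /api/v1/clusters/<name>?format=blueprint.)
    """
    # Step 1: register the blueprint, which names the stack and maps
    # Hadoop components onto logical host groups.
    register = {
        "method": "POST",
        "url": f"{AMBARI}/blueprints/{blueprint}",
        "body": {
            "Blueprints": {"stack_name": "HDP", "stack_version": "2.2"},
            "host_groups": [
                {
                    "name": "master",
                    "components": [
                        {"name": "NAMENODE"},
                        {"name": "RESOURCEMANAGER"},
                    ],
                }
            ],
        },
    }
    # Step 2: create the cluster, binding real machines (e.g. ones
    # already laid down by Chef or Puppet) to the host groups.
    create = {
        "method": "POST",
        "url": f"{AMBARI}/clusters/{cluster}",
        "body": {
            "blueprint": blueprint,
            "host_groups": [
                {"name": "master", "hosts": [{"fqdn": h} for h in hosts]}
            ],
        },
    }
    return [register, create]


calls = blueprint_requests("demo", "hdp-small", ["node1.example.com"])
```

Ambari then installs and starts the services in the right order itself, which is why an external tool only needs a few calls rather than orchestrating each Hadoop daemon directly.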

It's all but impossible to speak of orchestration or management these days without also mentioning Docker, so I asked if one of the possible future directions for their contributions to Ambari would involve that software-containerization system. Connolly agreed that Docker "will be relevant in not only Hadoop's space but the broader PaaS space. From a runtime perspective, it could be the substrate that helps glue the different perspectives together, whether in a data-centric view or an app runtime view. Things like operational deployment and security also belong in that domain.

"Stay tuned on the Docker front," he added. "There's a lot more interesting stuff to be done there."

This story, "Team of rivals: Hortonworks, Pivotal join up for Hadoop project" was originally published by InfoWorld.