Cloudera and MongoDB: 'We're better together'

Near-term agenda includes better integration between the two big data systems

Two of the biggest players in big data -- Cloudera, a name-brand Hadoop vendor, and MongoDB, creator of the leading NoSQL database product -- have decided it's better to work together.

A press release bills the deal as "the strategic partnership that will transform how organizations approach big data." The exact nature of the collaboration is still somewhat under wraps, but both outfits have spoken about the need to bring Hadoop and MongoDB closer in environments that demand it. In time, the collaboration might even become more than technical, but for now, both have hit upon an ideal starting point.

Currently, MongoDB and Cloudera's Hadoop distribution cross-integrate to some degree, most prominently via a MongoDB connector provided with Cloudera 5. For both companies, creating a joint road map for better integration between the two systems is on the near-term agenda.

Right now, any data requests from MongoDB into Hadoop require a MapReduce job to be spun up before the data can be transferred. The idea -- as far as improving that connector is concerned -- is to rework the connector's operation, so the data can be pushed straight from MongoDB into HDFS in its native JSON format, processed natively within Cloudera, and pushed back into MongoDB if needed.

The use cases for MongoDB and Hadoop are radically dissimilar; after all, one is a free-form database and the other is primarily a distributed processing system for data. Given that, I asked what sorts of customers are interested in using the two products side by side -- in other words, the folks both Cloudera and MongoDB would work more closely with in the future.

Matt Asay, VP of marketing, business development, and corporate strategy at MongoDB, replied that a number of business scenarios have workloads that need both applications. "Hadoop is good for analyzing the behavior of crowds or trends. MongoDB is good for interacting with individuals rather than the crowd, feeding information into Hadoop and then having Hadoop feed it back," he told me.

As an example, he cited an agricultural equipment company pulling in sensor data, feeding that into Hadoop for a long-running query, then asking whether to plant crops. "That's not something you can do well solely with either product," he pointed out.

This leads into another key element of the two companies working in concert: using their combined customer base to better educate people on how to apply both technologies as a single asset. "At least as big as the technology is coaching and education of the market and the customers to learn how to use the technologies together," said Asay.

When I asked if there were plans for the two companies to offer joint support for both products to customers using them, the answer was a qualified no. Nothing of the kind is on the table now, but Asay remarked, "I wouldn't rule anything out." That said, an escalation model exists for joint customers, which could conceivably serve as a starting point for future work for both companies if they follow that path.

This story, "Cloudera and MongoDB: 'We're better together'" was originally published by InfoWorld.