Intel has released its own Hadoop distribution in a move intended to accelerate adoption of the big data platform while ensuring more of those workloads run on Intel's own Xeon processors.
The Intel Distribution for Apache Hadoop includes core pieces of the data analysis platform that Intel is releasing as open-source software, as well as deployment and tuning tools that Intel developed itself and which are not open source.
According to an InfoWorld report, Hadoop will be in two-thirds of advanced analytics products by 2015.
Organizations will be more willing to expand their investments in Hadoop if they know there's a consistent distribution backed by a big, stable vendor like Intel, said Boyd Davis, general manager of Intel's data center software division, at a launch event in San Francisco Tuesday.
Intel has been upping its investments in software for several years, to help ensure its processors are widely used beyond their traditional stronghold in client/server computing. It said it has worked with customers over the past few years to develop its Hadoop distribution, and that this is actually its third release of the software.
Still, it's a significant announcement that moves Intel deeper into the software industry. Like many other open-source providers, Intel will now sell support and maintenance services for its distribution, Davis said.
Hadoop includes a dozen or so open-source projects that work together to make it easier for users to store, manage and analyze large amounts of data. It's become the go-to software platform for companies mining Web logs, transaction histories and other data in search of added value.
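For readers new to the platform, the kind of Web log mining described above usually follows a map-then-reduce pattern. The sketch below imitates that pattern in plain Python on a single machine; it does not use Hadoop or any Intel tool, and the log format, sample lines and function names are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample of web-server log lines (illustrative data only).
LOG_LINES = [
    "10.0.0.1 GET /index.html 200",
    "10.0.0.2 GET /missing 404",
    "10.0.0.1 GET /about.html 200",
]


def map_phase(line):
    """Emit a (status_code, 1) pair for one log line."""
    status = line.split()[-1]  # assumes the status code is the last field
    return (status, 1)


def reduce_phase(pairs):
    """Sum the counts for each key, as a reducer would after grouping."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def count_statuses(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    return reduce_phase(map_phase(line) for line in lines)


if __name__ == "__main__":
    print(count_statuses(LOG_LINES))
```

On a real cluster the map and reduce steps would run in parallel across many machines, with the input read from the distributed file system rather than from an in-memory list.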
Intel's distribution includes versions of the Hadoop Distributed File System, the Hadoop Processing Framework, Hive and HBase. Intel has tweaked those programs to take advantage of capabilities in its own Xeon chips, such as its processor instructions for accelerating AES encryption.
"By incorporating silicon-based encryption support of the Hadoop Distributed File System, organizations can now more securely analyze their data sets without compromising performance," Intel said in a news release.
But Intel says the core components of its distribution remain open and compatible with other implementations of Hadoop. If customers choose Intel's distribution, "they're not getting locked into a technology," Davis said.
At the same time, Intel has developed some of its own tools that will not be released as open source. They include a deployment and configuration tool called Intel Manager for Apache Hadoop, and a tool for tuning cluster performance, called Active Tuner for Apache Hadoop.
Customers who run Intel's Hadoop distribution on servers loaded with Intel hardware, including its processors, solid-state drives and 10 Gigabit Ethernet cards, will see a 40 percent performance boost over users who don't go with an all-Intel platform, according to Davis.