Data storage has become cheap, and as a consequence we're storing more data than ever.
In lockstep with the explosive growth of data are tools designed to facilitate data processing — one such tool is Apache's Hadoop. Hadoop is essentially a mechanism for analyzing huge datasets, which do not necessarily need to be housed in a datastore. Hadoop implements the MapReduce programming model, making massive data analysis more accessible to developers. It scales out across many nodes and handles all of the coordination related to partitioning, sorting, and shuffling data between them. Yahoo! and countless other organizations have found it an efficient mechanism for analyzing mountains of bits and bytes. Hadoop is also fairly easy to get working on a single node; all you need is some data to analyze and familiarity with Java, including generics.
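To make the model concrete, here is a minimal, Hadoop-free sketch of MapReduce in plain Java: a map phase emits (word, 1) pairs, a grouping step plays the role of the shuffle, and a reduce phase sums each group. The class and method names here are illustrative, not part of Hadoop's actual API.

```java
import java.util.*;
import java.util.stream.*;

// A toy illustration of the MapReduce programming model (no Hadoop required):
// map emits (word, 1) pairs, the "shuffle" groups them by key,
// and reduce sums the values in each group.
public class MiniMapReduce {

    // Map phase: split one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\W+"))
                .filter(w -> !w.isEmpty())
                .map(w -> Map.entry(w, 1))
                .collect(Collectors.toList());
    }

    // Shuffle + reduce: group the pairs by key and sum each group's values.
    public static Map<String, Integer> wordCount(List<String> lines) {
        return lines.stream()
                .flatMap(line -> map(line).stream())
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        Collectors.summingInt(Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        List<String> input = List.of("Hadoop scales out", "Hadoop analyzes data");
        System.out.println(wordCount(input));
    }
}
```

Hadoop's real `Mapper` and `Reducer` classes follow the same shape; the framework's job is distributing these two functions across a cluster and moving the intermediate pairs between nodes.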
In IBM developerWorks' article "Big data analysis with Hadoop MapReduce," you'll get started with Hadoop's MapReduce programming model and learn how to use it to analyze data for both big and small business information needs. You'll find that analyzing data with Hadoop is easy and efficient!