Storm is a big data framework that is similar to Hadoop but fine-tuned to handle unbounded data streams. In this installment of Open source Java projects, learn how Storm builds on the lessons and success of Hadoop to deliver massive amounts of data in realtime, then dive into Storm's API with a small demonstration app.
When I wrote my recent Open source Java projects introduction to GitHub, I noted that Storm, a project submitted by developer Nathan Marz, was among GitHub's most watched Java repositories. Curious, I decided to learn more about Storm and why it was causing a stir in the GitHub Java developer community.
What I found is that Storm is a big data processing system similar to Hadoop in its basic technology architecture, but tuned for a different set of use cases. Whereas Hadoop targets batch processing, Storm is an always-active service that receives and processes unbounded streams of data. Like Hadoop, Storm is a distributed system that offers massive scalability for applications that store and manipulate big data. Unlike Hadoop, it processes and delivers that data as it arrives, in realtime.
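To make the batch-versus-stream distinction concrete, here is a minimal sketch in plain Java. It uses neither the Storm nor the Hadoop API; the class and method names (`BatchVersusStream`, `batchSum`, `RunningSum`) are hypothetical and exist only for this illustration, and the sentinel value is an assumption that lets the demo terminate.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration only: plain Java, not Storm or Hadoop API code.
class BatchVersusStream {

    // Batch style: the complete data set exists before processing starts,
    // and one result is produced at the end.
    static long batchSum(List<Integer> completeDataSet) {
        long total = 0;
        for (int value : completeDataSet) {
            total += value;
        }
        return total;
    }

    // Stream style: each value is handled as it arrives, and an
    // up-to-date result is available after every single value.
    static final class RunningSum {
        private long total;

        long accept(int value) {
            total += value;
            return total;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("batch result: " + batchSum(Arrays.asList(1, 2, 3, 4)));

        BlockingQueue<Integer> stream = new LinkedBlockingQueue<>();
        Thread producer = new Thread(() -> {
            for (int i = 1; i <= 4; i++) {
                stream.add(i);
            }
            // Assumption for the demo: -1 marks the end so the program exits.
            // A genuinely unbounded stream would never send such a marker.
            stream.add(-1);
        });
        producer.start();

        RunningSum runningSum = new RunningSum();
        for (int value = stream.take(); value != -1; value = stream.take()) {
            System.out.println("running total: " + runningSum.accept(value));
        }
        producer.join();
    }
}
```

The batch method cannot answer until it has seen every element, while the running sum has a usable answer after each one. That second shape is the one an always-active stream processor has to support, because the input never ends.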
Storm is a free and open source project. It is hosted on GitHub and available under the Eclipse Public License for use in both open source and proprietary software.
In this installment of the Open source Java projects series I introduce Storm. We'll start with an overview of Storm's architecture and use case scenarios. Then I'll walk through a demonstration of setting up a Storm development environment and building a simple application whose goal is to process prime numbers in realtime. You'll learn a bit about Storm and get your hands into its code, and you'll also get a little taste of its speed and versatility. If Storm is applicable to your application needs, you'll be ready for next steps after reading this article.
Storm is a free and open source distributed real-time computation system that can be used with any programming language. It is written primarily in Clojure and supports Java by default. In terms of categorical use cases, Storm is especially well suited to any of the following:
Perusing the Storm user group discussion forum, I found that Storm has been used in some very interesting real-world scenarios. Given a large database for an online auction site, for instance, Storm was used to show the top trending items in any category, in realtime. It has also been connected to the Twitter firehose and used to identify posting trends. Other potential uses for Storm include the following:
See the Storm user group discussion forum for more real-world and speculative use cases for Storm.
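The auction-site scenario mentioned above, top trending items per category updated in realtime, can be sketched as a running count that is adjusted on every incoming event. This is a single-process illustration in plain Java under assumed names (`TrendingItems`, `record`, `top`); it does not use Storm's API and says nothing about how the forum poster actually built their system.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: a single-process running count, not Storm code.
class TrendingItems {
    // category -> (item -> number of events seen so far)
    private final Map<String, Map<String, Long>> countsByCategory = new HashMap<>();

    // Called once per incoming event, for example one item view.
    void record(String category, String item) {
        countsByCategory
            .computeIfAbsent(category, key -> new HashMap<>())
            .merge(item, 1L, Long::sum);
    }

    // Returns up to 'limit' items for the category, highest count first.
    // Ties are broken alphabetically so the result is deterministic.
    List<String> top(String category, int limit) {
        Map<String, Long> counts =
            countsByCategory.getOrDefault(category, Collections.emptyMap());
        List<String> items = new ArrayList<>(counts.keySet());
        items.sort((a, b) -> {
            int byCount = Long.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(items.subList(0, Math.min(limit, items.size())));
    }

    public static void main(String[] args) {
        TrendingItems trending = new TrendingItems();
        String[][] events = {
            {"cameras", "lens"}, {"cameras", "tripod"},
            {"cameras", "lens"}, {"books", "atlas"}
        };
        for (String[] event : events) {
            trending.record(event[0], event[1]);
            // The ranking is queryable after every event, not only at the end.
            System.out.println(trending.top("cameras", 2));
        }
    }
}
```

Sorting on every query is fine for a sketch but would not scale to a large catalog. A distributed version would also need to partition the counts across machines, which is the kind of work a framework like Storm is meant to take on.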