A stream transformation can be simple enough for a single bolt to consume the stream, or complex enough to require multiple streams and hence multiple bolts. A bolt can do almost anything: run functions, filter tuples, perform stream aggregation, join streams, or even execute calls to a database.
Figure 1 shows the relationship between spouts and bolts within a topology.

As Figure 1 illustrates, a Storm topology can have multiple spouts, and each spout can feed one or more bolts. Additionally, a bolt can stand on its own or have additional bolts listening to it. For example, you might have a spout that observes user behavior on your site and a bolt that uses a database field to categorize the types of pages users are accessing. Another bolt might handle article views, adding a tick to a recommendation engine for each unique view of each article type. You simply configure the relationships between spouts and bolts to match your own business logic and application demands.
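To make that spout-to-bolt dataflow concrete, here is a deliberately simplified, plain-Java stand-in for a topology. The names (`MiniTopology`, the page-view events, the recommendation list) are hypothetical, and this is not Storm's actual API: Storm's `TopologyBuilder` additionally handles serialization, parallelism, and cluster distribution. The sketch only shows the shape of the wiring: a spout emits events, a first bolt categorizes them, and a downstream bolt listens to the first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// A simplified model of a Storm topology, NOT Storm's real API:
// a "spout" pushes events to subscribed "bolts", and a bolt may in
// turn forward derived events to bolts downstream.
public class MiniTopology {
    // A bolt consumes an event and may forward a derived event downstream.
    interface Bolt extends Consumer<String> {}

    static class Spout {
        private final List<Bolt> listeners = new ArrayList<>();
        void subscribe(Bolt b) { listeners.add(b); }
        void emit(String event) { listeners.forEach(b -> b.accept(event)); }
    }

    static List<String> run() {
        Spout userActivity = new Spout();

        // Downstream bolt: tick a (hypothetical) recommendation counter.
        List<String> recommendations = new ArrayList<>();
        Bolt recommender = recommendations::add;

        // First bolt: categorize page views; only article views are
        // forwarded to the recommender bolt downstream.
        Bolt categorizer = event -> {
            if (event.startsWith("article:")) {
                recommender.accept(event.substring("article:".length()));
            }
        };

        userActivity.subscribe(categorizer);
        userActivity.emit("article:java-storm");
        userActivity.emit("home"); // filtered out by the categorizer
        return recommendations;
    }

    public static void main(String[] args) {
        System.out.println(run()); // only article views reach the recommender
    }
}
```

In Storm proper, the same wiring would be declared with `TopologyBuilder.setSpout(...)` and `setBolt(...).shuffleGrouping(...)`, and the framework, not your code, would move the data between components.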
As previously mentioned, Storm's data model is represented by tuples. A tuple is a named list of values of any type. Storm supports all primitive types, Strings, and byte arrays, and you can register your own serializer if you want to use your own object types. Your spouts "emit" tuples and your bolts consume them. Your bolts may also emit tuples if their output is destined to be processed by another bolt downstream. Emitting tuples is the mechanism for passing data from a spout to a bolt, or from one bolt to another.
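The "named list of values" idea can be illustrated in plain Java. This is an analogy, not Storm's actual `Tuple` class: in Storm you declare the field names once per stream and each emit supplies only the values, in the same order; here we simply zip hypothetical field names and values into an ordered map.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of Storm's data model: a tuple is a named
// list of values. Field names are declared once for a stream; each
// emitted tuple supplies the values in the declared order.
public class TupleDemo {
    static Map<String, Object> tuple(List<String> fields, Object... values) {
        Map<String, Object> t = new LinkedHashMap<>();
        for (int i = 0; i < fields.size(); i++) {
            t.put(fields.get(i), values[i]); // pair each name with its value
        }
        return t;
    }

    public static void main(String[] args) {
        // Declared once for the stream (hypothetical field names)...
        List<String> fields = List.of("user", "page", "timestamp");
        // ...then each emit supplies only the values.
        Map<String, Object> t = tuple(fields, "alice", "/articles/storm", 1690000000L);
        System.out.println(t.get("page")); // values are accessed by field name
    }
}
```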
I've covered the core terminology for the structure of a Storm-based application, but deploying to a Storm cluster involves another set of terms.
A Storm cluster is somewhat similar to a Hadoop cluster, but while a Hadoop cluster runs map-reduce jobs, a Storm cluster runs topologies. As mentioned earlier, one of the primary differences between map-reduce jobs and topologies is that map-reduce jobs eventually end, while topologies run until you explicitly kill them. Storm clusters define two types of nodes:

- The master node runs a daemon called Nimbus, which distributes code around the cluster, assigns tasks to machines, and monitors for failures.
- Worker nodes run a daemon called Supervisor, which starts and stops worker processes on its machine as directed by Nimbus.
Sitting between the Nimbus and the various Supervisors is the Apache open source project ZooKeeper. ZooKeeper's goal is to enable highly reliable distributed coordination, mainly by acting as a centralized service through which the distributed nodes of the cluster share configuration and state.