Cloud platforms distribute your code across the cloud in different ways. Some platforms put all of your code on every worker and can execute your code on any of those workers at any given time. Other platforms specify workers for given tasks or roles. Sometimes all of a transaction will occur on one worker. Other platforms may optionally distribute even the execution of a single transaction. Regardless of the model, cloud platforms make your application code highly available by distributing and managing it across multiple workers.
When your code is atomic and stateless in nature, it can then reside wherever the cloud platform puts it in the cloud. In an ideal setup, the code can execute anywhere without you or the code having to think about it. At its root, this means that you automatically have high availability. If a given compute node dies, who cares? The other nodes have the code and can fulfill transactions.
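To make the "who cares?" point concrete, here is a minimal sketch of stateless code deployed to several workers. The `Worker` class, the `dispatch` function, and the `add_tax` example are hypothetical names invented for illustration; no particular cloud platform exposes this API.

```python
from typing import Callable, List


def add_tax(amount_cents: int, rate_percent: int) -> int:
    """Stateless and atomic: the result depends only on the arguments."""
    return amount_cents + amount_cents * rate_percent // 100


class Worker:
    """Hypothetical compute node that holds a copy of the deployed code."""

    def __init__(self, name: str, code: Callable[..., int]) -> None:
        self.name = name
        self.code = code
        self.alive = True

    def run(self, *args: int) -> int:
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.code(*args)


def dispatch(workers: List[Worker], *args: int) -> int:
    """Send the request to the first live worker; any of them will do."""
    for worker in workers:
        if worker.alive:
            return worker.run(*args)
    raise RuntimeError("no live workers")


# Every worker carries the same code, so losing one changes nothing.
workers = [Worker(f"node-{i}", add_tax) for i in range(3)]
before = dispatch(workers, 1000, 8)
workers[0].alive = False  # simulate a compute node dying
after = dispatch(workers, 1000, 8)
```

Because `add_tax` keeps no state between calls, the answer is the same whichever node happens to serve the request.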
What do I mean by reliability? Say you request code to execute, and something bad happens. If your code is reliable, the requested work still gets done; at the very least, the environment does its best to complete it instead of just giving up -- or, worse, losing the work entirely.
There are a number of models for attaining reliable execution in cloud platform environments. If the cloud platform is designed to provide reliability to your code, then you'll likely be able to configure declaratively (outside your code) how you want reliability to behave at runtime. Without a cloud platform that virtualizes and watches over your application, writing reliable, distributed applications from the ground up is a lot of work.
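As a rough illustration of declarative configuration, the sketch below keeps the reliability settings in a descriptor that lives apart from the application code. The descriptor format, its field names, and the `ReliabilityPolicy` class are all assumptions made up for this example; a real platform would define its own.

```python
import json
from dataclasses import dataclass

# Hypothetical deployment descriptor, kept outside the application code.
DESCRIPTOR = """
{
  "task": "resize_image",
  "reliability": {"max_attempts": 3, "retry_on_different_worker": true}
}
"""


@dataclass(frozen=True)
class ReliabilityPolicy:
    max_attempts: int = 1  # default: no retries
    retry_on_different_worker: bool = False


def load_policy(descriptor_text: str) -> ReliabilityPolicy:
    """Read the reliability section of a descriptor, falling back to defaults."""
    settings = json.loads(descriptor_text).get("reliability", {})
    return ReliabilityPolicy(**settings)


policy = load_policy(DESCRIPTOR)
```

The application code never mentions retries. Changing the behavior means editing the descriptor, not the code.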
Figure 2 illustrates one reliability model that directly shows the benefits of atomic, stateless, and idempotent code. Say you've requested that your code execute in the cloud, and a failure occurs. Perhaps the worker doing the work suffers a power supply failure. The cloud platform detects the loss of work, and, depending on packaging-time configuration, retries that work on a different worker instead of returning the failure immediately to the requester. The cloud platform then retries that work until success is achieved, or until some configured threshold is met and failure is returned.
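The retry model just described can be sketched in a few lines. This is a simplified stand-in for what a platform might do internally, not any platform's actual implementation; `run_with_retry`, `WorkerFailure`, and the worker names are hypothetical.

```python
from typing import Callable, List, Optional


class WorkerFailure(Exception):
    """Raised when a worker cannot finish the work (e.g. it lost power)."""


def run_with_retry(
    task: Callable[[str], str], workers: List[str], max_attempts: int
) -> str:
    """Try the task on successive workers until it succeeds or attempts run out."""
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        # Rotate through the pool so each retry lands on the next worker.
        worker = workers[attempt % len(workers)]
        try:
            return task(worker)
        except WorkerFailure as error:
            last_error = error  # the work was lost; retry instead of failing now
    # The configured threshold was met, so the failure goes back to the requester.
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


def flaky_task(worker: str) -> str:
    """Stand-in for application code; fails only on the broken worker."""
    if worker == "worker-a":
        raise WorkerFailure("power supply failure on worker-a")
    return f"done on {worker}"


result = run_with_retry(flaky_task, ["worker-a", "worker-b"], max_attempts=3)
```

The requester sees only the final outcome. The failure on `worker-a` is absorbed by the retry, which is safe only because rerunning the task has no side effects that could be duplicated.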
If your code is atomic, stateless, and idempotent, then you have the flexibility to pursue reliability, especially if the environment leverages these attributes for you. Without them, your options narrow. For example, consider atomicity in the reliability model just discussed. If the executed code encapsulates multiple non-atomic steps, then retrying those steps becomes far more complex. Likewise, if the code is a long-running series of steps rather than stand-alone atomic steps, then a retry must rerun the entire series when a failure happens, instead of just picking up at the step that failed.
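The difference between rerunning a whole series and picking up at the failed step can be shown with a small sketch. The step names, the `calls` counter, and `run_steps` are hypothetical, and the counter exists only to show how often each step actually ran.

```python
from typing import Callable, Dict, List

Step = Callable[[List[str]], List[str]]

calls: Dict[str, int] = {"parse": 0, "enrich": 0, "store": 0}
fail_once = {"enrich"}  # steps that fail on their first attempt


def make_step(name: str) -> Step:
    """Build an atomic step whose output depends only on its input."""

    def step(data: List[str]) -> List[str]:
        calls[name] += 1  # instrumentation for the demo only
        if name in fail_once:
            fail_once.discard(name)
            raise RuntimeError(f"{name} failed")
        return data + [name]

    return step


def run_steps(steps: List[Step], data: List[str], max_attempts: int = 2) -> List[str]:
    """Retry each atomic step on its own, resuming from the step that failed."""
    for step in steps:
        for attempt in range(max_attempts):
            try:
                data = step(data)
                break  # this step is done; move on to the next one
            except RuntimeError:
                if attempt == max_attempts - 1:
                    raise  # threshold met: report the failure
    return data


result = run_steps([make_step(n) for n in ("parse", "enrich", "store")], [])
```

Here `enrich` runs twice while `parse` and `store` run once each. Had the three steps been packaged as one long-running unit, the retry would have repeated `parse` as well.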