The smartest people approaching this problem quickly realized they could make their job dramatically easier by cutting corners and blithely ignoring any glitch. If some status update disappeared, who would notice? If somebody checked in to a service while at a coffee shop and failed to be crowned mayor of that coffee shop, it wasn't a big deal because they would probably return again tomorrow. After the new class of data caretakers recognized that they could save a fortune on compute cycles and infrastructure simply by loosening requirements, they started building NoSQL and other so-called data stores.
Now, trading away accuracy to save time and money rules the Web. Try searching for an older email message with one of the Web-based tools: many of them quietly leave older messages out of the index. It's part of a slow erosion of standards for search. Google, for instance, quietly dropped support for true boolean searches with the plus sign. Expect to see more and more Web engineers subtly tossing aside the fanatical commitment to accuracy once common among database administrators.
Programming trend No. 10: Real parallelism begins to get practical for all
Computer architects have been talking about machines with true parallel architectures for years, but the programmers in the trenches are just starting to get the tools that make it possible.
The parallelism is appearing in two prominent areas: multinode databases and Hadoop jobs. Some mix the two.
Most NoSQL data stores offer to help spread the workload over multiple machines. Some offer automatic sharding, which splits the data set into pieces, synchronizes the machines that host a given piece, and directs queries to the right machines as necessary. Some offer replication or backup, a feature that's been around longer; some do both.
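To make the sharding idea concrete, here is a minimal sketch in Python of hash-based shard routing. The Shard class, the modulo routing rule, and the toy three-node cluster are illustrative assumptions, not the mechanism of any particular NoSQL product:

import hashlib

class Shard:
    """One node (or replica set) holding a slice of the keyspace."""
    def __init__(self, name):
        self.name = name
        self.data = {}

def shard_for(key, shards):
    # Hash the key and pick a shard -- the routing step behind automatic sharding.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]

cluster = [Shard("node-a"), Shard("node-b"), Shard("node-c")]

def put(key, value):
    shard_for(key, cluster).data[key] = value

def get(key):
    return shard_for(key, cluster).data.get(key)

put("user:42", {"name": "Ada"})
print(get("user:42"))                        # {'name': 'Ada'}
print(shard_for("user:42", cluster).name)    # the same node every time for this key

Real data stores layer replication, rebalancing, and failure handling on top of that routing step; the simple modulo rule above would also force wholesale data movement whenever a node is added, which is why production systems tend to prefer consistent hashing or range-based partitioning.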
Hadoop is an open source framework that coordinates a number of machines working on a problem and compiles their work into a single answer. The project imitates parts of the MapReduce framework Google developed to help synchronize its Web crawling efforts, but it has grown well beyond those roots.
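For readers who haven't seen the programming model, here is a toy, single-process Python sketch of the map, shuffle, and reduce steps, using word counting as the example. The function names and the task are assumptions for illustration; Hadoop's real contribution is running these same steps across many machines and restarting the pieces that fail:

from collections import defaultdict

def map_phase(documents):
    # Emit (key, value) pairs -- here, (word, 1) for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield word.lower(), 1

def shuffle(pairs):
    # Group values by key; Hadoop performs this step across the network.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Collapse each key's values into one result -- here, a total count per word.
    return {word: sum(counts) for word, counts in grouped.items()}

docs = ["the cat sat", "the cat ran", "a dog ran"]
print(reduce_phase(shuffle(map_phase(docs))))
# {'the': 2, 'cat': 2, 'sat': 1, 'ran': 2, 'a': 1, 'dog': 1}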
Tools like this make it easier than ever to toss more than one machine at a problem. The infrastructure is now solid enough that the enterprise architects can rely on deploying racks of machines with only a bit of hand-holding and fussing.
Programming trend No. 11: GPUs trump CPUs
Was it only a few years ago that the CPU manufacturers created the chips that fetched the most money? Those days are fading fast as the graphics processors are now the most lustworthy. It's easy to find kids who will spend $300 on their entire computer and operating system, then $600 on a new video card to really make it scream.
The gamers aren't alone in their obsession with video cards. Scientists who need high-powered computation are reprogramming GPUs to analyze protein folding or guess the secrets of the smallest particles. Nvidia runs conferences for nongamers using the devices, and it sells video cards by the palletload to scientists who want to build supercomputers. Oak Ridge National Laboratory, for instance, plans to put 18,000 Tesla GPUs from Nvidia into one room so that it can call the result the fastest supercomputer. The lab will presumably use it to build elaborate models for the Department of Energy, not to brag about the frame rate it gets while playing Doom.
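As a rough sketch of what "reprogramming a GPU" for general computation can look like, here is a toy vector-addition kernel written in Python with the Numba CUDA compiler. The library choice, array size, and launch configuration are illustrative assumptions; running it requires the numba package and a CUDA-capable GPU:

import numpy as np
from numba import cuda

@cuda.jit
def add_kernel(x, y, out):
    # Each GPU thread computes one element of the result.
    i = cuda.grid(1)
    if i < x.size:
        out[i] = x[i] + y[i]

n = 1_000_000
x = np.random.rand(n).astype(np.float32)
y = np.random.rand(n).astype(np.float32)
out = np.zeros_like(x)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
add_kernel[blocks, threads_per_block](x, y, out)  # Numba copies the arrays to the GPU and back

assert np.allclose(out, x + y)

Adding two vectors is trivial work for a GPU, but the same thread-per-element pattern is what lets the heavy arithmetic in protein folding or particle simulations spread across thousands of GPU cores.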