7 cutting-edge programming experiments worth trying

Get the best from trending technologies like Erlang, Node.js, and Go

R, the language, is distributed through an open source project devoted to nurturing the core. Many developers start with a more complete IDE such as RStudio, which bundles editors and output windows with the execution engine. The IDE is the best way to create code that can then run on just the core when it's deployed into production stacks.

The trouble with statistical tools like R is that the insights don't always come, and what comes of the experimentation isn't always significant. Just because the thinking is newer doesn't make it better. Big data offers perfectly good theories and even great ideas, but few know just how good they are -- especially in context. Will this kind of statistical analysis really help your product? Will the incoming data have enough precision to allow the theory to work? No one knows, but you might find out if you devote several months to experimentation.

Consider the excitement about using statistical tools like R to slice through the mounds of data piling up in your disk farms. Perhaps you're the lucky one who has data filled with one very strong signal just waiting to be discovered. Most folks find that data mining requires plenty of human intelligence to discover the crucial insights that are buried in the noise. A quick dive into the numbers just yields confusion.
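To make the point about noise concrete, here is a minimal sketch, written in Python rather than R and using only the standard library. It generates a target column and a pile of feature columns that are all pure random noise, then reports the strongest correlation it can find. The function names, row count, and feature count are illustrative assumptions, not anything taken from a real data set.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_noise_correlation(n_rows=30, n_features=200, seed=0):
    """Largest |correlation| between a noise target and many noise features.

    Every column is independent Gaussian noise, so any "signal" found here
    is an artifact of searching through many candidates.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_rows)]
        best = max(best, abs(pearson(feature, target)))
    return best


if __name__ == "__main__":
    print("1 feature:   ", strongest_noise_correlation(n_features=1))
    print("200 features:", strongest_noise_correlation(n_features=200))
```

The more columns the search covers, the stronger the best accidental match looks, which is one reason a quick dive into a large data set can turn up "findings" that vanish on fresh data.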

Cutting-edge experiment No. 5: Tapping the speed of NoSQL
Let's face it: We programmers are a lazy bunch. We won't start building something from scratch unless we need to. New tools are usually built around one big new feature. Sometimes they offer more than one.

The only way to get these features is to embrace these new tools. Many of the new NoSQL databases slip effortlessly into the cloud. They see a rack of machines and work well across all of them. That's why they were built and what they do well. They wouldn't exist if they weren't needed.

There is a wide collection of NoSQL projects offering slightly different sets of features, and enumerating them and explaining the differences between them is beyond the scope of this article. A few of the more popular tools are Cassandra, MongoDB, CouchDB, and Riak. Some companies also offer the tools as services. MongoLab and MongoHQ are two that offer to store data using MongoDB. Similar hosted services are available for the other databases.

The ability to respond like lightning and scale almost as quickly are great features that may be worth rewriting all of your code to take advantage of. But one of the reasons these seductions of the cutting edge seem so great is that we haven't yet felt how they can go wrong. There's usually a dark side, and it takes a bit of time to discover it -- often by mistake.

The same issues confront NoSQL databases. They're fast, but mainly because they don't offer any iron-clad promises of consistency. They suck up the data and respond with an "All Clear" before they're sure that the data has been written to disk. This may be adequate for many of the websites that traffic in social gossip where a lost status update means little, but it's not ideal for others.
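The trade-off between an early "All Clear" and a durable write can be sketched with a toy example. This is not how any particular NoSQL database is implemented; the class names `AckEarlyStore` and `DurableStore`, the log file, and the simulated crash are all hypothetical, chosen only to show what is lost when the acknowledgment arrives before the data reaches disk.

```python
import os
import tempfile


class AckEarlyStore:
    """Toy store that acknowledges a write as soon as it is buffered in memory."""

    def __init__(self, path):
        self.path = path
        self.buffer = []  # writes acknowledged but not yet on disk

    def write(self, record):
        self.buffer.append(record)
        return "All Clear"  # acknowledged before anything reaches disk

    def flush(self):
        """Persist buffered records; a real system would do this periodically."""
        with open(self.path, "a", encoding="utf-8") as f:
            for record in self.buffer:
                f.write(record + "\n")
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device
        self.buffer.clear()


class DurableStore:
    """Toy store that acknowledges a write only after it has been fsynced."""

    def __init__(self, path):
        self.path = path

    def write(self, record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(record + "\n")
            f.flush()
            os.fsync(f.fileno())
        return "All Clear"  # slower, but the record is on disk by now


def records_on_disk(path):
    """Return the records a freshly restarted process would find on disk."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


def simulate_crash_after_write(store_cls):
    """Write one record, then 'crash' by discarding the store without flushing."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.log")
        store = store_cls(path)
        ack = store.write("status: hello")
        del store  # simulated crash: in-memory state is gone
        return ack, records_on_disk(path)


if __name__ == "__main__":
    print("ack early:", simulate_crash_after_write(AckEarlyStore))
    print("durable:  ", simulate_crash_after_write(DurableStore))
```

Both stores answer "All Clear", but only the durable one still has the record after the simulated crash. The early-acknowledging store skips the per-write `fsync`, which is where much of the speed comes from and also where the lost status update goes.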
