Like many people, I cheer any sign of Oracle's downfall and chuckle at the thought of Larry Ellison having some misadventure on the high seas. Who doesn't? But the sad fact is that, despite all the buildup, NoSQL and big data do not threaten Oracle or the RDBMS paradigm in the near term.
On the other hand, there's a decent chance NoSQL will take a big bite out of Oracle RAC (Real Application Clusters). Here's why.
Oracle RAC enables you to scale your application by implementing load balancing and high availability across multiple nodes. It scales well for your everyday read-mostly-mostly (the second "mostly" is deliberate) clusters but not as well when you have a few more writes. To support more nodes, it must coordinate a write-lock across the network.
To take advantage of RAC, you need to write your application very well and very carefully. For the majority of applications, the cost of that development (or the horsepower to run less carefully written apps) will be high.
RAC your world, for a price
About 1 percent of systems experience backbreaking traffic, and in cases that demand high transaction throughput, RAC may well make sense. But RAC doesn't add up for the remaining 99 percent.
There's another cost your average Oracle expert won't tell you about: Oracle RAC, like Oracle RDBMS itself, requires lots of extra care and feeding.
Oracle RDBMS, frankly, has the kinds of bugs that are hard to imagine in a mature product. Try launching a site with 8 million concurrent and very active users when some unavoidable patches have yet to be applied. What if, on your rolling update, Oracle leaks memory every time you restart an app server node? When you add RAC, you get new bugs and problems to boot. It's hard to find independent studies on maintenance costs, and vendors long ago learned to manipulate total cost of ownership (TCO) measures, but ask anyone who has deployed this beast: It needs a lot of love.
We haven't even gotten to the costs for RAC yet. According to Oracle's price list:
Oracle Enterprise Edition per CPU
RAC per CPU
For a three-node cluster with four cores per node, you're looking at several times the cost of my house. Oracle pricing is like cellular carrier pricing: Hand over your wallet and they'll take what they want. There are about 100 line items of indecipherable fees. You owe what they say you owe. For this theoretical cluster, at retail, we're looking at no less than $160,000, all before installation. (This price doesn't include all the small line items I don't understand.)
If RAC is software for the upper 1 percent, why is it a best seller? Recall my statement that it's uncommon for people to write their software well. In other words, most software makes inefficient use of the database and taxes it beyond what you would expect. Therefore, you have to scale with more CPUs and more disks, which is what RAC is intended to help you do. In the event your reads/writes and locks aren't efficient, throwing hardware at the problem may not increase performance as much as you expect.
The other big reason for RAC is high availability. If a node goes down, you want to stay up. You don't need heavy load for a complete database outage to cost you a lot of money.
RAC's headline features, higher scalability and high availability, are part and parcel of many modern NoSQL databases. Consider Couchbase, which is adding a document database to its popular offering with the new 2.0 release. With full 24/7 support, we're looking at $4,500 per node. For the same three-node cluster, that adds up to $13,500, which is maybe three times what I paid for my motorcycle but less than I paid for my car. As for MongoDB, currently the most popular document database, we're looking at $4,000 per instance of mongod (you may end up running more than one instance per node since Mongo is single-threaded).
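For the arithmetically inclined, here is a rough back-of-the-envelope sketch of that comparison. It uses only the figures quoted in this article (a three-node cluster, $4,500 per Couchbase node, and a floor of $160,000 for RAC at retail); the function and variable names are made up for illustration, and it glosses over the fact that one number is a license price and the other a support price.

```python
# Back-of-the-envelope comparison using only figures quoted in the article.
# Names are illustrative; this is not a pricing tool.
NODES = 3                    # the three-node cluster discussed above
COUCHBASE_PER_NODE = 4_500   # quoted price with full 24/7 support
RAC_CLUSTER_FLOOR = 160_000  # "no less than" retail figure, before installation


def cluster_cost(per_node: int, nodes: int) -> int:
    """Total cost when pricing is a flat per-node fee."""
    return per_node * nodes


couchbase_total = cluster_cost(COUCHBASE_PER_NODE, NODES)
ratio = RAC_CLUSTER_FLOOR / couchbase_total

print(f"Couchbase, {NODES} nodes: ${couchbase_total:,}")
print(f"RAC floor is roughly {ratio:.1f}x that amount")
```

Since the RAC figure is a lower bound, the real multiple can only be higher.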
From an operational store standpoint, where quickly handling short reads and writes concurrently is the most important priority, RAC may scale to your needs, but at a price point that doesn't make much sense. Moreover, Oracle hasn't become any easier to maintain, and setting up for "five nines" availability would require a much more advanced system than I've priced out here.
For analytical systems, the cost will be higher. Fortunately, your need for RAC is much less urgent here because these systems tend not to be real time and mere replication is often sufficient. A Hadoop cluster might be more appropriate. Neither of the two largest Hadoop vendors seems to disclose its pricing publicly, but it's hard to imagine it's worse than building a RAC system of the same approximate scale.
Moreover, as we head to the clouds, what's more partitionable and affordable than autoscaling/autoprovisioning new nodes as traffic demands?
If you don't need RAC, maybe you don't need Oracle
A key advantage of the Oracle RDBMS is that if you design your schema and structure your queries just right, you can heavily parallelize analytical, ETL, and batch jobs. A possible corollary to "it's uncommon to write your software well" is that it's uncommon to design your schema well, especially if you're trying to have one master model of all of your data. But if your problem lends itself to this kind of parallelization, it probably also lends itself to MapReduce.
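To make that last point concrete, here is a minimal sketch of the MapReduce shape, assuming a batch job that totals revenue by category. The data, the partitioning, and every function name are hypothetical, and it runs in a single process rather than on Hadoop; the point is only that each partition can be aggregated independently and the partial results merged afterward.

```python
from collections import Counter
from functools import reduce

# Hypothetical order records, already split into partitions (say, one per node).
PARTITIONS = [
    [("books", 12.0), ("games", 30.0)],
    [("books", 8.0), ("music", 5.0)],
    [("games", 10.0)],
]


def map_partition(rows):
    """Map step: aggregate one partition without looking at any other."""
    totals = Counter()
    for category, amount in rows:
        totals[category] += amount
    return totals


def merge_totals(left, right):
    """Reduce step: combine two partial results into one."""
    merged = Counter(left)
    merged.update(right)
    return merged


def revenue_by_category(partitions):
    # Each map_partition call is independent, so a real framework could
    # run them on separate workers; here they simply run one after another.
    partials = map(map_partition, partitions)
    return dict(reduce(merge_totals, partials, Counter()))


print(revenue_by_category(PARTITIONS))
```

If your batch job can be phrased this way, with no partition needing to see another's rows until the merge, it is the same property that lets a well-designed schema parallelize inside the RDBMS.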
Even if you have a great fit for the RDBMS, you aren't doing some highly parallelizable star-schema crazy thing, and you're not using RAC, you can use a much less expensive RDBMS such as PostgreSQL or even SQL Server (which last I checked was about one-fourth of the cost). For operational systems, the best thing your RDBMS can do is manage concurrency and get out of the way while you try to hit the disk (or maybe the cache) as quickly as possible.
Nonetheless, Oracle will have a long tail, because it's entrenched -- and enterprise customers have undertaken ill-advised actions, like writing PL/SQL triggers. Migration will take time, and in some cases, spending a few hundred thousand dollars more is less painful than the cost of the refactoring.
But my original advice holds. Anyone looking at Oracle RAC for a new system should probably take a much deeper dive into NoSQL and their data problem. NoSQL is a RAC killer.
This article, "NoSQL is no Oracle killer," was originally published at InfoWorld.com.