
The relational database needs no "defense"

 

Anyone who is deeply enmeshed in a technology feels compelled to defend that technology
when any sort of "threat" (or perception of threat) appears on the horizon, and apparently
Gavin is no different. Sure enough, as people (apparently in this case, myself) start
to talk about approaches to persistence that don't involve Hibernate, Gavin feels
compelled to point to these other technologies using inflammatory terms and a certain
amount of FUD. I felt a certain responsibility to respond, since it seems that he's
taking a direct shot at the db4o articles I've written and discussed before.

(By the way, it's also entirely possible that he's taking aim against ActiveRecord
and Rails, which I don't consider to be an "object database" at all; if that's the
case, then I apologize ahead of time for misunderstanding the intent--and the points--of
the piece. But the arguments he makes seem pretty relevant to the OODBMS-vs-RDBMS
discussion as well, so much so that it was a db4o employee who pointed out the blog
entry to me in the first place. In any event, though, Gavin's piece raises some issues
that deserve to be discussed, regardless of the context of Rails or OODBMSs.)

First of all, let me state quite clearly: the relational database needs no defense.
Take whatever comparative criteria you like, the RDBMS has been, and will, in the
absence of a nearly catastrophic change to the contrary, continue to be, the choice
of businesses all over the world for storing data in a format that's easily accessed
from a variety of different systems. The RDBMS clearly "owns" the corporate data center,
from Fortune X's (meaning X can be just about any number you choose to put there)
down through single-person shops. To shake that kind of (dare I say it?) monopoly
would require a kind of technology shift on the scale of the move from the mini- and
mainframe to the PC. Those kinds of shifts don't happen very often, and when they
do, it's because of a huge competitive advantage.

Furthermore, I will go on the record and say it here: neither the OODBMS nor the HODBMS
(hierarchically-oriented database system, a la the "XML database") makes that kind
of case. Not right now, and probably not ever. They have compelling reasons for existence,
but not so strong a case that they could displace the RDBMS from the "enterprise data"
throne. That said, however, since when does one tool solve all problems? They
have their own raisons d'etre, and to simply say that the OODBMS or HODBMS
should be ignored just because "we've always used an RDBMS" is a crime just as great.

Now, having said that, let's take a look at Gavin's points:

  • "Object databases were a total failure and still are." Actually, he's right,
    from the perspective that the OODBMS clearly has not penetrated the corporate environment
    to the same degree that the RDBMS has. But, by that same token, the RDBMS, nearly
    a decade after its introduction, had about the same degree of success. Ask the folks
    who were around when Oracle 1 was released, and they'll tell you about the criticisms
    leveled at the RDBMS that are, in a startling replay of the past, now being applied
    to the OODBMS today. The first generation of anything is always crap... including
    O/R-Ms. Fortunately for both O/R-Ms and OODBMSs, neither is in their first generation
    stage anymore.
  • "the systems are often not called "object databases" in today's marketing literature,
    but we will call them that anyway, since that is what they are." Actually, all
    of the OODBMS vendors are pretty ready to call themselves OODBMSs, and I have to say,
    Gavin, you'd know that if you talked to them for more than, say, 30 seconds, or took
    the time to research the subject and listen to what they had to say. The folks
    who don't bother calling their systems "object databases" anymore are the very folks
    he's defending: Oracle, DB2, and so on. (Anybody remember "Oracle Objects"? Table
    + sprocs == objects? Oy, what a mess.) But don't feel too bad, Gavin: Chris Date
    himself makes this same mistake (though he at least admits that
    true "object" support in the database model requires features that aren't present
    in today's RDBMS products, not that he's a big fan of those products anyway), so
    you're in good company. (Again, if you're talking Rails being an "object database",
    total agreement, it's not even close. But in all the years I've been hanging out with
    Dave Thomas, Bruce Tate, Stu Halloway, Justin Gehtland, and a bunch of the other Rails
    advocates/evangelists/lecturers/authors, I've never heard any of them make this assertion.)
  • Object-relational mapping isn't that hard, so there's no need to eliminate it. Sorry,
    Gavin, but the fact is, this remains, and always will remain, a point of difference
    between you and me, and between you and a fairly large number of developers I've spoken
    to over the years at conferences and consulting engagements and classes. For simple
    table-to-class mappings, you're right, it's a pretty simple thing. It is, however,
    still a "dual schema" problem, in that now you have two competing "sources of truth"
    that have to be reconciled to one another: the database schema and the object model.
    Now, perhaps if all the projects you've ever done are projects where the developer
    gets to define both, then the problem doesn't appear, but if you're in an "enterprise"
    world where the database schema is managed by a team of DBAs and is shared across
    projects, you don't have the flexibility to "refactor" the schema like you can your
    object model. (Anyone who's ever tried to build a CORBA or DCOM system that stretches
    across corporate or division or department boundaries understands the problems of
    trying to create a domain model--or schema--that serves all groups well without sacrificing
    performance, elegance, or normal form.) I particularly like this statement: "So,
    from this point of view, ORM is at least as good as an object database for all usecases,
    and handles other usecases (indeed, the common cases) which the object database approach
    does not." ... particularly since he doesn't bother to go on to describe
    those use cases that the ORM handles that the OODBMS does not. Examples? 'Tis very
    easy to make assertions, but without backing them up....

  • Oh, and the comment that "If you just want to "throw some objects in the database",
    you'll never need to write a single mapping annotation." really sort of proves the
    point I try to make in the ODMG.org paper: if you just want to "throw some objects
    in the database", why do you bother having an RDBMS in the first place? There are
    DBAs that are in open revolt at the idea, particularly since you've also just conveniently
    left out any sort of indexing or other tuning decisions that will make the database
    perform at all reasonably. But, I suppose, if you're willing to argue "development
    speed uber alles", then sure, go ahead. Never mind the fact that an OODBMS will
    handle this exact situation, because that's exactly what they were made for. I repeat
    the statements I made in the ODMG paper: if you want persistence to just be an implementation
    detail, then why bother with the RDBMS in the first place? (It's not like any self-respecting
    DBA is going to want to take your slapdash relational schema, anyway...)
  • Don't use the OODBMS because it creates a tight coupling between your code and
    your data storage, and the language you use today won't necessarily be around tomorrow.
    Um... exactly. This is, surprisingly enough, exactly the point I'm trying to make in the
    ODMG paper: that an OODBMS creates a tight coupling between code and data, and sometimes,
    that's not what you want. Nothing is a silver bullet; everything comes with
    a price and a consequence of using it. It's only the honest vendors that will tell
    you when not to use their stuff, and from experience, the db4o guys (the only ones
    I can concretely speak to) are the first to stand up and tell you that they aren't
    trying to replace the RDBMS. So why spread the FUD that they are?
  • OODBMSs are trying to pull the wool over your eyes with benchmarks. So, again,
    rather than display his own benchmark that directly contradicts the benchmark offered
    by the OODBMS folks, Gavin chooses to say, "Look at all the reasons why they run faster,
    and look, these reasons are all clearly bogus." Which is kind of astute of him: lawyers
    are taught in law school that if the law isn't on your side, argue the facts, and
    if the facts aren't on your side, argue the law, and if neither is on your side, argue
    really really loudly. Toss out a benchmark of your own, Gavin, and then we can discuss
    the decisions you make in your benchmark and see if they're reasonable decisions to
    make for my own projects, so I can make an informed decision, rather than one based
    on your assertions and loud arguments that amount to "Duh!".
  • OODBMSs are faster because they run in-process. Some do, yes. Most can run
    either in-proc or out-of-proc, which (gasp!) is something that RDBMSs can do, too.
    Or have you not noticed HSQL and Derby recently? And yes, running the RDBMS in-proc
    performs better than running the RDBMS out-of-proc. Running anything in-proc
    performs better than running out-of-proc. And yes, you're right, sometimes you don't
    have the option of in-proc. But in a situation where you're just "throwing objects
    into the database", and nobody else is connecting to this data (in other words, you
    can be tightly coupled to the data storage), why take that overhead if it's not necessary?
    Choosing an out-of-proc database because "somebody may want to get to this data someday"
    is YAGNI, pure and simple.
  • "... the problem is that existing, mature RDBMS systems happen to not be written
    in Java (see Benefit #3)." Ouch. Don't let the Cloudscape developers hear you
    say that. Granted, HSQL is not what I'd call a "heavy-duty" RDBMS, but Gavin, not everything has
    to be stored in Oracle. Sometimes a lighter-weight database--MySQL, HSQL, Postgres,
    or even (gasp!) Access--is good enough. Or are you advocating that everybody should
    be using clustered J2EE servers to build their 5-user department calendar app? (Maybe
    it's a Seam thing, I dunno.)
  • OODBMSs don't scale because they share a lot of state across concurrent threads. Any
    architecture that shares state across concurrent threads will have a hard time scaling,
    but... aren't you the guy arguing that stateful session beans are better than stateless?
    And how is this different from an RDBMS sharing state across concurrent threads? The
    transaction model isn't any different between the OODBMS and the RDBMS...
  • OODBMS benchmarks suck because they measure ORM with caching turned off. As
    well they should, because not all ORM users can use caching. Particularly if they
    need to bypass the ORM for particularly sophisticated straight-up SQL queries. (Unless,
    of course, one subscribes to the belief that HQL or OQL is just as powerful as SQL
    itself, and therefore can do anything that SQL can do...) That said, it's still a
    fair argument, and benchmarks, if they're to be at all useful to the general community
    (as opposed to being just plain marketing fluff), should detail exactly how they were
    run so a technology investigator can re-run the benchmark on their own, see if the
    results match, and tune them as desired to better match their architectural constraints
    or opportunities.
  • "Things that do more stuff are slower". Agreed... but how is this refuting
    the point? If an O/R-M is doing more stuff than an OODBMS, but the end result is the
    same from the programmer's perspective, the fact that the O/R-M has to do more stuff
    shouldn't be held against it? That's like suggesting that Tonya Harding should have
    gotten a do-over in the Olympics because she was kinda upset about all the bad publicity.
  • "Fetching hierarchical data.... there is no a priori reason why an object database
    should be any faster than an ORM solution for this." Absolutely! The problem is
    with the general approach of trying to manage the associations of the object model
    and the fact that the complete object graph (which doesn't have to be a hierarchy,
    by the way) frequently is larger than the programmer wants to pull across the wire.
    (Which is another great reason to look into an in-proc solution: no wires involved.)
    This will remain a problem--pending a perfect solution, which I believe does not exist,
    since the decision whether to eager- or lazy-fetch elements or associations will vary
    on a case-by-case basis--for both the OODBMS and the O/R-M world.
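
To make the eager-versus-lazy trade-off in that last point a little more concrete, here's a minimal Java sketch. Everything in it is hypothetical: `Order`, `Lazy`, `loadLineItems`, and the `roundTrips` counter are names I made up for illustration, and the "store" is just a method that counts how often it gets called. It is not the API of db4o, Hibernate, or any other product; it only shows where the cost lands under each choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class FetchSketch {
    // Counts simulated trips to the data store (a stand-in for real I/O).
    static int roundTrips = 0;

    // Hypothetical loader: pretends to pull an order's line items from storage.
    static List<String> loadLineItems(int orderId) {
        roundTrips++;
        List<String> items = new ArrayList<>();
        items.add("order-" + orderId + "-item-1");
        items.add("order-" + orderId + "-item-2");
        return items;
    }

    // Memoizing wrapper: runs the loader on first access and caches the result.
    static final class Lazy<T> implements Supplier<T> {
        private final Supplier<T> loader;
        private T value;
        private boolean loaded = false;

        Lazy(Supplier<T> loader) {
            this.loader = loader;
        }

        @Override
        public T get() {
            if (!loaded) {
                value = loader.get();
                loaded = true;
            }
            return value;
        }
    }

    // Hypothetical domain object with one association (its line items).
    static final class Order {
        final int id;
        final Supplier<List<String>> lineItems;

        Order(int id, Supplier<List<String>> lineItems) {
            this.id = id;
            this.lineItems = lineItems;
        }
    }

    // Eager: pay for the association up front, whether or not it is ever used.
    static Order fetchEager(int id) {
        List<String> items = loadLineItems(id);
        return new Order(id, () -> items);
    }

    // Lazy: defer the cost until somebody actually asks for the line items.
    static Order fetchLazy(int id) {
        return new Order(id, new Lazy<>(() -> loadLineItems(id)));
    }

    public static void main(String[] args) {
        Order eager = fetchEager(1);
        System.out.println("after eager fetch: " + roundTrips + " trip(s)");
        Order lazy = fetchLazy(2);
        System.out.println("after lazy fetch:  " + roundTrips + " trip(s)");
        lazy.lineItems.get();
        lazy.lineItems.get(); // cached, so no additional trip
        System.out.println("after lazy access: " + roundTrips + " trip(s)");
    }
}
```

Neither method is "right": the eager version wastes a trip if nobody reads the line items, and the lazy version surprises you with a trip later, possibly at a bad moment. Which one hurts less depends on the use case, which is exactly why I don't think a one-size-fits-all answer exists.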

Gavin concludes with this:

"If you think that relational technology is
for persisting the state of your application, you've missed the point. The value of
the relational model is that it's democratic. Anyone's favorite programming language
can understand sets of tuples of primitive values. Relational databases are an integration
technology, not just a persistence technology. And integration is important. That's
why we are stuck with them."

Agreed! He makes my point for me: if you
are in a situation where the data needs to be loosely coupled from the object model,
then you need an RDBMS, and you cannot assume that the relational schema can closely
mirror the object model--which essentially makes the point that the relational schema
is the big winner in the dual schema decision (which is a perfectly fine decision
to make, so long as you accept that your object model might suffer in its "purity"
as a result). You have essentially acknowledged the dual schema problem, and chosen
to let the relational schema be the core definition. (Arguably, this is the only reasonable
decision to make if your relational schema is fixed ahead of time.)
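
For anybody who hasn't lived with the dual schema problem, here's a small sketch of what "two sources of truth" looks like in code. It's purely illustrative: the `Customer` and `Address` classes and the `CUST_NAME`, `ADDR_STREET`, and `ADDR_CITY` column names are hypothetical, and a `Map` stands in for a row in a table somebody else owns. No real O/R-M works exactly this way; the point is only that the object shape and the tuple shape are defined separately and something has to reconcile them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DualSchemaSketch {
    // Object model: the shape the developers would like to work with.
    static final class Address {
        final String street;
        final String city;

        Address(String street, String city) {
            this.street = street;
            this.city = city;
        }
    }

    static final class Customer {
        final String name;
        final Address address;

        Customer(String name, Address address) {
            this.name = name;
            this.address = address;
        }
    }

    // Relational side: one flat tuple of primitive values.
    // The column names are hypothetical, standing in for a DBA-owned schema.
    static Map<String, Object> toRow(Customer c) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("CUST_NAME", c.name);
        row.put("ADDR_STREET", c.address.street);
        row.put("ADDR_CITY", c.address.city);
        return row;
    }

    // The reverse mapping: rebuild the nested object graph from the flat tuple.
    static Customer fromRow(Map<String, Object> row) {
        Address address = new Address(
                (String) row.get("ADDR_STREET"),
                (String) row.get("ADDR_CITY"));
        return new Customer((String) row.get("CUST_NAME"), address);
    }

    public static void main(String[] args) {
        Customer original = new Customer("Ada", new Address("1 Main St", "Springfield"));
        Map<String, Object> row = toRow(original);
        Customer restored = fromRow(row);
        System.out.println(row.keySet() + " -> " + restored.name + ", " + restored.address.city);
    }
}
```

If the column names are fixed and the classes aren't, every change to `Customer` has to be checked against `toRow` and `fromRow`. That's the relational schema "winning", and it's a perfectly reasonable outcome so long as you go in with your eyes open.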



