To listen to the hype, you'd think that n-tier architectures are the greatest thing to happen to computing since the vacuum tube. Proponents of CORBA, EJB, and DCOM believe that every new application should be written, and every existing application should be retrofitted, to support their favorite spec. In the universe of distributed objects thus imagined, writing a new application is as simple as choosing objects and sending messages to them in high-level code. The distributed object protocol handles the nasty, low-level details of parameter marshaling, networking, locating the remote objects, transaction management, and so forth.
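To make that marshaling step concrete, here is a minimal sketch -- not tied to CORBA, EJB, or DCOM, and using a hypothetical Order class -- of what a stub does under the covers: serialize the arguments to bytes, ship them across the wire, and deserialize them on the far side.

```java
import java.io.*;

// Sketch of the marshaling/unmarshaling a distributed-object stub hides.
public class MarshalDemo {
    // Hypothetical parameter object; Serializable so it can be marshaled.
    static class Order implements Serializable {
        final String symbol;
        final int shares;
        Order(String symbol, int shares) { this.symbol = symbol; this.shares = shares; }
    }

    // Marshal: object -> bytes (what the stub does before the network hop).
    static byte[] marshal(Object o) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        }
        return buf.toByteArray();
    }

    // Unmarshal: bytes -> object (what the skeleton does on the server side).
    static Object unmarshal(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Order sent = new Order("IBM", 100);
        Order received = (Order) unmarshal(marshal(sent));
        System.out.println(received.symbol + " " + received.shares);
    }
}
```

In a real distributed-object system, the bytes would travel over a socket and the stub would also locate the remote object and manage the connection; the round trip above shows only the marshaling half of the bargain.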
A good example of an n-tier distributed application is a stock-trading system. In this environment, multiple data feeds (stock quotes, news, trading orders) arrive from different sources, multiple databases (accounts, logs, historical data) are accessed, and multiple clients run specialized applications. It makes sense to weave together the disparate patches in this quilt with the thread of a common distributed object architecture, like CORBA or EJB.
What's the catch? First, there's the learning curve. It takes time to learn a new API, and even more time to learn its ins and outs. Second, there's the product cost: a good standards-compliant distributed object application server, like BEA's WebLogic or IBM's WebSphere, can cost tens of thousands of dollars.
Beyond those material concerns, there are some architectural considerations militating against the rush toward distributed objects. It's hard to design objects that are truly reusable; the dream of reusing current work in future projects is often a vain one. The design and implementation effort you put into making objects reusable is often wasted, because the next project's requirements are usually different enough to force a rewrite anyway. Even more important, by leaving the safety of the three-tier architecture (UI code goes here, business logic goes there), you run the risk of designing a system that's more complex than you bargained for. That complexity can impede progress, since a careless design decision made early can have costly ramifications later on.
Next is the issue of performance. I cannot tell a lie: distributed object protocols are slow. There is no way that an application written in, say, CORBA can be as efficient across the wire as one using a custom-designed socket protocol. Therefore, if your application absolutely needs top networking speed, let your motto be DIY -- do it yourself.
CORBA is slow because it needs to be general. In this sense, its greatest strength is its greatest weakness. The time it takes to marshal and unmarshal parameters, the amount of data transmitted, and its handshaking protocols all suffer from the need for generality. A custom protocol can make more assumptions, and compress data better, leading to higher efficiency.
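As a rough illustration (the stock-quote message and its field names are invented for this sketch), compare a self-describing encoding, which tags every field with its name and type the way a general-purpose protocol must, against a custom format where both ends already agree on the layout:

```java
import java.io.*;

public class ProtocolSize {
    // Hypothetical message: a 4-character stock symbol and a price in cents.

    // Generic, self-describing encoding: every field carries its name and
    // type, so a receiver that knows nothing about the message can decode it.
    static byte[] generic(String symbol, int cents) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeUTF("symbol"); out.writeUTF("string"); out.writeUTF(symbol);
        out.writeUTF("price");  out.writeUTF("int");    out.writeInt(cents);
        return buf.toByteArray();
    }

    // Custom encoding: both ends assume the layout -- 4 ASCII bytes for the
    // symbol, then a 4-byte price -- so no metadata goes over the wire.
    static byte[] custom(String symbol, int cents) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeBytes(symbol);   // fixed 4-byte symbol
        out.writeInt(cents);      // fixed 4-byte price
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // The custom message is 8 bytes; the self-describing one is several
        // times larger for the same information.
        System.out.println("generic=" + generic("IBM ", 11825).length
                + " custom=" + custom("IBM ", 11825).length);
    }
}
```

The ratio only grows with message complexity, which is why a hand-rolled protocol can beat a general one on the wire.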
Please note that this inefficiency may be perfectly acceptable. In a well-designed system, you can make up for it by just adding more boxes. This is what people mean when they describe an architecture as scalable -- a scalable architecture is one that allows your programs to spread easily over multiple machines.
Furthermore, if using a distributed-object architecture allows you to write programs that are faster, larger, more powerful, more robust, and just generally cooler, then it's definitely worth it. If you have only two objects interacting (in a chat application, for instance), it may make sense to invent a whole new protocol for them. If you have many objects interacting, however, the number of possible pairwise connections grows quadratically -- n objects can form n(n-1)/2 pairs -- so you should probably go with an existing standard rather than a custom protocol for each pair.
Another pitfall relating to performance in n-tier systems is a little more subtle. Let's say, in a three-tier system, that you're getting millions of hits on a single Web server. You can fix the problem simply by adding more Web servers. This is called load balancing -- you have balanced the million hits between several equivalent servers. You still have a single database, in which each of the servers stores its data. This means that there's no problem if, say, one server writes data and immediately thereafter another server needs to read it. (If needed, you can load-balance the database, too.)
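A minimal sketch of the round-robin idea behind load balancing, with hypothetical server names, might look like this:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Minimal round-robin load balancer: requests are spread evenly across a
// pool of equivalent Web servers.
public class RoundRobin {
    private final List<String> servers;
    private final AtomicLong counter = new AtomicLong();

    public RoundRobin(List<String> servers) {
        this.servers = servers;
    }

    // Each call returns the next server in rotation.
    public String pick() {
        int i = (int) (counter.getAndIncrement() % servers.size());
        return servers.get(i);
    }

    public static void main(String[] args) {
        RoundRobin lb = new RoundRobin(List.of("web1", "web2", "web3"));
        for (int n = 0; n < 6; n++) {
            System.out.println(lb.pick()); // web1, web2, web3, web1, web2, web3
        }
    }
}
```

This works precisely because the servers are equivalent and stateless -- all shared state lives in the single database behind them.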
However, in an n-tier application, there are dozens or hundreds of objects interacting, running on many different host computers. If the system is slow, it's not immediately clear which objects, or which hosts, require load balancing. It takes sophisticated analysis of network traffic and log files, as well as plain old guesswork, to isolate the bottlenecks. (And even if you find the problem, you may not be able to do anything about it.) In other words, by increasing the granularity of your object design, you have limited the system's performance -- and made the bottlenecks far harder to find.
Let me explain this a different way. In the three-tier, load-balanced system, you know that each of the Web servers is making more or less optimal use of its CPU. A request arrives, and the system churns through it until it's done. (If it needs to wait for a query sent to the database to return, then it multitasks or multithreads and works on a different request.) However, if the chain of communication has to pass to several hosts on the network, then the original server may be sitting idle, waiting for a series of messages to return -- messages that are all queued up in some overloaded remote object somewhere across your network. The front-line CPUs are being underutilized, and the rear-guard CPU is maxed out. And there's no easy way to transfer the idle cycles from one machine to another. If you're unlucky, you may have to go back and redesign your system at the object level to iron out these inefficiencies.
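The multitasking point above can be sketched as follows; the slowBackendCall method and its 200 ms delay are hypothetical stand-ins for a database query or a call to a remote object:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: while one request blocks on a slow back-end call, a
// multithreaded server keeps working on other requests, so total
// wall-clock time approaches the slowest call, not the sum of all calls.
public class OverlapDemo {
    static String slowBackendCall(int id) throws InterruptedException {
        Thread.sleep(200); // simulated slow database query or remote object
        return "result-" + id;
    }

    // Runs four back-end calls concurrently and returns elapsed wall-clock ms.
    static long runOverlapped() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        long start = System.nanoTime();
        List<Future<String>> results = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            final int id = i;
            results.add(pool.submit(() -> slowBackendCall(id)));
        }
        for (Future<String> f : results) {
            f.get(); // wait for all calls to finish
        }
        pool.shutdown();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        // Four 200 ms calls overlap on four threads, so elapsed time is
        // roughly 200 ms rather than the 800 ms a serial loop would take.
        System.out.println("elapsed ms: " + runOverlapped());
    }
}
```

The catch described above is that this trick only helps if the waiting server has other runnable work; if every thread is queued behind the same overloaded remote object, the idle cycles are simply lost.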
Writing a distributed application can be fun and rewarding, but the right tool for the job is not always the latest buzzword. A developer must understand the advantages and disadvantages of many architectures before deciding on the solution to an idiosyncratic problem. For a summary of these points, see the table at the top of this article.
The authors of this month's server-side Java computing articles will be holding a free online seminar on January 13 at 11:00 a.m. PST. Register to join at http://seminars.jguru.com.