Standing on the shores of the software development industry today, we see an ocean of endless code. The average developer spends most of his or her time coding or supporting code production. Developers earn respect based primarily on their coding abilities and obtain jobs based on the languages they know; a super developer is really a super coder.
The two major productivity advances of the 1990s were both code-centric: Java, a new language to produce even more code, and the Internet, which required an abundance of code to implement. The industry's problems are rooted in code management; they include code quality (bugs), code completion time (schedules), code capacity (how much you can produce), obsolete code (legacies), and, among many others, complexity in the ever-growing bodies of code. In short, the industry is so code-centric that it sees most problems and opportunities as best resolved with more handwritten code or more efficient code-production activities, such as requirements, design, and testing.
Before we delve into the problem, consider one insight: Brilliant tactics cannot save bad strategy.
The software industry has been searching for solutions to the code productivity problem for decades, as shown by attention to software engineering lore, code production process, coding tools, and coding languages. Just as Fred Brooks predicted in his 1986 essay "No Silver Bullet: Essence and Accident in Software Engineering" (see also the sidebar at the end of this article, "A Dissenting Opinion from Fred Brooks"), nothing has helped in the range of an order of magnitude:
But as we look to the horizon of a decade hence, we see no silver bullet. There is no single development, in either technology or management technique, which by itself promises even one order-of-magnitude improvement in productivity, in reliability, in simplicity.
Guess what? We're focusing on the wrong problem. The real problem is not productivity at the code/developer level, but at the system/user level.
Code-centricity is a dangerous myth because it stifles productivity and prevents fundamental progress in the science of computation. If we don't use drastically better alternatives, the software industry runs the risk of becoming a dinosaur relic, replaced by up-and-coming industries such as biotechnology and electromechanical nanotechnology, or by new ones not yet on anyone's radar. Congruence -- how closely two things match -- is at least one way out of the tar pit of code-centricity.
Another myth is that only radical innovation can achieve order-of-magnitude advances. Not so. Slow but steady improvement over a long period of time can have the same effect. In fact, most improvement is by evolution rather than revolution. This article presents nothing revolutionary or high risk. Instead, it proposes modest evolutionary progress by using bits and pieces of today's technology, focused tightly in the right direction.
We'll explore the problem by learning from the past. Software development productivity trends are best understood if traced through successive generations, starting in the early 1950s. A new generation arose approximately once per decade. Table 1 classifies each generation by its proven productivity.
|Generation language||Description||Examples||Key limitation|
|First (1GL)||Machine code||1101 0111 0011 0010||It's hard to think in binary.|
|Second (2GL)||Assembly language|| ||Rather tedious for general use. The sample shown is from the Intel 8086 family of processors (comments not shown).|
|Third (3GL)||Procedural languages||Java, C, Python, Perl, COBOL, BASIC, Smalltalk||Low productivity; historic average of 10 lines of code (LOC) per day per developer for the first three generations.|
|Fourth (4GL)||GUI tools with direct WYSIWYG||Drag-and-drop GUI builders, Dreamweaver, Photoshop||Cannot handle most domains well because, in most cases, product does not equal image.|
Each successive generation increased productivity by an order of magnitude. What can we learn from these generations? Why did they increase productivity so dramatically? Because each generation allows higher expressiveness through closer tool congruence.
Note that I have classified generations by quantum increases in productivity, not by conventions such as language constructs. Thus the four generations you see above don't match those defined in conventional wisdom. This classification prevents false generations from fooling us into thinking we have advanced.
The problem of tool congruence
As mentioned earlier, congruence is a measure of how closely two things match. Expressiveness is a measure of how well a tool lets a user accomplish a task. We each have a mental model of anything we build, including software systems. Tool congruence is how closely a tool's external appearance and use match that mental model.
Thus, the more congruent the mental model and visual image, the more expressive a tool is. The more expressive a tool is, the more it becomes transparent to the user, and therefore the more productive it is. A well-designed tool disappears altogether if it completely merges your mental and physical activity.
For example, suppose you need to write some narrative text. Any reasonable 4GL text editor will let you do that in WYSIWYG style. This is 100 percent congruence because your mental model of paragraphs and sentences is the same as what the tool shows.
But suppose you must write a complex SQL statement with multiple tables and fancy joins. Your mental model is that of several tables with certain fields joined to other fields. This is not what a SQL line shows. Thus, using a text editor to write SQL results in a classic impedance mismatch and has low congruence. In this example, you can achieve high congruence with a GUI tool that allows modeling the SQL statement by dragging tables around and connecting fields to define joins. But not all domains can achieve congruence.
Currently, the most heavily used tools are either 3GL (such as code editors for C, Perl, Python, and Java) or 4GL (such as Dreamweaver, Visual Basic, and JBuilder). Visual Basic and JBuilder are a mixture of 3GL and 4GL. For more than 10 years, we've been stuck with 4GLs; the march of software generations has come to a roadblock.
4GLs are easy if product equals image. For example, you may need to write text, HTML, or a GUI, so the tool shows the result as it's created. When product doesn't equal mental image, 4GLs are difficult or impossible to use because, for example, they can lack the logic. Here the product is code but your mental image can include many things, such as a web of objects with associations, groups, and states. Since a code editor shows an image of text, the tool has very low congruence and expressiveness. The inevitable result is that coding requires a high skill level, goes slowly, and has frequent bugs.
Expressiveness is stuck at coding today; we've hit a wall because product doesn't equal image. Monumental attempts have been made to improve coding congruence, with modeling (like UML), unit assembly (such as IDE widgets and data icons), and even synchronized UML models and code (TogetherJ). However, we still produce and manage code, code, and more code.
For further proof of this stagnation, can you, in two seconds, name the most popular software metric? Too easy: it's lines of code (LOC). Even though function points are far more accurate, LOC predominates. In fact, the industry has made the interesting discovery that, regardless of language, the average output per developer day is about 10 LOC. How ironic to know so precisely where our field is stuck.
The chemistry field in the late 1700s encountered the same problem; it was "full of facts and techniques but no way to make sense of them" (Galileo's Commandment, edited by Bolles, 1997, page 379). Antoine-Laurent Lavoisier introduced a new way to view chemistry: he reformed the language, redefined central concepts such as "elements," and gave chemicals new names. He forced clearer thought and understanding by merely introducing new viewpoints and clarifying central tenets. We can do the same to plow through the 3GL and 4GL roadblock using today's technology.
Our task is made easier, however, since we know where to apply leverage: increase congruence. The problem of congruence is: How can we organize modern software so that its physical structure and construction are the same as its visual viewpoint?
Solving the tool congruence problem
The latest high productivity initiatives from Sun and Microsoft (the Java 2 Platform, Enterprise Edition (J2EE) and .Net) confirm that we have entered the age of highly reusable components running in container frameworks. These systems are configured not by code but by declarative knowledge (DK), such as XML or HTML, which declares what to do. Procedural knowledge (PK), such as code, tells how to do things. J2EE and similar frameworks separate DK from PK; Table 2 describes their key practices.
|Practice||Description||Why so productive?|
|Components||Software units designed for reuse in a variety of contexts.||Allows bodies of code to be loosely coupled and highly cohesive. Causes higher reuse, easier maintenance, and third-party code production. Higher reuse leads to higher quality, lower cost, and faster time to market.|
|Container framework||Software system that achieves its domain-specific behavior through the components it contains and manages.||Components no longer need to manage their own or other components' lifecycles and interactions. The system designer is also relieved of that task.|
|Standard services||A set of widely needed services available to components. Usually used directly (component managed) but sometimes transparently (container managed).||Many components need the same behaviors, called services. Standard services relieve the system designer from figuring out again and again how best to provide certain behavior.|
|Separation of DK from PK||Behavioral knowledge is broken into two types: data that declares behavioral policy (DK) and procedures that define behavior (PK).||This is the most important practice. Each PK unit (a component) is designed so that DK defines what varies in a particular reuse case. This lets you use components in a larger number of contexts than would be possible if you couldn't customize a component's behavior at system assembly time. Think of PK as "what you reuse" and DK as "reuse case policy."|
J2EE uses DK for deployment description but excludes detailed bean configuration with DK (though a bean can do this using a reference). I recommend DK for components as well as system configuration, and view systems as large components.
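To make the separation of DK from PK concrete, here is a minimal Java sketch. Everything in it is a hypothetical illustration rather than part of J2EE or of any framework described in this article: the names `DkPkSketch`, `GreeterPart`, and `parseDk`, and the DK keys `template` and `repeat`, are assumptions chosen for the example. The procedural knowledge lives in one reusable class; the declarative knowledge is plain data, held in a standard `java.util.Properties` object, that states what varies in each reuse case.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch: one reusable part (PK) configured by declarative data (DK).
final class DkPkSketch {

    // PK: the reusable procedure. It knows how to build a greeting,
    // but not which wording or how many repetitions; DK decides that.
    static final class GreeterPart {
        private final String template;
        private final int repeat;

        // The part receives its DK at assembly time.
        GreeterPart(Properties dk) {
            this.template = dk.getProperty("template", "Hello, %s");
            this.repeat = Integer.parseInt(dk.getProperty("repeat", "1"));
        }

        String greet(String name) {
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < repeat; i++) {
                if (i > 0) {
                    out.append(' ');
                }
                out.append(String.format(template, name));
            }
            return out.toString();
        }
    }

    // DK: parsed from text, as an assembly tool might write it.
    static Properties parseDk(String text) {
        Properties dk = new Properties();
        try {
            dk.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory reader
        }
        return dk;
    }

    public static void main(String[] args) {
        // Same PK, two reuse cases, no code changes between them.
        GreeterPart formal = new GreeterPart(parseDk("template=Good day, %s.\nrepeat=1\n"));
        GreeterPart eager = new GreeterPart(parseDk("template=Hi %s!\nrepeat=2\n"));
        System.out.println(formal.greet("Ada")); // Good day, Ada.
        System.out.println(eager.greet("Ada"));  // Hi Ada! Hi Ada!
    }
}
```

The point of the sketch is only the division of labor: a tool could edit the two DK strings without anyone touching `GreeterPart`.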
Congruence with visual tools is not yet emphasized in comparatively young frameworks like J2EE and .Net. Exposure to component assembly tools and study of UI design principles show that to obtain congruence, we must follow one simple strategy: use a mental model with which users already build systems. Such simplicity!
That mental model is assembling big things from little things and connecting them to make them run. We've all seen this done before. A city is basically buildings connected by roads and utilities. A house is constructed of many standardized parts (lumber, plywood, concrete, rebar, appliances, and so on) that are assembled with standardized connections (nails, screws, glue, wiring, and so on). Even a corporation is constructed of people, organized into groups, configured with training and policy, and connected with lines of authority and communication. For lack of a better phrase, I call this ubiquitous ability system assembly from parts and connections.
System assembly is inherently a visual process because we already think in spatial mechanics as we assemble or examine things. Thus, system assembly is easily shown and used in visual tools.
Solving the simpler problem
The problem now becomes much simpler: How can we organize modern software so that its physical structure and construction are those of system assembly? To borrow heavily from biology and electronics, we add only a few more key practices -- connections, anonymous collaboration, and hierarchical composition -- to the previous four. The new practices are outlined in Table 3.
|Practice||Description||Why so productive?|
|Connections||A formal element representing an interaction between two components, preferably at the method level for more fine-grained system assembly and higher reuse.||Allows the user to "wire up" a system by connecting components as desired, as opposed to components doing it themselves, as in traditional OO design. This allows component interactions to more precisely support desired functionality and much higher reuse. Connection types, such as remote versus local, are now easier to make congruent. The effect is that connections are now first-class elements, alongside objects.|
|Anonymous collaboration||Two or more parts using each other without knowing about each other, such as with messaging. Requires a mediator, such as a framework. Best done with connections.||Suppose Part A uses behavior in Part B. If Part A has a reference to Part B, then you can only reuse Part A if Part B (or anything with the same interface) is present. If Part A instead sends a message that is handled by Part B, then you can reuse Part A without Part B because other parts, without knowing Part B, can handle the message. In short, this greatly enhances loose coupling.|
|Hierarchical composition||Organization of parts into a system tree of containers and leaves. Uses the Composite pattern.||Allows very intuitive large system understanding, navigation, and management. Lets you assemble the large from the small. Best of all, if you use anonymous collaboration, then reuse is a byproduct of system construction, because you may reuse (harvest) any branch in the system tree as a component.|
Voilà! Problem solved via a borrowed solution, system assembly. The solution gets a tentative name and acronym, system imagery (SI), and, just like that, it becomes the long sought after fifth-generation tool (5GT). You can build SI with today's technology, with no pie-in-the-sky promises like the previous fifth-generation contender, AI.
But, first, you may have noticed a flaw in SI: it only works if most software development requires no new code. So far, such a situation has never been achieved. In my experience, however, carefully architected collections of components built with PK can be configured with DK to handle a large variety of contexts, provided you combine strict separation of DK from PK, anonymous collaboration (messaging) between parts, building most parts from parts, hierarchical composition of systems, and domain-neutral foundational layers (J2EE and .Net are not domain neutral). This allows new systems to be created with no code, except for the necessary new parts not yet in inventory.
Strategic proof lies in what biology has accomplished with DNA and millions of species. All plants, animals, and bacteria are built with DNA and a mere 20 different amino acids. Using the information in DNA, different arrangements of amino acids are composed into long chains, which are proteins; proteins are arranged into cells; cells are arranged into groups; groups are arranged into larger groups; and so on, all the way up to a species instance. At the abstract level, the DK of DNA defines how to build the system from minimal parts. SI can follow the same strategy of using DK and a reasonably small number of parts to hierarchically construct surprisingly complex systems. This proves that SI's general strategy is valid, though finding the first functional technique will take many iterations.
Until you've designed or used many DK-driven parts using anonymous collaboration, you may have no knowledge of their nature. For example, many applications have multistep procedures like checkout, order entry, questionnaire, or purchase-order approval. DK could specify the order of steps, and a reusable WorkFlow part could manage them. WorkFlow would receive an anonymous message when another part completed its step. It would then determine the next step, if any, and send a message out to start that step. SI's infrastructure makes this reuse scenario easy, compared to the object-oriented development approach. As soon as behavior patterns such as WorkFlow are spotted, they are embodied into high-reuse parts. In this manner, large amounts of intellectual property flow into part inventory and DK, rather than remaining in people's heads or buried deep in application-specific code.
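The WorkFlow scenario can be sketched in a few lines of Java. This is an illustration under stated assumptions, not the implementation used in any SI tool: the class names `WorkFlowSketch` and `WorkFlowPart`, the message keys `completed` and `start`, and the use of a `Map` for messages and a `Consumer` for the outgoing connection are all hypothetical. The ordered list of steps plays the role of DK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of a reusable WorkFlow part driven by DK (an ordered step list).
final class WorkFlowSketch {

    static final class WorkFlowPart {
        private final List<String> steps;                // DK: the order of steps
        private final Consumer<Map<String, String>> out; // where "start" messages are sent

        WorkFlowPart(List<String> steps, Consumer<Map<String, String>> out) {
            this.steps = new ArrayList<>(steps);
            this.out = out;
        }

        // Some other part reports that a step completed.
        // The WorkFlow never learns who sent the message.
        void onStepCompleted(Map<String, String> message) {
            int index = steps.indexOf(message.get("completed"));
            if (index < 0 || index + 1 >= steps.size()) {
                return; // unknown step, or the last step: nothing more to start
            }
            Map<String, String> next = new HashMap<>();
            next.put("start", steps.get(index + 1));
            out.accept(next);
        }
    }

    public static void main(String[] args) {
        List<String> started = new ArrayList<>();
        WorkFlowPart checkout = new WorkFlowPart(
                Arrays.asList("cart", "shipping", "payment", "confirm"),
                msg -> started.add(msg.get("start")));

        Map<String, String> done = new HashMap<>();
        done.put("completed", "cart");
        checkout.onStepCompleted(done);
        System.out.println(started); // [shipping]
    }
}
```

Reusing the part for order entry or purchase-order approval would mean supplying a different step list, not writing a different class.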
SI: A new development model
SI allows three main specialties in software development:
- System assemblers: Using advanced, easy-to-use tools and a vast part vocabulary, users and developers assemble systems in desired domains. Behind the scenes, the tools edit DK. New parts are ordered, if needed. Assembly eventually becomes the dominant activity, taking more than 99 percent of our time. (Assemble systems and assembly are other terms for SI software development, and include system creation, maintenance, and customization.)
- Part builders: New base parts are created with PK (code) or composed from old base and composite parts with SI. As each new domain is entered, new parts are needed. But these become fewer and fewer until an asymptote is reached. This asymptote should be quite low, since, for example, we can talk and write about most new domains without creating new English words. Custom part configurators (like property sheets) are needed for some parts, and are built with SI.
- Tool builders: Generic tools to assemble systems are built using SI. This is a case of infinite recursion: SI is used to build SI when new SI tools are built using old SI tools. There is a small amount of handcoding in the container framework's domain-neutral layer. We can expect some domain-specific tools.
In SI, coding is minor because only new base parts and the framework's domain-neutral layer are handcoded; the rest of the process involves editing DK with tools. Some automatically generated code -- such as remote stubs, skeletons, part wrappers, database facades, and optimization -- will exist as well.
Gone is the concept of glue code. The original, naive intent of glue code was to stitch together reusable chunks of code with small amounts of finely crafted code. In practice, it takes large amounts of code to achieve this. But the strategy was sound: stick reusables together with something to assemble systems. That role is now played by DK instead of glue code.
DK is data, not logic, so by nature it's more amenable to editing with visual tools in a look-and-choose, WYSIWYG manner. In fact, DK is so much easier to edit with WYSIWYG tools than code that, with today's technology, you can achieve high tool congruence in all domains only if your tools edit DK.
Fair warning: It takes a new mindset to build ultra-high-reuse, configurable software part suites. For example, you must design DK before PK, compose most parts from smaller ones, use anonymous collaboration instead of direct object method calls when coding base parts, and offer remote and local part versions when necessary. This is a whole new world for most developers.
SI has gone beyond a "language," and has surmounted the limiting concept of "language generations." No longer can we think in terms of system LOC or LOC per day. Instead we have a cohesive set of practices implemented as a single technology, presenting a unified face to the user in the form of a realtime, easy-to-use GUI system assembly editor. This is not to be confused with an IDE, because it has no facilities for coding parts, which are developed separately.
To summarize, SI is a true 5GT. It is characterized by users or developers using natural mental models to visually understand, assemble, maintain, and customize software systems. There is no coding except when new base components are built; this happens much less frequently than assembly. Since product equals image, we achieve high congruence and expressiveness. This has the clear potential to increase developer productivity as much as any software generation has.
I have more good news: SI empowers all users, not just developers, to become tool builders. (A system you have built is a new tool.) That fact alone has the potential to increase productivity by an order of magnitude. But SI is not a miracle. Complex systems or those requiring tricky standards due to the need to support many users will still need development specialists, with the option of managing or being managed by users who do most of the system assembly. Actually, what we may need most is for users to customize their software to suit their own needs.
The conceptual whole
Pursuit of congruence has led us to seven key practices: components, container framework, standard services, separation of PK from DK, connections, anonymous collaboration, and hierarchical composition.
Now for the next step: merging the practices into the best possible design. This is the hardest step because it requires new abstractions, exploits the multiplier effect, and can be done in a million ways, most of which are wrong. Fortunately, my experience with three generations of experimental system assembly tools (Bean Assembler (BA), Ultra High Reuse (UHR), and Visual Circuit Board (VCB); see Resources) provides a wealth of guidance.
The multiplier effect is the net result of two or more collaborating elements. It is also called leverage or the gestalt whole. With good design, the resulting whole is much greater than the elements working alone. Bad design will result in a modestly greater outcome, or even a result inferior to that achieved by the units working alone.
To maximize the effect of these practices, we must melt them into a conceptual whole that best multiplies the effects of individual practices. The result is a minimum architecture concept map. This map clearly shows the conceptual whole of key practices working together, the roles that we must play, and how we can accomplish SI with architectural simplicity. Figure 1 demonstrates the map.
Let's look at each item in the figure in turn:
- Components: Labeled Part on the diagram. This provides the PK for the system domain.
- Container framework: Everything in yellow. The main plug points are the Part and Standard Service interfaces, just as in J2EE. The main difference is that the Part interface has methods for pins and for receiving DK.
- Standard services: Labeled as such. This concept is identical to J2EE's standard services.
- Separation of PK from DK: DK is pink, PK is yellow. Note that DK varies the mission of parts, standard services, and the engine, which is all it takes to define a system. Think of DK as the element from which a system emerges.
- Connections: A link from a part outpin to a part inpin. A part has zero or more inpins and outpins. To keep the diagram simple, outpins and inpins are labeled Pins. A Pin is just like a pin on an electronic part, while a link is like a wire connecting two pins on two different electronic parts.
- Anonymous collaboration: Done with Message, Pin, and Link using the Command and Observer patterns. SI parts collaborate with other parts anonymously by sending and receiving messages. These messages flow through links defined in system assembly. A message is like a collection of key values, such as a hashtable. A part never knows where a message came from or where it's going.
- Hierarchical composition: Container and Part using the Composite pattern. A system has a root container. A container has zero or more parts. A part can be a leaf or container, so a system is a tree of parts. Messages flow along the tree's "branches" from part to part.
Plus: Tools are needed to accomplish the practices efficiently. Tools are SI systems created by SI tools, except for the primordial tools where DK was edited by hand.
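The hierarchical composition item above can be illustrated with a small Composite pattern sketch in Java. It is only a sketch: the class names `PartTreeSketch`, `Part`, and `Container` and the methods `size` and `render` are assumptions made for the example, and the part names `Sensor` and `Logger` are invented (`Device1` and `ConfigStore` are borrowed from the article's Figure 2 purely as labels).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of hierarchical composition using the Composite pattern.
final class PartTreeSketch {

    // A part is either a leaf or a container of more parts.
    static class Part {
        final String name;

        Part(String name) {
            this.name = name;
        }

        // Leaves have no children; Container overrides this.
        List<Part> children() {
            return Collections.emptyList();
        }

        // Count this part plus everything beneath it.
        final int size() {
            int total = 1;
            for (Part child : children()) {
                total += child.size();
            }
            return total;
        }

        // Render the subtree as indented text, one part per line.
        final String render(String indent) {
            StringBuilder sb = new StringBuilder(indent).append(name).append('\n');
            for (Part child : children()) {
                sb.append(child.render(indent + "  "));
            }
            return sb.toString();
        }
    }

    static final class Container extends Part {
        private final List<Part> parts = new ArrayList<>();

        Container(String name) {
            super(name);
        }

        Container add(Part part) {
            parts.add(part);
            return this;
        }

        @Override
        List<Part> children() {
            return parts;
        }
    }

    public static void main(String[] args) {
        Container device = new Container("Device1")
                .add(new Part("Sensor"))
                .add(new Part("Logger"));
        Container root = new Container("Root")
                .add(new Part("ConfigStore"))
                .add(device);
        System.out.print(root.render(""));
        // Any branch, such as device, is a complete subtree that could be harvested for reuse.
        System.out.println(device.size()); // 3
    }
}
```

Because a container is itself a part, the same operations work on the whole system or on any branch of it.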
Very briefly, here's how it works. A system is really just a root container, which contains many parts. Some parts are containers of more parts, so a system is a tree of containers and parts. Parts and the engine use services to do common types of work, such as transactions, persistence, and concurrency management. Parts use other parts indirectly (anonymous collaboration) by sending and receiving messages. A message travels out of a part's outpin, along a link defined by the system assembler, to another part's inpin. Pins can have many links, which means messages are multicast.
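The message flow just described can be sketched in Java as follows. This is a hedged illustration, not the article's actual framework: the names `PinLinkSketch`, `InPin`, `OutPin`, `SensorPart`, and `RecorderPart`, and the message key `value`, are all hypothetical. Messages are modeled as a `Map`, following the article's description of a message as a collection of key values, and links are simply the list of inpins attached to an outpin.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: anonymous collaboration through pins and links.
final class PinLinkSketch {

    // An inpin is anything that can receive a message (a map of key values).
    interface InPin extends Consumer<Map<String, Object>> { }

    // An outpin multicasts each message to every inpin linked to it.
    static final class OutPin {
        private final List<InPin> links = new ArrayList<>();

        // Called by the system assembler (or a tool), never by the parts themselves.
        void linkTo(InPin target) {
            links.add(target);
        }

        void send(Map<String, Object> message) {
            for (InPin target : links) {
                target.accept(message);
            }
        }
    }

    // A part that announces readings without knowing who listens.
    static final class SensorPart {
        final OutPin reading = new OutPin();

        void measure(double value) {
            Map<String, Object> msg = new HashMap<>();
            msg.put("value", value);
            reading.send(msg);
        }
    }

    // A part that records whatever arrives, without knowing who sent it.
    static final class RecorderPart {
        final List<Object> seen = new ArrayList<>();
        final InPin record = msg -> seen.add(msg.get("value"));
    }

    public static void main(String[] args) {
        SensorPart sensor = new SensorPart();
        RecorderPart log = new RecorderPart();
        RecorderPart audit = new RecorderPart();

        // Assembly: the wiring is done outside the parts.
        sensor.reading.linkTo(log.record);
        sensor.reading.linkTo(audit.record);

        sensor.measure(21.5);
        System.out.println(log.seen + " " + audit.seen); // [21.5] [21.5]
    }
}
```

Neither part holds a reference to the other's type; removing or adding a link changes system behavior without changing either class.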
To users, SI consists of tools, systems, and parts. They use tools to assemble parts into systems and use systems to do work. The tools edit DK, not code. This lets DK vary the mission of parts, services, and the engine in different reuse cases. That's the real magic -- strict separation of DK from PK, along with anonymous collaboration, lets you rearrange or configure the same parts in an infinite number of ways, to build an infinite number of types of systems. All this gives the system assembler incredibly high expressiveness and reuse, which is what makes SI so productive.
You may have a fuzzy conception of how this architecture could work. Figure 2 shows a small example of a system's root container.
The root container shows the topmost view of the entire system with all subelements encapsulated as parts. Figure 2 does not show additional tools such as part inventory, animation, or a part configurator (like a property sheet). The container has four parts, connected with four links. Inpins, such as Read, are left justified; outpins are right justified. Notice how links go from an outpin to an inpin. At the moment, in Figure 2, the user is animating the system to understand its dynamic behavior. A message has left the Start outpin in ThisDevice and traveled to ConfigPart1, which acted internally and then sent a message out of GetConfig. The message is currently paused at the small blue rectangle, which in the tool is on a moving ant line link. The down arrow on Device1 signifies that you can drill down into it because it's a container. The lightning bolts signify pins on which the user can double click to understand or test various behaviors. The ConfigStore part is selected, and could be deleted, copied to the clipboard, stretched to a different size, or dragged to a new location with its links following.
This is merely one possible implementation of the above architecture. It shows that you can use this architecture to design an intuitive SI GUI. It also demonstrates that system assembly is a totally different, higher-level way to work compared to coding. It should also clarify what containers, parts, pins, links, and messages are. This first pass at SI's key aspects will need refinement, but the guiding principle of congruence, the seven key practices, and the above architecture provide a solid initial foundation.
Moving on after decades of code-centricity
The example of SI destroys the myth of code-centricity by providing at least one better alternative. But will the software industry wake up and adopt a major new direction? Sadly enough, history shows that most industries don't. Instead they deny their fundamental problems and continue with head-in-the-sand behavior. It is too soon to tell what will happen in this case.
In 5GTs like SI, the emphasis is no longer on code. It's on how well the user interface can match the user's mental model and work habits. In a nutshell, emphasis is now firmly on congruence. We need to start at the user's mental model, work backward to the user interface, and work still further backward to the software architectures needed to support congruence. Please note that this is not the same as UI-centricity, because we start at the mental model and overhaul the architecture to match that model as transparently as possible.
Some may claim that SI is still code-centric. Rubbish. SI users (including developers) will talk excitedly about favorite parts, tools, clever connections, systems they assembled in a day, cool sample systems, and part patterns on the Net. More than 99 percent of our time will be at the abstract level of what our tools are showing us.
For my vision of the path SI will blaze to post-fifth-generation tools, check out the "Vision of the Post-Fifth-Generation Tool" sidebar at the end of this article.
100 percent pure happiness
A note about what matters most: not only is productivity a problem, so is human happiness.
How many times have you heard someone gripe about his or her software? How many software tools have you grown to "love to hate" because of bugs, low usability, broken promises, and high prices? How many user communities do you suspect lack the tools they need and suffer in a make-do fashion, because what they need is not profitable for vendors? How often does programmer burnout strike? How many users and programmers are frustrated rather than happy? How much of this is due to consciously or unconsciously having to fight with an incongruent tool?
Someday software use will be as common as a smile, as computers penetrate every detail of our lives, in all countries and economic levels. Should system creation be limited to highly trained developers or should computer literacy encompass the ability to create most of your own systems?
Learn more about this topic
- Fred Brooks' 1986 essay "No Silver Bullet: Essence and Accident in Software Engineering" is reprinted in chapter 16 of the anniversary edition of The Mythical Man-Month (Addison-Wesley, July 1995). Chapter 17 is Fred's own review of the effect of that article.
- The Bean Assembler (BA) was the first generation
- Ultra High Reuse (UHR) was the second generation
- Visual Circuit Board (VCB) is the third generation
- ClickBlocks (see sidebar)