C# (pronounced "C sharp") is Microsoft researcher Anders Hejlsberg's latest accomplishment. C# looks astonishingly like Java; it includes language features like single inheritance, interfaces, nearly identical syntax, and compilation to an intermediate format. But C# distinguishes itself from Java with language design features borrowed from Delphi, direct integration with COM (Component Object Model), and its key role in Microsoft's .Net Windows networking framework.
In this article, I will examine common motivations for creating a new computer language, and speculate on which might have led to C#. Next I will introduce C# with regard to its similarities to Java. Then I will discuss a couple of high-level, fundamental differences in scope between Java and C#. I close the article by evaluating the wisdom (or lack thereof) in developing large applications in multiple languages, a key strategy for .Net and C#.
Currently, C# and .Net are available only as a C# language specification (not yet in final form), a "pre-beta SDK Technology Preview" for Windows 2000, and a quickly growing corpus of articles on MSDN. This article is based on those resources and some of my own speculation.
Read the whole series, "C#: A Language Alternative or Just J--?":
- Part 1. What the new language for .Net and post-Java Microsoft means to you
- Part 2. An in-depth look into the semantic differences and design choices between C# and Java
Imagine you're creating a new computer language, and you want to solve some of the traditional problems for C and C++ programmers: memory leaks, difficulty writing multithreaded applications, static linking, illegal pointer references, overly complex multiple-inheritance rules, and so on. To flatten the learning curve, you design the language to look a great deal like C and C++. Then you add garbage collection, integrated thread interlocking, and dynamic linking, you throw out pointers, you allow only single inheritance but introduce the concept of an interface, and so on. Five years ago, Sun Microsystems introduced Java technology, which did those things and was platform-neutral, to boot.
In June 2000, Microsoft preannounced C#, which was designed expressly for its nascent .Net application development framework. In addition to C#, the immensely talented Hejlsberg created the revolutionary languages Turbo Pascal and Delphi while at Borland, but also the counterrevolutionary Visual J++ while at Microsoft. C# and Java address many of the same problems with C and C++. In fact, C# looks so much like Java that you could very easily confuse them.
So why create C# at all? Is C# a "Java wannabe?" Since Microsoft obviously needs to deal with the Visual J++ developers it has left stranded, is C# just "Visual J--"; that is, Java with some new features and without the Sun logo, trademark, and narrow-eyed lawyers? Or is C# a technology that gives Windows developers the functionality of Java, could possibly compete directly with Java, and is useful in its own right?
It's easy to be skeptical of C#, given its almost surreal similarity to Java in syntax, design, and even runtime behavior. It looks almost as if, having failed to corrupt the Java marketplace with proprietary extensions and strategic omissions, Microsoft has simply created a copy of Java, with a new name and a familiar market approach. This is at least not entirely the case: in the context of COM and .Net, C# may well have a place in the world of Windows development.
Motivation for creating a new language
A new computer language could be created as part of a research project, to explore new system architectures or new ideas in programming semantics, or to pull together advances from several other language projects to produce a more powerful language. Innovations in computer technology often change basic assumptions about programming and system development, and new languages arise to take advantage of new ideas. Special applications sometimes require new languages, which are tied intimately to the domain in which they operate. General-purpose languages, however, are usually created either to address existing languages' inadequacies, to fill some business need, or both.
For example, C++ was created as an extension of the C programming language, and was originally called "C with classes." Though innovative and extremely powerful, C suffered from problems with scalability, code fragility, and memory management complexity, among others. C++ was created as an object-oriented approach to solving those problems.
C++ has been widely accepted as a system development language, but its "improvements" came at the cost of increased complexity. C, and to a lesser extent C++, are widely considered to be highly portable, exemplified by the portability of the Unix operating system.
Portability between processors is different from portability between underlying operating system APIs. Different operating systems factor access to similar system services differently. The resulting "impedance mismatch" (to appropriate a lousy metaphor) creates a layer of complexity and potential software flaws in the software layer where the application accesses system services. Anyone who has tried to create, for example, a GUI framework portable across platforms understands this problem.
Java was created, in part, to address the issues of language complexity, memory management, and cross-platform portability. Java also addresses the business needs of consumers and companies who want to leverage their existing hardware assets, instead of being locked into a particular platform by an operating system vendor. Finally, the rise of the Internet and the ubiquity of network computing make cross-platform portability and airtight security even more important.
C#, announced by Microsoft but not yet released, addresses technical and business problems that Microsoft has recently encountered. Despite several attempts at simplification, the COM object programming framework has never been easy to use, and DCOM (Distributed Component Object Model) adds yet another layer of difficulty. Thus, COM development has been mostly limited to highly trained (and expensive) Windows C/C++ programmers, and Visual Basic users who have taken the time to learn to use a stripped-down interface to COM. The C and C++ languages alone require a great deal of skill to be used effectively and safely; Visual Basic has some object-oriented-like features, but is not a true object-oriented language.
When Java burst onto the scene in 1995, it grabbed an enormous amount of mindshare from Microsoft; people started to talk about a world where the operating system underlying an application was irrelevant. Java looked so much like C and C++ that existing programmers came up to speed in record time. Java also provided cross-platform portability at the operating-system level and addressed many problems that had limited the productivity of C and C++ programmers.
Microsoft initially embraced Java as a language that solved problems with C and C++ while maintaining the training assets of the existing C and C++ programmer base. Unfortunately, Microsoft found that when it tried to extend Java in Visual J++ and tie it more closely to the Windows operating system, Sun hit Microsoft with a lawsuit (see Resources) for violating the terms of its licensing agreement. As a result, Microsoft dumped its Visual J++ product (as well as the developers it had attracted to the tool). There was talk last year of a possible new Microsoft language called Cool, which Microsoft did not acknowledge. Rumor has it C# is that language. (Microsoft still sells Visual J++, but there has not been a new release since October 1998 and Visual J++ has no place in the .Net platform. Java is being integrated into .Net by a separate vendor.)
So what kind of language has Microsoft created? The next section discusses C# in terms of its similarity to Java, since an understanding of Java is common to most JavaWorld readers.
C# and Java similarities
In the grand tradition of programming tutorials that began with C, my comparison of Java and C# begins with a familiar "Hello, world!" example. The code for this multilingual example appears in Table 1.
The similarities between these two simple programs are obvious. Both encapsulate their main function, which is static, within an enclosing class. Both access a global name, System, that wraps access to system services. The similarities do not end with source code: Java, as you probably know, compiles to byte code -- operation codes in the instruction set of the Java Virtual Machine. C# compiles to MSIL (Microsoft Intermediate Language, formerly known as portable binary format), an intermediate, assembly-like language to which all .Net languages compile. MSIL could easily be called "Windows byte code"; however, just-in-time (JIT) compiling is only one of its design goals. MSIL's design was influenced heavily by the design goal of language interoperability. (Learn more about this in the section entitled Intermediate Language below.)
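As a rough illustration of the structure just described, the Java side of such a "Hello, world!" comparison might look like the sketch below. The class name HelloWorld and the helper method greeting() are assumptions made for illustration and are not taken from Table 1; the C# counterpart is indicated only in a comment.

```java
// A minimal sketch; HelloWorld and greeting() are illustrative names.
class HelloWorld {
    // Hypothetical helper so the message can be checked separately from printing.
    static String greeting() {
        return "Hello, world!";
    }

    // The entry point is a static method enclosed in a class. In C#, the
    // comparable entry point would be a static Main() method that calls
    // System.Console.WriteLine(...).
    public static void main(String[] args) {
        // System is a class in java.lang; out is its static PrintStream field.
        System.out.println(greeting());
    }
}
```

In both languages there is no free-standing main function as in C; the entry point must live inside a class.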
Usage of code external to a module is handled similarly in Java and C#. Java uses the import keyword to declare references to external names; C# provides the using keyword, as shown in Table 2.
The two keywords work in a similar manner; both allow you to use names from another compilation unit without fully specifying the name. Neither C# nor Java uses the C preprocessor construct #include, because the reference to the external module is at a logical, not a lexical, level. This means the external reference is resolved at link time, as well as at compile time. This has special significance for C#, since it allows modules to subclass and operate with modules written in other languages.
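The effect of import on name resolution can be sketched in Java as follows. The class ImportDemo and its two methods are hypothetical names chosen for this illustration; the point is only that the short name and the fully qualified name refer to the same class, with no textual inclusion involved.

```java
import java.util.ArrayList; // lets the short name ArrayList stand for java.util.ArrayList

// A minimal sketch; ImportDemo and its methods are illustrative names.
class ImportDemo {
    // Uses the short name made available by the import declaration.
    static Class<?> shortName() {
        return ArrayList.class;
    }

    // Uses the fully qualified name; this form needs no import at all.
    static Class<?> qualifiedName() {
        return java.util.ArrayList.class;
    }

    public static void main(String[] args) {
        // Both names resolve to the same class object.
        System.out.println(shortName() == qualifiedName());
    }
}
```

The import declaration copies no source text into the file. It only tells the compiler how to resolve a name, which is why the qualified form works without it.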
The difference between import in Java and using in C# is that Java has a concept of packages, which has a specific meaning in the context of symbol accessibility, while C# uses namespaces much like those of C++. The using keyword makes all names in the given namespace accessible to a module. So, the line using System; lets you access the .Net runtime namespace System. The System namespace contains the static global method System.Console.WriteLine(), which is accessible as Console.WriteLine() without specifying the System namespace. (Compare Tables 1 and 2.) In the Java example, System is a class defined in java.lang, which is implicitly imported into every Java source file; therefore, the import statement is not needed. However, including import java.lang.System.*; does not permit you to omit the System qualifier from System.out.println as in C#, because System is a class, not a namespace. Thus, external names are referenced in a way that seems similar, but has different underlying mechanisms. This difference could be more confusing to programmers accustomed to Java than to C++ programmers who understand and use namespaces. Neither option is more expressively powerful; the two languages simply use different mechanisms to disambiguate names.
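The distinction between a class and a namespace can be made concrete with a short Java sketch. It relies on static imports, a feature added to Java in version 5, well after the language version discussed in this article, and it is used here only to show that out is a static member of the class System. The class name SystemIsAClass and its methods are hypothetical.

```java
import static java.lang.System.out; // static import (Java 5+): names one static member of the class System

// A minimal sketch; SystemIsAClass and its methods are illustrative names.
class SystemIsAClass {
    // No import is needed for System: java.lang is implicitly imported
    // into every Java source file.
    static String systemClassName() {
        return System.class.getName();
    }

    // out is a static field of the class System, not a member of a namespace,
    // so the short and qualified forms name the same object.
    static boolean sameStream() {
        return out == System.out;
    }

    public static void main(String[] args) {
        out.println(systemClassName());   // short form enabled by the static import
        System.out.println(sameStream()); // usual qualified form
    }
}
```

An ordinary type import such as import java.lang.System.*; still does not shorten System.out.println; only the static import, which names a member of the class, does.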
Simple statements in C# and Java look alike, since both languages descend primarily from C and C++. Table 3 presents common language constructs in C# and Java.