C#: A language alternative or just J--?, Part 2

The semantic differences and design choices between C# and Java

In Part 1 of this series, I covered the similarities between C# and Java, explained C#'s role in .Net, speculated on the possible outcomes of using multiple languages in application development, and looked at some high-level differences between Java and C#. This article covers C# language constructs in depth, points out various language features, and provides short pieces of sample code that illustrate C# programming principles. It concludes with a rant on C# and the tension between proprietary interests on one hand and open standardization on the other.

C# features

Some of the differences between C# and Java are simply cosmetic. In particular, there's the question of capitalization. I don't know for sure, but I'm guessing that the capitalization conventions in C# come from Delphi, as I suspect the WriteLine method also does. The capitalization in C#, with its Main() method and string built-in type, feels odd to the Java programmer. Are these conventions simply lifted from Delphi, or are they used rather to make C# look, on the surface, less like Java? If the reason was the latter, I'm afraid it wasn't very successful. The effect is rather like putting lipstick on your dog. He's still your dog, only with lipstick, and he doesn't necessarily look any better or worse for it.

Fortunately, C# has several features that go beyond the cosmetic. Some of these, such as enumerated values ("enums"), are simple "syntactic sugar," providing self-documentation and arguably clearer source code. Other features, such as delegates and events, are quite useful (though implemented in a confusing way) and provide functionality built into the language that, in Java, requires coding and a nontrivial understanding of Java design patterns.

This section will cover several interesting C# features.

Type system

Several features of the C# type system are interesting. Primitive values can always be treated as objects. There are a few class member types (properties, structs, enumerations, and delegates) that do not exist in Java. C# arrays work differently from Java arrays. Operators are overloadable in a fashion similar, but not identical, to that of C++. You can access indexed collections through a mechanism called an indexer, which works similarly to an operator.

Primitive/object unification

One feature of the C# type system that is bound to be popular is the use of all primitive types as objects. In fact, you can use primitive literal values, such as strings and integers, as if they were objects, without first constructing a String or Integer object in code, as is necessary in Java. So, for example, the Java code:

Integer iobj = new Integer(12);
System.out.println(iobj.toString());

could be expressed in C# more clearly as:

Console.WriteLine(12.ToString());

You'll notice in this example that the literal 12 can be used as an object. Since every type in C# derives from class object and every literal value is an object in its own right, it is always possible to call any method of class object against a primitive variable or primitive literal value. Many Java programmers feel that the distinction between primitive value objects and primitive values is an unfortunate one in Java, and the designers of C# apparently agree. The processes of automatically converting a primitive value into an object and vice versa in C# are called boxing and unboxing, respectively.
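For comparison, here is a minimal Java sketch of the same two conversions written out by hand. The class name BoxingDemo and its helper methods are hypothetical, and Integer.valueOf stands in for the new Integer(12) constructor shown above, which later Java releases deprecate.

```java
// Hypothetical demo class; the names are illustrative only.
class BoxingDemo {
    // Wrap a primitive int in an Integer object (boxing, done explicitly).
    static Integer box(int value) {
        return Integer.valueOf(value);
    }

    // Pull the primitive int back out of the wrapper (unboxing, done explicitly).
    static int unbox(Integer wrapped) {
        return wrapped.intValue();
    }

    public static void main(String[] args) {
        Integer iobj = box(12);
        System.out.println(iobj.toString());  // prints 12
        int back = unbox(iobj);
        System.out.println(back + 1);         // prints 13
    }
}
```

The wrapper object and the primitive remain two distinct types here, which is the distinction the C# type system removes.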

Array types

Array types differ somewhat between C# and Java. Both languages support simple and multidimensional arrays. In both languages, multidimensional arrays can be arrays of arrays, as shown in Table 1.

Java                                      C#

int[] arrayOfInt = new int[10];           int[] arrayOfInt = new int[10];

int[][] multiArray = new int[10][3];      int[][] multiArray = new int[10][];
                                          for (int i = 0; i < 10; i++) {
                                             multiArray[i] = new int[3];
                                          }
                                          // Valid in both C# and Java

int[][] multiArray2 = new int[][] {       int[][] multiArray2 = new int[3][];
   new int[] { 1 },                       multiArray2[0] = new int[] { 1 };
   new int[] { 1, 2, 1 },                 multiArray2[1] = new int[] { 1, 2, 1 };
   new int[] { 1, 3, 3, 1 }               multiArray2[2] = new int[] { 1, 3, 3, 1 };
};

Table 1. Java and C# array definition and initialization

Table 1 shows the differences in initialization syntax for Java and C# arrays. Oddly, Java's initialization syntax is more compact than C#'s, but C# has rectangular arrays that Java lacks. The following example shows the definition of two rectangular arrays in C#:

int[,] r2dArray  = new int[,] { { 1, 2, 3 },
                                { 2, 3, 4 } };
int[,,] r3dArray = { { { 1, 2 },
                       { 3, 4 } },
                     { { 5, 6 },
                       { 7, 8 } } };

You'll notice that the second example, r3dArray, uses an acceptable shorthand (omitting new int[,,]) to initialize the array. The C# specification states that the element type and the number of dimensions define the array type, but the size of each dimension does not. For example, int[4][3][5] and int[3][2][1] are the same type, because they have the same element type and number of dimensions.

Arrays are reference types, meaning they do not copy on assignment. All arrays inherit from the System.Array base class, and therefore inherit all of that class's properties, including the Length property. Since C# arrays are collection types, you can use them with the C# foreach keyword.
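Java arrays are reference types as well, so the no-copy-on-assignment behavior can be sketched in Java. The class and method names below are hypothetical. The length field and the enhanced for loop (added in Java 5) play roughly the roles of C#'s Length property and foreach.

```java
// Hypothetical demo class; the names are illustrative only.
class ArrayReferenceDemo {
    // Returns true if a write through the alias is visible through the original.
    static boolean assignmentShares() {
        int[] original = { 1, 2, 3 };
        int[] alias = original;         // copies the reference, not the elements
        alias[0] = 99;
        return original[0] == 99;
    }

    // Returns true if a write to an explicit copy leaves the original alone.
    static boolean cloneIsIndependent() {
        int[] original = { 1, 2, 3 };
        int[] copy = original.clone();  // allocates a new array with the same elements
        copy[0] = 99;
        return original[0] == 1;
    }

    // Iterates over the elements without an index variable.
    static int sum(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(assignmentShares());          // prints true
        System.out.println(cloneIsIndependent());        // prints true
        System.out.println(sum(new int[] { 1, 2, 3 }));  // prints 6
    }
}
```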

Indexers

A class's indexer lets you access any instance of that class as if it were an array. A class may define multiple indexers, each of which differs by the number and type of its arguments. Indexers are very similar to properties, especially in the syntax for defining them, as shown in the following code example:

public class Sparse2DArray {
   private double[][] _values;   // rows are allocated only when first written
   private int _i, _j;           // logical dimensions of the matrix
   public Sparse2DArray(int i, int j) {
      _i = i;
      _j = j;
   }
   public double this[int i, int j] {
      set {
         // Check bounds
         if (i < 0 || i >= _i || j < 0 || j >= _j) {
            throw new IndexOutOfRangeException();
         }
         // Create values matrix if it doesn't exist
         if (_values == null) {
            _values = new double[_i][];
         }
         // Create i'th row if it doesn't exist
         if (_values[i] == null) {
            _values[i] = new double[_j];
         }
         _values[i][j] = value;
      }
      get {
         // Check bounds
         if (i < 0 || i >= _i || j < 0 || j >= _j) {
            throw new IndexOutOfRangeException();
         }
         // Sparse matrix is zero where no values exist
         if (_values == null || _values[i] == null) {
            return 0.0;
         } else {
            return _values[i][j];
         }
      }
   }
};

The indexer is defined with a syntax similar to the definition of a method, with the text: public double this[int i, int j]. The set block for the indexer (everything inside set { ... }) creates the private _values array, or any subarray of that array, as needed to store the value passed to the indexer. The get block (everything inside get { ... }) returns the value present at the given [i, j] position, or zero if the value does not exist. You can index instances of Sparse2DArray with the square bracket notation used for arrays, like so:

Sparse2DArray s2d = new Sparse2DArray(1000, 10);
s2d[512, 6] = 3.14159265;
Console.WriteLine(s2d[12, 5]);  // will print zero
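Java has no indexers, so a Java version of the same idea falls back on ordinary get and set methods. The sketch below is a hypothetical Java counterpart of the sparse array; the class name SparseGrid and its method names are illustrative and do not come from any library.

```java
// Hypothetical Java counterpart of the C# indexer example.
class SparseGrid {
    private final int rows;
    private final int cols;
    private double[][] values;   // rows are allocated lazily, on first write

    SparseGrid(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
    }

    private void checkBounds(int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols) {
            throw new IndexOutOfBoundsException("(" + i + ", " + j + ")");
        }
    }

    // Plays the role of the indexer's set block.
    void set(int i, int j, double value) {
        checkBounds(i, j);
        if (values == null) {
            values = new double[rows][];
        }
        if (values[i] == null) {
            values[i] = new double[cols];
        }
        values[i][j] = value;
    }

    // Plays the role of the indexer's get block; unwritten cells read as zero.
    double get(int i, int j) {
        checkBounds(i, j);
        if (values == null || values[i] == null) {
            return 0.0;
        }
        return values[i][j];
    }

    public static void main(String[] args) {
        SparseGrid grid = new SparseGrid(1000, 10);
        grid.set(512, 6, 3.14159265);
        System.out.println(grid.get(512, 6));  // prints 3.14159265
        System.out.println(grid.get(12, 5));   // prints 0.0
    }
}
```

The logic is the same in both languages. What the indexer adds is the array-style notation at the call site.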

Structs, enumerations, properties, delegates

C# defines several class member types that do not exist in Java: structs, enumerations, properties, and delegates. This section describes the function of each type.

Structs

A struct is somewhat like a struct in C, except that a C# struct may have any kind of class member, including constructors and methods; and the default accessibility for struct members is private, rather than public as in C. Like C structs, though, C# structs always copy by value and are therefore both mutable and exempt from dynamic memory management (i.e., garbage collection). Variables that copy by value don't need garbage collection because the memory used to represent them disappears when those variables go out of scope. In many ways, a struct is similar to a class: a struct can implement interfaces and can have the kinds of members that classes can have. But structs can't inherit from other structs.

Interestingly, the scalar types like int and double are implemented in C# as aliases for predefined structs in the System namespace. In other words, when you define an int variable in C#, for example, you're actually defining an instance of System.Int32, which is a struct predefined in the C# language. The struct System.Int32, in turn, inherits all of the members of System.Object. (While the language specification says structs can't inherit, it also says they do inherit from System.Object; presumably, this is an exception.) So, basically, every primitive type in C# is also an object, and therefore "object" in C# means something other than the traditional interpretation of "instance of a class."

Since primitives, classes, and structs inherit from System.Object, everything in C# is an object and can be treated as such. This is how C# represents primitives as objects: they already are objects. Yet instances of primitives and structs copy by value, while instances of classes copy by reference. Conversions from value types to reference types and vice versa are the boxing and unboxing operations mentioned earlier.

Enums

Another class member type Java does not provide is the enum. While similar to enums in C, C# enums are based on an "underlying type," which can be any signed or unsigned integer type. Enumerations are derived from the built-in class System.Enum, and therefore every enum inherits all of that class's members.

For example, an enum based on an unsigned long integer can be defined as:

enum description: ulong {
   Good,
   Bad,
   Ugly
};

Enums are inherently type-safe and require explicit type casts when they are assigned to and from integer types, even the type from which the enum is derived.

A common idiom in Java for "faking" enums is to define public final ints in an interface and then implement the interface, like this:

public interface Description  {
   public final int Good = 0;
   public final int Bad = 1;
   public final int Ugly = 2;
};
public class Sheriff implements Description {
   protected int _description;
   public Sheriff() {
      _description = Good;
   }
}

While this idiom improves readability in Java, it doesn't provide type safety, nor can methods overload on two such "enums," because they're really integers. C# provides true enumerated types.
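The missing type safety is easy to demonstrate. In the sketch below, Description mirrors the interface above, while the Mood interface and the SheriffDemo class are hypothetical additions made up for illustration. Both calls to setDescription compile, even though neither passes a Description constant.

```java
// Mirrors the article's int-constant idiom.
interface Description {
    int Good = 0;
    int Bad = 1;
    int Ugly = 2;
}

// A second, unrelated "enum" (hypothetical, for illustration only).
interface Mood {
    int Calm = 0;
    int Angry = 1;
}

// Hypothetical demo class.
class SheriffDemo implements Description {
    private int description = Good;

    // The parameter is a plain int, so the compiler cannot reject wrong values.
    void setDescription(int description) {
        this.description = description;
    }

    int getDescription() {
        return description;
    }

    public static void main(String[] args) {
        SheriffDemo sheriff = new SheriffDemo();
        sheriff.setDescription(Mood.Angry);                  // compiles: Mood.Angry is just the int 1
        System.out.println(sheriff.getDescription() == Bad); // prints true
        sheriff.setDescription(42);                          // compiles: not a Description value at all
        System.out.println(sheriff.getDescription());        // prints 42
    }
}
```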

Properties

Directly accessing the public data members of objects is generally understood to be poor form. Direct access to a data member of an object inherently breaks data encapsulation and is a maintenance hazard. When a data member is removed from an existing class, any code that accesses that removed member will also need to be fixed. Further, code that accesses a class's public data members relies on a particular implementation of that class. The traditional solution to this problem is to provide "accessor methods," or "getters" and "setters," which provide access to information about the object. The value set or retrieved by an accessor method is typically called a "property." For example, a block of text in a word processor might have properties like "foreground color," "background color," and so on. Instead of exposing public Color members, the object representing a block of text could provide getter and setter methods for those properties. This concept of using access methods to encapsulate internal virtual object state is a design pattern that spans object-oriented languages. It isn't limited to C# or Java.

C# has taken the concept of properties a step further by actually building accessor methods into the language semantics. An object property has a type, a set method, and a get method. The set and get methods of the property determine how the property's value is set and retrieved. For example, a TextBlock class might define its background color property in this way:

public class TextBlock {
   // Assume Color is an enum
   private Color _bgColor;
   private Color _fgColor;
   public Color backgroundColor {
      get {
         return _bgColor;
      }
      set {
         _bgColor = value;
      }
   }
   //... and so on...
}

(Notice that in the preceding set block, _bgColor is set to value, a keyword that, in this context, stands for the value being assigned to the property.) Some other object could set or get a TextBlock's backgroundColor property like this:

TextBlock tb = new TextBlock();
if (tb.backgroundColor == Color.green) {  // "get" is called for comparison
   tb.backgroundColor = Color.red;        // "set" is called
} else {
   tb.backgroundColor = Color.blue;       // "set" is called
}

So, the syntax to access a property looks like a member but is implemented like a method. Either of the get and set methods is optional, providing a way of creating "read only" and "write only" properties.

Some would say that Java has properties, since the JavaBeans specification defines properties in terms of method naming conventions and the contents of JavaBean PropertyDescriptors. While JavaBeans properties do everything C# properties do, JavaBeans properties are not built into the Java language, and so the syntax for using them is a method call:
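A minimal sketch of that method-call style follows, assuming a hypothetical Java bean named TextBlockBean whose getter and setter follow the JavaBeans naming convention. The colors are plain int constants, in keeping with the enum idiom described earlier; none of these names come from the article or from a library.

```java
// Hypothetical JavaBeans-style counterpart of the C# TextBlock example.
class TextBlockBean {
    static final int RED = 0;
    static final int GREEN = 1;
    static final int BLUE = 2;

    private int backgroundColor = GREEN;

    // JavaBeans convention: getXxx reads the property "xxx".
    public int getBackgroundColor() {
        return backgroundColor;
    }

    // JavaBeans convention: setXxx writes the property "xxx".
    public void setBackgroundColor(int backgroundColor) {
        this.backgroundColor = backgroundColor;
    }

    public static void main(String[] args) {
        TextBlockBean tb = new TextBlockBean();
        if (tb.getBackgroundColor() == GREEN) {  // getter call for comparison
            tb.setBackgroundColor(RED);          // setter call
        } else {
            tb.setBackgroundColor(BLUE);         // setter call
        }
        System.out.println(tb.getBackgroundColor());  // prints 0 (RED)
    }
}
```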
