Date: 2018-02-05
While learning Java, you'll occasionally encounter a language behavior that leaves you puzzled. For example, what does the expression
new int[10] instanceof Object returning
true signify about arrays? In this post, I'll examine some of Java's language oddities.
Arrays are objects
A long time ago, while writing about message formatters, I encountered something strange in Java's
java.text.MessageFormat standard library class. Consider the following pair of formatting methods:
StringBuffer format(Object[] arguments, StringBuffer result, FieldPosition pos)
StringBuffer format(Object arguments, StringBuffer result, FieldPosition pos)
According to the Javadoc, either method formats an array of objects. Wait a minute! How can you pass an array of objects to
Object arguments? Is this a Javadoc misprint? The answer is no: you can pass an array of objects to this parameter.
The Java Language Specification explains this oddity. Section 10.1. Array Types states (in the fine print) that
Object is also a supertype of all array types. Hence, each of the following lines of code will output
true:
System.out.println(new int[10] instanceof Object);
System.out.println(new String[] { "A", "B" } instanceof Object);
I've created an
ArraysAreObjects application that demonstrates arrays being objects. Listing 1 presents the application's source code.
Listing 1.
ArraysAreObjects.java (version 1)
public class ArraysAreObjects
{
   public static void main(String[] args)
   {
      print(new String[] { "A", "B", "C" });
      print("Hello");
      print(new int[] { 1, 2, 3 });
      print(new Integer[] { 1, 2, 3 });
   }
   static void print(Object objects)
   {
      if (objects instanceof Object[])
         for (Object object: (Object[]) objects)
            System.out.println(object);
      else
         System.out.printf("[%s]%n", objects);
      System.out.println();
   }
}
ArraysAreObjects declares a
print() method that prints an object or an array of objects. It differentiates between these cases via
objects instanceof Object[], which returns
true when
objects references an array of objects.
Compile Listing 1 as follows:
javac ArraysAreObjects.java
Run the resulting application as follows:
java ArraysAreObjects
You should observe the following output (with a different hash code):
A
B
C
[Hello]
[[I@42d3bd8b]
1
2
3
Perhaps you're surprised to see something like
[[I@42d3bd8b] instead of each integer on a separate line when executing
print(new int[] { 1, 2, 3 });. Section 4.10.3. Subtyping among Array Types provides an answer:
The following rules define the direct supertype relation among array types:
If S and T are both reference types, then S[] >₁ T[] iff S >₁ T.
Object >₁ Object[]
Cloneable >₁ Object[]
java.io.Serializable >₁ Object[]
If P is a primitive type, then:
Object >₁ P[]
Cloneable >₁ P[]
java.io.Serializable >₁ P[]
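Essentially, this section tells us that
Object, and not
Object[], is the supertype of a primitive array type. An int[] therefore fails the objects instanceof Object[] test in print() and is printed as a single object.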
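The following minimal sketch checks these subtyping rules at runtime. The class name ArraySupertypes and the helper isObjectArray() are hypothetical names chosen for illustration. Each array is first assigned to an Object variable, because the compiler rejects a direct int[] instanceof Object[] test as an inconvertible-types error.

```java
import java.io.Serializable;

public class ArraySupertypes
{
   // Hypothetical helper: reports whether a value can be treated as an Object[].
   static boolean isObjectArray(Object o)
   {
      return o instanceof Object[];
   }

   public static void main(String[] args)
   {
      Object primitives = new int[] { 1, 2, 3 };
      Object references = new Integer[] { 1, 2, 3 };
      Object nested = new int[2][3];

      System.out.println(isObjectArray(primitives)); // false: int[] is not an Object[]
      System.out.println(isObjectArray(references)); // true: Integer[] is an Object[]
      System.out.println(isObjectArray(nested));     // true: int[][] holds int[] objects

      // Every array type, primitive or not, is Cloneable and Serializable.
      System.out.println(primitives instanceof Cloneable);    // true
      System.out.println(primitives instanceof Serializable); // true
   }
}
```

Note that a two-dimensional primitive array passes the Object[] test, because its elements are themselves arrays and therefore objects.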
This information helps to explain why
MessageFormat has two
format() methods that differ only in the type of the first parameter:
Object[] or
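Object. The compiler selects the
format() method with
Object[] as its first parameter for reference array type arguments (e.g.,
new String[] { "A", "B" }), and the other
format() method for any other argument, including a primitive array type argument, as in
format(new int[] { 1, 2, 3 }, sb, pos). Be aware that this last call compiles but fails at runtime: the
Object overload, which overrides the abstract format() method that MessageFormat inherits from java.text.Format, casts its argument to
Object[], so passing a primitive array results in a ClassCastException.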
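To see which overload the compiler picks without involving MessageFormat itself, here is a small sketch with two hypothetical which() methods that mirror the shape of the two format() signatures. OverloadDemo and which() are my own illustrative names, not part of any library.

```java
public class OverloadDemo
{
   // Hypothetical overloads that only report which one the compiler selected.
   static String which(Object[] arguments)
   {
      return "Object[]";
   }

   static String which(Object arguments)
   {
      return "Object";
   }

   public static void main(String[] args)
   {
      System.out.println(which(new String[] { "A", "B" })); // Object[]
      System.out.println(which(new int[] { 1, 2, 3 }));     // Object
      System.out.println(which("Hello"));                   // Object

      // Overloads are resolved from the static type, not the runtime type.
      Object hidden = new String[] { "A", "B" };
      System.out.println(which(hidden));                    // Object
   }
}
```

The last call shows that the choice is made at compile time: a String[] stored in an Object variable is still routed to the Object overload.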
Never write code like that shown in Listing 1. Instead, use Java's variable arguments (varargs) language feature (introduced in Java 5, long after Java 1.1's debut of
MessageFormat) to achieve more concise code. Consider Listing 2.
Listing 2.
ArraysAreObjects.java (version 2)
public class ArraysAreObjects
{
   public static void main(String[] args)
   {
      print("A", "B", "C");
      print("Hello");
      print(1, 2, 3);
   }
   static void print(Object... objects)
   {
      for (Object object: objects)
         System.out.println(object);
      System.out.println();
   }
}
Although this code is straightforward, you might be curious about
print(1, 2, 3);. The compiler generates code to autobox each integer into an
Integer object. These objects are stored in an
Object[] array that's passed to
print().
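The distinction between reference arrays and primitive arrays also shows up with varargs. The sketch below uses a hypothetical count() helper (not part of Listing 2) that returns the length of the array the method receives. An Object[] argument is passed through as the varargs array itself, whereas an int[] is wrapped as a single element.

```java
public class VarargsCount
{
   // Hypothetical helper: returns how many elements the varargs array holds.
   static int count(Object... objects)
   {
      return objects.length;
   }

   public static void main(String[] args)
   {
      // Three int literals are autoboxed and stored in a new Object[].
      System.out.println(count(1, 2, 3)); // 3

      // An Object[] reference is used directly as the varargs array.
      Object[] boxed = new Integer[] { 1, 2, 3 };
      System.out.println(count(boxed));   // 3

      // An int[] is not an Object[], so it becomes the only element.
      System.out.println(count(new int[] { 1, 2, 3 })); // 1
   }
}
```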
When you run this application, you should observe the following output:
A
B
C
Hello
1
2
3
The
java.util package's
Arrays and
Objects classes also demonstrate the impact of arrays being objects.
Arrays declares a
boolean deepEquals(Object[] a1, Object[] a2) method to determine whether two arrays are deeply equal (defined in that method's Javadoc). Similarly,
Objects declares
boolean deepEquals(Object a, Object b) to determine whether two nonarray or array objects are deeply equal.
You don't have to use
Objects.deepEquals() to compare a pair of nonarray objects. Instead, you could create a pair of arrays to hold these objects and pass these arrays to
Arrays.deepEquals(). But isn't that a code smell?
In case you're wondering how primitive array types are handled, note that
Objects.deepEquals() and
Arrays.deepEquals() delegate to
Arrays.deepEquals0(). Here's that method's source code:
static boolean deepEquals0(Object e1, Object e2)
{
   assert e1 != null;
   boolean eq;
   if (e1 instanceof Object[] && e2 instanceof Object[])
      eq = deepEquals((Object[]) e1, (Object[]) e2);
   else if (e1 instanceof byte[] && e2 instanceof byte[])
      eq = equals((byte[]) e1, (byte[]) e2);
   else if (e1 instanceof short[] && e2 instanceof short[])
      eq = equals((short[]) e1, (short[]) e2);
   else if (e1 instanceof int[] && e2 instanceof int[])
      eq = equals((int[]) e1, (int[]) e2);
   else if (e1 instanceof long[] && e2 instanceof long[])
      eq = equals((long[]) e1, (long[]) e2);
   else if (e1 instanceof char[] && e2 instanceof char[])
      eq = equals((char[]) e1, (char[]) e2);
   else if (e1 instanceof float[] && e2 instanceof float[])
      eq = equals((float[]) e1, (float[]) e2);
   else if (e1 instanceof double[] && e2 instanceof double[])
      eq = equals((double[]) e1, (double[]) e2);
   else if (e1 instanceof boolean[] && e2 instanceof boolean[])
      eq = equals((boolean[]) e1, (boolean[]) e2);
   else
      eq = e1.equals(e2);
   return eq;
}
As you can see, each primitive array type is handled as a special case.
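A short sketch makes the effect of these special cases visible. DeepEqualsDemo and its sameContents() helper are hypothetical names; the helper simply forwards to Objects.deepEquals().

```java
import java.util.Arrays;
import java.util.Objects;

public class DeepEqualsDemo
{
   // Hypothetical helper: compares two values, descending into arrays.
   static boolean sameContents(Object first, Object second)
   {
      return Objects.deepEquals(first, second);
   }

   public static void main(String[] args)
   {
      int[] a = { 1, 2, 3 };
      int[] b = { 1, 2, 3 };
      System.out.println(a.equals(b));        // false: arrays inherit Object's identity-based equals()
      System.out.println(sameContents(a, b)); // true: handled by the int[] special case

      Object[] x = { a, "A" };
      Object[] y = { b, "A" };
      System.out.println(Arrays.equals(x, y));     // false: elements are compared via equals()
      System.out.println(Arrays.deepEquals(x, y)); // true: descends into the nested int[] arrays
   }
}
```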
Bytes and shorts are second-class citizens
According to Section 4.2. Primitive Types and Values in the Java Language Specification, Java supports five integral types: byte integer, short integer, integer, long integer, and character. These primitive types are represented via keywords
byte,
short,
int,
long, and
char, respectively. Each of the
byte,
short,
int, and
long types represents a signed integer. In contrast,
char represents an unsigned UTF-16 code unit.
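The signed versus unsigned distinction is easy to observe, because short and char are both 16 bits wide. In this sketch, the class SignedVersusUnsigned and its two helpers are hypothetical; they reinterpret the low 16 bits of an int as each type.

```java
public class SignedVersusUnsigned
{
   // Hypothetical helpers: reinterpret the low 16 bits of an int.
   static int asShort(int bits)
   {
      return (short) bits; // signed: the top bit of the 16 is the sign bit
   }

   static int asChar(int bits)
   {
      return (char) bits;  // unsigned: all 16 bits contribute to the magnitude
   }

   public static void main(String[] args)
   {
      System.out.println(asShort(0xFFFF)); // -1
      System.out.println(asChar(0xFFFF));  // 65535
      System.out.println(asShort(0x8000)); // -32768
      System.out.println(asChar(0x8000));  // 32768
   }
}
```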
Consider
byte,
short,
int, and
long. Each type differs only in its range of values based on the number of bits associated with the type: 8 (
byte), 16 (
short), 32 (
int), or 64 (
long). Because
byte and
short have smaller ranges (-128 through 127 for
byte and -32768 through 32767 for
short), the Java virtual machine (JVM) was designed with limited support for these types (which saved a few instructions).
The JVM provides various
int-only instructions (e.g.,
iadd,
isub, and
imul). Similarly, the JVM provides various
long-only instructions (e.g.,
ladd,
ldiv, and
lneg). In contrast,
byte and
short don't merit similar instructions.
Apart from the array load and store instructions (baload, bastore, saload, and sastore), the JVM provides the following instructions to support
byte and
short:
bipush: Sign-extend 8-bit byte integer operand to 32-bit integer and push the result onto the operand stack.
i2b: Pop the 32-bit integer from the top of the operand stack, truncate this value to an 8-bit byte integer, sign-extend the result to a 32-bit integer, and push the result onto the operand stack.
i2s: Pop the 32-bit integer from the top of the operand stack, truncate this value to a 16-bit short integer, sign-extend the result to a 32-bit integer, and push the result onto the operand stack.
sipush: Sign-extend 16-bit short integer operand to 32-bit integer and push the result onto the operand stack.
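The truncate-then-sign-extend behavior described for i2b and i2s can be observed from Java source, since (byte) and (short) casts of int values have the same effect. The sketch below assumes nothing beyond the language's narrowing rules; NarrowingDemo, toByte(), and toShort() are hypothetical names.

```java
public class NarrowingDemo
{
   // Hypothetical helpers: narrow an int and widen the result back to int.
   static int toByte(int value)
   {
      return (byte) value;  // keeps the low 8 bits, then sign-extends
   }

   static int toShort(int value)
   {
      return (short) value; // keeps the low 16 bits, then sign-extends
   }

   public static void main(String[] args)
   {
      System.out.println(toByte(27));     // 27: already in range
      System.out.println(toByte(200));    // -56: 0xC8 has its sign bit set
      System.out.println(toByte(-129));   // 127: low 8 bits are 0x7F
      System.out.println(toShort(299));   // 299: already in range
      System.out.println(toShort(40000)); // -25536: 40000 - 65536
   }
}
```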
The Java language reflects this second-class support for
byte and
short by not supporting
byte or
short integer literals. An integer literal is either of type
int (with no suffix) or of type
long (with the
l or
L suffix). However, it does provide one convenience: when assigning an
int literal to a
byte or a
short variable, you don't have to specify a cast operator when the literal ranges from -128 through 127 (
byte) or -32768 through 32767 (
short). For example, you can specify
byte b = 27; instead of having to specify
byte b = (byte) 27;. Similarly, you can specify
short s = 299; instead of having to specify
short s = (short) 299;.
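This convenience applies only to compile-time constants that fit the target type. The following sketch (ByteArithmetic and increment() are hypothetical names) shows where a cast is still required, because arithmetic on byte operands produces an int result.

```java
public class ByteArithmetic
{
   // Hypothetical helper: value + 1 has type int, so the cast is required.
   static byte increment(byte value)
   {
      return (byte) (value + 1);
   }

   public static void main(String[] args)
   {
      byte b = 27;       // in-range int literal: no cast needed
      final int k = 100;
      byte c = k;        // a constant variable in range is accepted as well
      // byte d = 128;   // would not compile: 128 is outside byte's range
      // byte e = b + 1; // would not compile: b + 1 is a non-constant int expression
      b += 1;            // compound assignment includes an implicit cast

      System.out.println(b);                         // 28
      System.out.println(c);                         // 100
      System.out.println(increment(Byte.MAX_VALUE)); // -128: wraps around
   }
}
```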
It's easier to understand this second-class citizen business when you examine the bytecode of a simple application. Consider Listing 3.
Listing 3.
BytesAndShorts.java (version 1)
public class BytesAndShorts
{
   public static void main(String[] args)
   {
      byte b = 27;
      short s = 299;
   }
}
Assuming that you've compiled this listing to
BytesAndShorts.class, execute the following command to obtain a disassembly:
javap -v BytesAndShorts
The following is that portion of the disassembly that's relevant to the
main() method:
public static void main(java.lang.String[]);
  descriptor: ([Ljava/lang/String;)V
  flags: (0x0009) ACC_PUBLIC, ACC_STATIC
  Code:
    stack=1, locals=3, args_size=1
       0: bipush        27
       2: istore_1
       3: sipush        299
       6: istore_2
       7: return
There are three local variables: 0 (
args), 1 (
b), and 2 (
s).
At the source code level,
27 is a 32-bit integer literal. For efficiency,
27 is stored as an 8-bit byte following the operation code (opcode) for the
bipush instruction. As stated earlier, this instruction sign-extends this 8-bit value to a 32-bit value that's stored on the operand stack. This value will be popped off the stack and stored in local variable 1 (via
istore_1) -- recall that 1 refers to
b in the source code.
Here is something interesting: the
istore_1 instruction reveals that
byte variable
b is really of type
int at the JVM level. After all, the
istore instructions store 32-bit values.
Continuing with the disassembly,
sipush 299 sign-extends
299 to a 32-bit value that's stored on the operand stack, and the subsequent
istore_2 instruction stores this 32-bit value in
int variable
s.
It appears that the JVM does not recognize
byte or
short variables, but treats them as if they are of type
int. Listing 4 presents an application that probes deeper into this situation.
Listing 4.
BytesAndShorts.java (version 2)
public class BytesAndShorts
{
   public static void main(String[] args)
   {
      int i = 35;
      byte b = (byte) i;
      short s = (byte) i;
   }
}
Assuming that you've compiled this listing to
BytesAndShorts.class, execute the following command to obtain a disassembly:
javap -v BytesAndShorts
The following is that portion of the disassembly that's relevant to the
main() method:
public static void main(java.lang.String[]);
  descriptor: ([Ljava/lang/String;)V
  flags: (0x0009) ACC_PUBLIC, ACC_STATIC
  Code:
    stack=1, locals=4, args_size=1
       0: bipush        35
       2: istore_1
       3: iload_1
       4: i2b
       5: istore_2
       6: iload_1
       7: i2b
       8: i2s
       9: istore_3
      10: return
There are four local variables: 0 (
args), 1 (
i), 2 (
b), and 3 (
s).
The first two instructions convert
35 to a 32-bit integer and store it in
int variable
i. There are no surprises here. In contrast, the next three instructions retrieve this value, convert it to a
byte (via
i2b), and store the result in "
int" variable
b. Even though the JVM doesn't regard
b as being of type
byte, it still treats this variable as if it were a
byte:
i2b ensures that the 32-bit integer value won't lie outside the range -128 through 127.
The instruction sequence from offset 6 through offset 9 is interesting. I could have specified
short s = (short) i; instead of
short s = (byte) i; in the source code, but chose to deviate in order to see what happens at the JVM level. The
i2b instruction at offset 7 first converts the 32-bit integer value stored in
i to an 8-bit byte. The subsequent
i2s instruction converts this result to a 16-bit short integer, which is then sign-extended to a 32-bit integer in preparation for being stored in
s via
istore_3. The bytecode sequence for
short s = (byte) i; ensures that the value stored in "
int" variable
s doesn't lie outside the range -32768 through 32767 (and shows that you should avoid useless casts).
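With 35 the extra cast makes no visible difference, but it does for values outside the byte range. The sketch below compares the two casts; CastChain, viaByte(), and direct() are hypothetical names, and the comments describe values only, not the exact bytecode a particular compiler emits.

```java
public class CastChain
{
   // Hypothetical helper: narrows to byte first, then widens to short.
   static short viaByte(int i)
   {
      return (byte) i;
   }

   // Hypothetical helper: narrows directly to short.
   static short direct(int i)
   {
      return (short) i;
   }

   public static void main(String[] args)
   {
      System.out.println(viaByte(35));  // 35
      System.out.println(direct(35));   // 35
      System.out.println(viaByte(300)); // 44: only the low 8 bits of 300 survive
      System.out.println(direct(300));  // 300
      System.out.println(viaByte(200)); // -56
      System.out.println(direct(200));  // 200
   }
}
```

The (byte) cast is therefore more than useless in this position: it can silently change the value that ends up in the short variable.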
Private fields and methods are accessible without reflection
Under certain circumstances, you can access an object's
private field or call its
private method without having to use Java's Reflection API. Consider Listing 5.