Java 101: Java's character and assorted string classes support text-processing

Explore Character, String, StringBuffer, and StringTokenizer

Page 3 of 5
// CS.java
// Country search
import java.io.*;
class CS
{
   static String [] countries =
   {
      "Argentina",
      "Australia",
      "Bolivia",
      "Brazil",
      "Canada",
      "Chile",
      "China",
      "Denmark",
      "Egypt",
      "England",
      "France",
      "India",
      "Iran",
      "Ireland",
      "Iraq",
      "Israel",
      "Japan",
      "Jordan",
      "Pakistan",
      "Russia",
      "Scotland",
      "South Africa",
      "Sweden",
      "Syria",
      "United States"
   };
   public static void main (String [] args)
   {
      int i;
      if (args.length != 1)
      {
          System.err.println ("usage: java CS country-name");
          return;
      }
      String country = args [0];
      // First search attempt using == operator
      for (i = 0; i < countries.length; i++)
           if (country == countries [i])
           {
               System.out.println (country + " found");
               break;
           }
      if (i == countries.length)
          System.out.println (country + " not found");
      // Intern country string
      country = country.intern ();
      // Second search attempt using == operator
      for (i = 0; i < countries.length; i++)
           if (country == countries [i])
           {
               System.out.println (country + " found");
               break;
           }
      if (i == countries.length)
          System.out.println (country + " not found");
   }       
}

CS makes two attempts to locate a country name in an array of country names with the == operator, which compares references rather than characters. The first attempt fails because the country name string literals end up as Strings in the common string memory pool, whereas the String containing the name being searched for is created at run time and is not in that pool. After the first search attempt, country = country.intern (); interns that String in the pool. The second search then succeeds, provided the name being searched for matches one of the array's literals. For example, java CS Argentina produces the following output:

Argentina not found
Argentina found
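
The same behavior can be reproduced without a command line. The following is a minimal sketch and not part of CS.java: the class name InternDemo and its helper methods are hypothetical, and new String ("Argentina") stands in for the args [0] String created at run time. The sketch also calls equals(), which compares characters instead of references and is the usual way to compare string contents.

```java
// InternDemo.java
// Hypothetical sketch: contrasts ==, equals(), and intern() on Strings.
class InternDemo
{
   // Simulates a String created at run time (like args [0] in CS).
   // new String() always yields an object distinct from the pooled literal.
   static String runtimeCopy ()
   {
      return new String ("Argentina");
   }

   // Reference comparison against the pooled literal.
   static boolean sameReference (String s)
   {
      return s == "Argentina";
   }

   public static void main (String [] args)
   {
      String country = runtimeCopy ();
      System.out.println (sameReference (country));           // false
      System.out.println (country.equals ("Argentina"));      // true
      System.out.println (sameReference (country.intern ())); // true
   }
}
```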

The StringBuffer class

String is not always the best choice for representing strings in a program. The reason: Its immutability causes String methods, such as substring(int beginIndex, int endIndex), to create new String objects, rather than modify the original String objects. In many situations, that leads to unreferenced Strings that become eligible for garbage collection. When many unreferenced Strings are created within a long loop, available heap memory shrinks, and the garbage collector might need to perform many collections, which can affect a program's performance, as the following code demonstrates:

String s = "abc";
String t = "def";
String u = "";
for (int i = 0; i < 100000; i++)
     u = u.concat (s).concat (t);

u.concat (s) creates a String containing the u-referenced String's characters followed by the s-referenced String's characters. The call returns a reference to that new String (call it a, to prevent confusion), on which concat (t) is then called. The concat (t) method call results in another new String object, b, that contains a's characters followed by the t-referenced String's characters. a is discarded (because its reference disappears) and b's reference is assigned to u (which makes the String that u previously referenced eligible for garbage collection).
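
Unrolling the chained call makes the temporary objects visible. The sketch below is an illustration only: the class name ConcatSteps and the method oneIteration are assumptions, and the local variables a and b simply borrow the names used in the preceding paragraph.

```java
// ConcatSteps.java
// Hypothetical sketch: one iteration of u = u.concat (s).concat (t), unrolled.
class ConcatSteps
{
   static String oneIteration (String u, String s, String t)
   {
      String a = u.concat (s); // new String: u's characters + s's characters
      String b = a.concat (t); // new String: a's characters + t's characters
      // a is not referenced after this method returns, so it can be collected.
      return b;                // the caller assigns b's reference to u
   }

   public static void main (String [] args)
   {
      String u = "";
      String old = u;          // keep a reference to the original String
      u = oneIteration (u, "abc", "def");
      System.out.println (u);              // abcdef
      System.out.println (old.isEmpty ()); // true: the original is unchanged
   }
}
```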

During each loop iteration, two Strings are discarded, so 200,000 Strings have been discarded by the loop's end. Because each discarded String holds a copy of everything concatenated up to that point, their combined size grows with the square of the iteration count and is far larger than the 600,000-character final result. Garbage collection therefore almost certainly occurs during the loop, and this portion of a program's execution takes longer to complete. That could prove problematic if the above code must complete within a limited time period. The StringBuffer class solves this problem.

StringBuffer objects

In many ways, the java.lang.StringBuffer class resembles its String counterpart. For example, as with String, a StringBuffer object stores a character sequence in a character array that StringBuffer's private value field variable references. Also, StringBuffer's private count integer field variable records the number of characters stored in that array. Finally, both classes declare a few same-named methods with identical signatures, such as public int indexOf(String str).

Unlike String objects, StringBuffer objects represent mutable, or changeable, strings. As a result, a StringBuffer method can modify a StringBuffer object. If the modification produces more characters than value can accommodate, the StringBuffer object automatically creates a new value array with double the capacity (plus two additional array elements) of the current value array, and copies all characters from the old array to the new array. (After all, Java arrays have a fixed size.) Capacity represents the maximum number of characters a StringBuffer's value array can store.
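
To watch an expansion happen, the sketch below appends one character more than the default capacity allows and reports the length and capacity before and after. The class name ExpandDemo and the helper method fill are assumptions. Because the capacity chosen by an automatic expansion can depend on the Java implementation, the code prints the values rather than relying on a particular number.

```java
// ExpandDemo.java
// Hypothetical sketch: observes a StringBuffer's automatic expansion.
class ExpandDemo
{
   // Appends n characters to an empty StringBuffer and returns it.
   static StringBuffer fill (int n)
   {
      StringBuffer sb = new StringBuffer (); // initial capacity: 16
      for (int i = 0; i < n; i++)
           sb.append ('x');
      return sb;
   }

   public static void main (String [] args)
   {
      StringBuffer sb = fill (16);
      System.out.println (sb.length () + " of " + sb.capacity ());
      sb.append ('x'); // the 17th character does not fit in the old array
      System.out.println (sb.length () + " of " + sb.capacity ());
   }
}
```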

Create a StringBuffer object via any of the following constructors:

  • public StringBuffer() creates a new StringBuffer object that contains no characters but can contain up to 16 characters before automatically expanding. StringBuffer has an initial capacity of 16 characters.
  • public StringBuffer(int initCap) creates a new StringBuffer that contains no characters but can contain up to initCap characters before automatically expanding. If initCap is negative, this constructor throws a NegativeArraySizeException object. StringBuffer has an initial capacity of initCap.
  • public StringBuffer(String str) creates a new StringBuffer that contains all characters in the str-referenced String and up to 16 additional characters before automatically expanding. StringBuffer's initial capacity is the length of str's string plus 16.

The following code fragment demonstrates all three constructors:

StringBuffer sb1 = new StringBuffer ();
StringBuffer sb2 = new StringBuffer (100);
StringBuffer sb3 = new StringBuffer ("JavaWorld");

StringBuffer sb1 = new StringBuffer (); creates a StringBuffer with no characters and an initial capacity of 16. StringBuffer sb2 = new StringBuffer (100); creates a StringBuffer with no characters and an initial capacity of 100. Finally, StringBuffer sb3 = new StringBuffer ("JavaWorld"); creates a StringBuffer containing JavaWorld and an initial capacity of 25.
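
Those lengths and capacities are easy to confirm. The following sketch is a small check, assuming a hypothetical class CtorDemo with a helper method describe that formats a StringBuffer's length and capacity as length/capacity.

```java
// CtorDemo.java
// Hypothetical sketch: reports length and capacity for each constructor.
class CtorDemo
{
   // Returns "length/capacity" for the given StringBuffer.
   static String describe (StringBuffer sb)
   {
      return sb.length () + "/" + sb.capacity ();
   }

   public static void main (String [] args)
   {
      System.out.println (describe (new StringBuffer ()));            // 0/16
      System.out.println (describe (new StringBuffer (100)));         // 0/100
      System.out.println (describe (new StringBuffer ("JavaWorld"))); // 9/25
   }
}
```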

StringBuffer method sampler

Since we already examined StringBuffer's constructors, we now examine its nonconstructor methods. For brevity, I focus on only 13 methods.

Note
Like String, many StringBuffer methods require an index argument for accessing a character in the StringBuffer's value array (or a character array argument). That index/offset is always zero-based.
  • public StringBuffer append(char c) appends c's character to the contents of the current StringBuffer's value array and returns a reference to the current StringBuffer. Example: StringBuffer sb = new StringBuffer ("abc"); sb.append ('d'); System.out.println (sb); (output: abcd).
  • public StringBuffer append(String str) appends the str-referenced String's characters to the contents of the current StringBuffer's value array and returns a reference to the current StringBuffer. Example: StringBuffer sb = new StringBuffer ("First,"); sb.append (" second"); System.out.println (sb); (output: First, second).
  • public int capacity() returns the current StringBuffer's current capacity (that is, value's length). Example: StringBuffer sb = new StringBuffer (); System.out.println (sb.capacity ()); (output: 16).
  • public char charAt(int index) extracts and returns the character at the index position in the current StringBuffer's value array. This method throws an IndexOutOfBoundsException object if index is negative, equals the string's length, or exceeds that length. Example: StringBuffer sb = new StringBuffer ("Test string"); for (int i = 0; i < sb.length (); i++) System.out.print (sb.charAt (i)); (output: Test string).
  • public StringBuffer deleteCharAt(int index) removes the character at the index position in the current StringBuffer's value array. If index is negative, equals the string's length, or exceeds that length, this method throws a StringIndexOutOfBoundsException object. Example: StringBuffer sb = new StringBuffer ("abc"); sb.deleteCharAt (1); System.out.println (sb); (output: ac).
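  • public void ensureCapacity(int minimumCapacity) ensures that the current StringBuffer's capacity is at least minimumCapacity. If the current capacity is smaller than minimumCapacity, a new value array is allocated whose capacity is the larger of minimumCapacity and twice the current capacity plus two. If minimumCapacity is nonpositive, this method returns without doing anything. The following code demonstrates this method: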

    StringBuffer sb = new StringBuffer ("abc"); 
    System.out.println (sb.capacity ()); 
    sb.ensureCapacity (20);
    System.out.println (sb.capacity ());
    

    The fragment produces the following output (40 is twice the original capacity of 19, plus 2, which is larger than the requested 20):

    19
    40
    
  • Tip
    Because it takes time for a StringBuffer to create a new character array and copy characters from the old array to the new array (during an expansion), use ensureCapacity(int minimumCapacity) to minimize expansions prior to entering a loop that appends many characters to a StringBuffer. That improves performance.
  • public StringBuffer insert(int offset, String str) inserts the str-referenced String's characters into the current StringBuffer beginning at the index that offset identifies. Characters originally at or after offset shift toward higher indexes to make room. If str contains a null reference, the four characters null are inserted into the StringBuffer. Example: StringBuffer sb = new StringBuffer ("ab"); sb.insert (1, "cd"); System.out.println (sb); (output: acdb).
  • public int length() returns the value stored in count. In other words, this method returns a string's length. If the string is empty, length() returns 0. A StringBuffer's length differs from its capacity; length specifies value's current character count, whereas capacity specifies the maximum number of characters that array can store. Example: StringBuffer sb = new StringBuffer (); System.out.println (sb.length ()); (output: 0).
  • public StringBuffer replace(int start, int end, String str) replaces the characters in the current StringBuffer's value array from index start through index end - 1 (or through the last character, if end exceeds the string's length) with the characters from the str-referenced String. This method throws a StringIndexOutOfBoundsException object if start is negative, exceeds the string's length, or is greater than end. Example: StringBuffer sb = new StringBuffer ("abcdef"); sb.replace (0, 3, "x"); System.out.println (sb); (output: xdef).
  • public StringBuffer reverse() reverses the character sequence in the current StringBuffer's value array. Example: StringBuffer sb = new StringBuffer ("reverse this"); System.out.println (sb.reverse ()); (output: siht esrever).
  • public void setCharAt(int index, char c) sets the character at position index in the current StringBuffer's value array to c's contents. If index is negative, equals the string's length, or exceeds that length, this method throws an IndexOutOfBoundsException object. Example: StringBuffer sb = new StringBuffer ("abc"); sb.setCharAt (0, 'd'); System.out.println (sb); (output: dbc).
  • public void setLength(int newLength) establishes a new length for the character sequence in the current StringBuffer's value array. Every character located at an index less than newLength remains unchanged. If newLength exceeds the current length, null characters are appended, beginning at the index equal to the old length, until the length equals newLength. If newLength is less than the current length, the sequence is truncated. If necessary, StringBuffer expands by creating a new value array of the appropriate length. This method throws an IndexOutOfBoundsException object if newLength is negative. The following fragment demonstrates this method:

    StringBuffer sb = new StringBuffer ("abc");
    System.out.println (sb.capacity ());
    System.out.println (sb.length ());
    sb.setLength (100);
    System.out.println (sb.capacity ());
    System.out.println (sb.length ());
    System.out.println ("[" + sb + "]");
    

    The fragment produces this output (in the last line, null characters, after abc, appear as spaces):

    19
    3
    100
    100
    [abc                                        ]
    
  • public String toString() creates a new String object containing the same characters as the current StringBuffer's value array and returns a reference to that String. The following code uses toString() as the final step of a more efficient (and faster) alternative to String's concat(String str) method for concatenating strings within a loop:

    String s = "abc";
    String t = "def";
          
    StringBuffer sb = new StringBuffer (2000000);
    for (int i = 0; i < 100000; i++)
         sb.append (s).append (t);
    String u = sb.toString ();
    sb = null;
    System.out.println (u);
    

    As the output is large, I don't include it here. Try converting this code into a program and compare its performance with the earlier String s = "abc"; String t = "def"; String u = ""; for (int i = 0; i < 100000; i++) u = u.concat (s).concat (t); code.
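
One way to carry out that comparison is sketched below. The class name ConcatTimer, its method names, and the reduced iteration count are assumptions made to keep the run short. Timings vary by machine and Java implementation, so the program prints whatever it measures instead of promising a result. The StringBuffer version calls ensureCapacity(int minimumCapacity) before the loop, as the earlier tip suggests.

```java
// ConcatTimer.java
// Hypothetical sketch: times String concat() against StringBuffer append().
class ConcatTimer
{
   // Builds the result with String's concat(): two new Strings per iteration.
   static String withConcat (String s, String t, int n)
   {
      String u = "";
      for (int i = 0; i < n; i++)
           u = u.concat (s).concat (t);
      return u;
   }

   // Builds the result with a single StringBuffer that is sized up front.
   static String withBuffer (String s, String t, int n)
   {
      StringBuffer sb = new StringBuffer ();
      sb.ensureCapacity ((s.length () + t.length ()) * n);
      for (int i = 0; i < n; i++)
           sb.append (s).append (t);
      return sb.toString ();
   }

   public static void main (String [] args)
   {
      int n = 20000; // assumption: smaller than 100000 to keep the run short

      long start = System.currentTimeMillis ();
      String u1 = withConcat ("abc", "def", n);
      long mid = System.currentTimeMillis ();
      String u2 = withBuffer ("abc", "def", n);
      long end = System.currentTimeMillis ();

      System.out.println ("concat:       " + (mid - start) + " ms");
      System.out.println ("StringBuffer: " + (end - mid) + " ms");
      System.out.println ("same result:  " + u1.equals (u2));
   }
}
```

Raising n toward 100000 widens the gap, because the concat() version copies the entire accumulated string twice per iteration.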

For a demonstration of StringBuffer's append(String str) and toString() methods, and the StringBuffer() constructor, examine Listing 4's DigitsToWords, which converts an integer value's digits to its equivalent spelled-out form (for example, 10 versus ten):

Listing 4: DigitsToWords.java
