Regular expressions simplify pattern-matching code

Discover the elegance of regular expressions in text-processing scenarios that involve pattern matching



Regex = .ox
Text = The quick brown fox jumps over the lazy ox.
Found fox
  starting at index 16 and ending at index 19
Found  ox
  starting at index 39 and ending at index 42


The output reveals two matches: fox and ox (with a leading space character). The . metacharacter matches the f in the first match and the space character in the second match.
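The run above can be reproduced with a short sketch. This is not the article's RegexDemo source; the class name DotMatchSketch and the helper findAll are hypothetical names chosen for illustration, and the "match@start-end" reporting format is an assumption made to keep the output compact.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch class; not the article's RegexDemo.
class DotMatchSketch {
    // Returns one "match@start-end" entry per match of regex in text.
    static List<String> findAll(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            // start() is the index of the first matched character;
            // end() is the index just past the last matched character.
            found.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy ox.";
        // Prints "fox@16-19" and then " ox@39-42" (note the leading space).
        for (String match : findAll(".ox", text)) {
            System.out.println(match);
        }
    }
}
```

The ending index reported by Matcher.end() is exclusive, which is why fox, occupying indexes 16 through 18, is reported as ending at index 19.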

What happens if we replace .ox with the period metacharacter alone? That is, what does RegexDemo output when we specify java RegexDemo . "The quick brown fox jumps over the lazy ox."? Because the period metacharacter matches any character, RegexDemo outputs a match for each character in its text command-line argument, including the terminating period character.
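One way to check that claim is to count the matches and compare the count with the length of the text. The sketch below assumes nothing beyond java.util.regex; the class EveryCharSketch and the helper countMatches are hypothetical names, not part of RegexDemo.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch class; not the article's RegexDemo.
class EveryCharSketch {
    // Counts how many times regex matches within text.
    static int countMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy ox.";
        // The text has 43 characters and no line terminators,
        // so the lone period matches once per character.
        System.out.println(countMatches(".", text)); // prints 43
        System.out.println(text.length());           // prints 43
    }
}
```

By default the period does not match line terminators, so the two numbers agree only because this text contains none.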

Tip
To specify . or any other metacharacter as a literal character in a regex construct, quote the metacharacter (that is, convert it from meta status to literal status) in one of two ways:
  • Precede the metacharacter with a backslash character.
  • Place the metacharacter between \Q and \E (e.g., \Q.\E).
In either scenario, don't forget to double each backslash character (as in \\. or \\Q.\\E) that appears in a string literal (e.g., String regex = "\\.";). Do not double the backslash character when it appears as part of a command-line argument.
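The two quoting forms from the tip can be compared in a string-literal setting with a small sketch. The class QuoteSketch and the helper countMatches are hypothetical names. The third call, Pattern.quote, is an addition that the tip does not mention: it is a standard java.util.regex method that returns a literal pattern string for its argument.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch class; not the article's RegexDemo.
class QuoteSketch {
    // Counts how many times regex matches within text.
    static int countMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy ox.";
        // Each backslash is doubled because it appears in a string literal.
        System.out.println(countMatches("\\.", text));     // prints 1
        System.out.println(countMatches("\\Q.\\E", text)); // prints 1
        // Pattern.quote builds a literal pattern without hand-written backslashes
        // (an addition here, not one of the tip's two ways).
        System.out.println(countMatches(Pattern.quote("."), text)); // prints 1
    }
}
```

In all three cases only the terminating period at index 42 matches, in contrast to the unquoted period, which matches every character.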


Character classes

We sometimes want to limit the characters that produce matches to a specific set. For example, we might search text for the vowels a, e, i, o, and u, where each occurrence of a vowel counts as a match. A character class, a regex construct that identifies a set of characters between open and close square bracket metacharacters ([ ]), helps us accomplish that task. Pattern supports the following character classes: