But what if you need a regular expression library now? Luckily, you can download the open source Jakarta ORO library from Apache.org. In this article, I'll first give you a short primer on regular expressions, and then I'll show you how to use regular expressions with the open source Jakarta-ORO API.
Let's start simple. Suppose you want to search for a string with the word "cat" in it; your regular expression would simply be "cat". Because the expression can match anywhere inside a string, the words "catalog" and "sophisticated" would also match, and if your search is case-insensitive, so would "Catherine":
Regular expression: cat
Matches: cat, catalog, Catherine, sophisticated
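The search above can be sketched in a few lines of Java. This is only an illustration, and it assumes the JDK's built-in java.util.regex package rather than Jakarta-ORO; the class name CatSearch and the helper containsCat are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

public class CatSearch {
    // Assumption: java.util.regex stands in for Jakarta-ORO here.
    // CASE_INSENSITIVE lets "cat" match "Cat" as well.
    private static final Pattern CAT =
            Pattern.compile("cat", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if "cat" occurs anywhere in the input.
    static boolean containsCat(String input) {
        // find() looks for the pattern anywhere in the string,
        // not only at the start and not only as a whole word.
        return CAT.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] words = {"cat", "catalog", "Catherine", "sophisticated", "dog"};
        for (String word : words) {
            System.out.println(word + " -> " + containsCat(word));
        }
    }
}
```

Only "dog" is reported as a non-match, since the other four words all contain the letters c-a-t in sequence.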
Imagine you are playing Scrabble and need a three-letter word starting with the letter "t" and ending with the letter "n". Imagine also that you have an English dictionary and will search through its entire contents for a match using a regular expression. To form such a regular expression, you would use the wildcard notation: the period (.) character. The regular expression would then be "t.n" and, in a case-insensitive search, would match "tan", "Ten", "tin", and "ton"; it would also match "t#n", "tpn", and even "t n", as well as many other nonsensical words. This is because the period matches any single character, including the space and the tab character. In Perl5-style engines it can also match a line break, but typically only when single-line mode is enabled:
Regular expression: t.n
Matches: tan, Ten, tin, ton, t n, t#n, tpn, etc.
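To see the wildcard in action, here is a small sketch that again assumes the JDK's java.util.regex package in place of Jakarta-ORO; DotDemo and its two methods are hypothetical names. Note that in java.util.regex the period skips line terminators by default and matches them only when the DOTALL flag is set.

```java
import java.util.regex.Pattern;

public class DotDemo {
    // Assumption: java.util.regex stands in for Jakarta-ORO here.
    private static final Pattern DEFAULT_DOT =
            Pattern.compile("t.n", Pattern.CASE_INSENSITIVE);

    // DOTALL makes the period match line terminators as well.
    private static final Pattern DOTALL_DOT =
            Pattern.compile("t.n", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Hypothetical helper: the whole input must be "t", any one character, "n".
    static boolean matchesDefault(String input) {
        return DEFAULT_DOT.matcher(input).matches();
    }

    // Same test, but a line break is accepted as the middle character.
    static boolean matchesDotAll(String input) {
        return DOTALL_DOT.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {"tan", "Ten", "t n", "t#n", "toon"};
        for (String c : candidates) {
            System.out.println("\"" + c + "\" -> " + matchesDefault(c));
        }
        System.out.println("line break, default: " + matchesDefault("t\nn"));
        System.out.println("line break, DOTALL:  " + matchesDotAll("t\nn"));
    }
}
```

The sketch uses matches() rather than find() so that each candidate is tested as a whole three-character word, which mirrors the dictionary lookup described above.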
To solve the problem of the period's indiscriminate matches, you can specify the characters you consider meaningful with the bracket ("[]") expression, so that only those characters match the regular expression. Thus, "t[aeio]n" matches only "tan", "Ten", "tin", and "ton". "Toon" would not match because the bracket notation matches exactly one character:
Regular expression: t[aeio]n
Matches: tan, Ten, tin, ton
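The bracket expression can be tried out the same way. As before, this is a sketch under the assumption that the JDK's java.util.regex package is used instead of Jakarta-ORO, and BracketDemo and isVowelWord are hypothetical names.

```java
import java.util.regex.Pattern;

public class BracketDemo {
    // Assumption: java.util.regex stands in for Jakarta-ORO here.
    // [aeio] accepts exactly one character from the listed set.
    private static final Pattern T_VOWEL_N =
            Pattern.compile("t[aeio]n", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true only for "t", one of a/e/i/o, then "n".
    static boolean isVowelWord(String input) {
        return T_VOWEL_N.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {"tan", "Ten", "tin", "ton", "t#n", "tpn", "t n", "toon"};
        for (String c : candidates) {
            System.out.println("\"" + c + "\" -> " + isVowelWord(c));
        }
    }
}
```

The first four candidates match; "t#n", "tpn", and "t n" are rejected because their middle character is not in the set, and "toon" is rejected because it has two characters where the brackets allow one.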
If you want to match "toon" in addition to all the words matched in the previous section, you can use the "|" notation, which is basically an OR operator. To match "toon", use the regular expression "t(a|e|i|o|oo)n". You cannot use the bracket notation here because it will only match a single character. Instead, use parentheses -- "()". You can also use parentheses for groupings (more on that later):