Introduction to regular expressions in Java

Regular expressions are a bit like Marmite: people either love them or hate them. Essentially, they are a compact syntax for describing patterns to match against text.

Pattern matching by hand-coding or with regular expressions

Suppose you want to answer the question: does a given string contain a run of 10 consecutive digits? You could hand-code this: walk through the characters in the string, keeping a count of how many digits you have seen in a row. Each time you hit a non-digit, reset the count to zero; as soon as the count reaches ten, you have your answer. So in Java, the code would look something like this:

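public boolean hasTenDigits(String s) {
  // Number of consecutive digits seen immediately before the current position
  int noDigitsInARow = 0;
  for (int len = s.length(), i = 0; i < len; i++) {
    char c = s.charAt(i);
    if (Character.isDigit(c)) {
      // Extend the current run; stop as soon as it reaches ten
      if (++noDigitsInARow == 10) {
        return true;
      }
    } else {
      // A non-digit breaks the run, so start counting again
      noDigitsInARow = 0;
    }
  }
  return false;
}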

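The strengths and weaknesses of this code are obvious: the logic is easy to follow without knowing any special syntax, but it is a lot of code for such a simple check.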

Doing the same thing with a regular expression looks something like this:

public boolean hasTenDigits(String s) {
  return s.matches(".*[0-9]{10}.*");
}
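
As an aside, here is one possible variation on the version above. It is only a sketch: the class name TenDigitsSketch and the constant TEN_DIGITS are made up for illustration. Instead of wrapping the pattern in .* and calling String.matches(), it compiles the pattern once and asks Matcher.find() whether the pattern occurs anywhere in the string. One practical difference is that, by default, the dot in a Java regular expression does not match line terminators, so the .* form can fail on multi-line input where find() would still succeed. Note also that [0-9] only matches the ASCII digits, whereas Character.isDigit() in the hand-coded version accepts other Unicode digits too.

```java
import java.util.regex.Pattern;

class TenDigitsSketch {
  // Compiled once and reused on every call; the name is illustrative only.
  private static final Pattern TEN_DIGITS = Pattern.compile("[0-9]{10}");

  // Returns true if s contains ten ASCII digits in a row anywhere inside it.
  static boolean hasTenDigits(String s) {
    // find() searches for the pattern anywhere in the input,
    // so no leading or trailing .* is needed.
    return TEN_DIGITS.matcher(s).find();
  }

  public static void main(String[] args) {
    System.out.println(hasTenDigits("call 0123456789 now")); // true
    System.out.println(hasTenDigits("12345-67890"));         // false
  }
}
```

Whether this is preferable to the one-line String.matches() call depends on the situation: it is slightly longer, but it avoids recompiling the expression on each call.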

You'll probably agree that we've more or less reversed the above two points. Now, we have a nice succinct piece of code, but it does rely on you understanding a nasty piece of syntax. The argument in favour of regular expressions is:

On the next page, we'll get going with basic expressions with String.matches().

In case you already know something about regular expressions and want to skip ahead, here are some of the later topics currently covered by this tutorial:

Regular expression examples

Finally, we'll look at a couple of examples of using regular expressions: