Alternatives in capturing groups

Recall the pipe operator which is used to give a list of alternative expressions. For example, the following expression matches any three-letter month name:

jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec

When used inside a group, the pipe separates options that match the group. So to match a date of the format 10 jun 1998 etc, we can use the following expression:

([0-9]{2}).(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec).([0-9]{4})

(Note here we use the dot to match any character between the elements of the date.)

Optional groups

Using the ? operator, an entire group can be made optional. For example, the following expression will capture either one or two digits:

Pattern p = Pattern.compile("([0-9])([0-9])?");
Matcher m = p.matcher(str);
if (m.matches()) {
  String digit1 = m.group(1);
  String digit2 = m.group(2);
}

If str consists of two digits, then they will be captured as digit1 and digit2 respectively. If the string only contains one digit, then it will still match, and digit1 will contain that digit. But digit2 will be null.

In general, optional groups will contain null if not present in the string being matched.

Groups for the sake of pattern organisation

From the above, it may well have occurred to you that a group can be used simply to organise a regular expression. For example, if we want to make a particular sub-part optional, or to apply a choice to a particular part of the expression.

If grouping is purely to organise the expression, then it is possible to use what is called a non-capturing group which avoids the overhead of capturing.


If you enjoy this Java programming article, please share with friends and colleagues. Follow the author on Twitter for the latest news and rants.

Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.