Character classes revisited: named classes

Recall that character classes are the 'choices', written in square brackets, that we match against a single character. For example, to match any digit, we have been using [0-9]. Until now we have always "spelled out" the characters or range of characters in this way. For certain common character choices (and some uncommon ones) there's an easier option: we can write a backslash followed by a character class name. For example, to match a single digit, we can write the expression \d. So we can now write our 'contains ten digits' method as follows:
public boolean containsTenDigits(String str) {
    return str.matches(".*\\d{10}.*");
}
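As a quick check of the method above, the following sketch compares the \d version with the equivalent [0-9] version. The class name DigitClassDemo, the second method and the sample strings are made up for illustration. Note that the pattern asks for ten digits in a row somewhere in the string, so digits separated by a space do not count.

```java
class DigitClassDemo {

    // The method from the text: true if str contains ten consecutive digits.
    static boolean containsTenDigits(String str) {
        return str.matches(".*\\d{10}.*");
    }

    // The same test with the character class spelled out as a range.
    static boolean containsTenDigitsSpelledOut(String str) {
        return str.matches(".*[0-9]{10}.*");
    }

    public static void main(String[] args) {
        // Hypothetical sample inputs, chosen only to exercise both methods.
        String[] samples = { "Call 0123456789 now", "only 12345 here", "12345 67890" };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + containsTenDigits(s)
                    + " / " + containsTenDigitsSpelledOut(s));
        }
    }
}
```

With Java's default settings, both methods should agree on every input, since \d then matches only the ASCII digits 0 to 9.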
Notice that when we want to put a backslash inside a regular expression, we have to write a double backslash. This is because the backslash already has a meaning inside Java string literals (allowing us to write so-called escape sequences such as \n for a newline).

Matching whitespace

Another useful character class is \s. This matches so-called whitespace: spaces, tabs and line breaks (strictly speaking, ASCII character 10, the newline character, and character 13, the carriage return). ASCII characters 11 and 12 (vertical tab and form feed) also count as whitespace, but in practice these are extremely rare nowadays. Again, to write \s inside a string literal, we need to double the backslash: "\\s".

Named groups

Various character classes can be built from named groups, which are written \p{name}, where name is one of a number of possible group names. Here are some of the most useful groups:
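A minimal sketch of \s and \p{name} in use follows. The group names Upper, Lower and Punct are taken from the POSIX-style classes documented for java.util.regex.Pattern and are not necessarily the ones the author had in mind. The class name NamedClassDemo and its methods are invented for the example.

```java
class NamedClassDemo {

    // Collapse each run of whitespace (spaces, tabs, line breaks) into one space.
    static String collapseWhitespace(String str) {
        return str.replaceAll("\\s+", " ");
    }

    // True if str is one upper-case ASCII letter followed by
    // one or more lower-case ASCII letters.
    static boolean isCapitalisedWord(String str) {
        return str.matches("\\p{Upper}\\p{Lower}+");
    }

    // Remove every ASCII punctuation character.
    static String stripPunctuation(String str) {
        return str.replaceAll("\\p{Punct}", "");
    }

    public static void main(String[] args) {
        System.out.println(collapseWhitespace("one \t two\r\nthree")); // one two three
        System.out.println(isCapitalisedWord("Hello"));                // true
        System.out.println(isCapitalisedWord("hello"));                // false
        System.out.println(stripPunctuation("Hello, world!"));         // Hello world
    }
}
```

As with \d and \s, each backslash is doubled because the patterns are written as Java string literals.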
Written by Neil Coffey. Copyright © Javamex UK 2008. All rights reserved.