Regular expression example: IP location (ctd)

We now have a total of three expressions to extract the country code from Yahoo and Google referrer strings. To get the country code from a referrer string, we simply try matching the string against each pattern in turn. Since in each case the captured country code will be group 1, we can declare a single Matcher variable, which we reassign to a matcher for the next pattern each time a match fails. The code looks something like this. Note that we write it in such a way as to avoid calling matches() more than once on the same matcher:
    Pattern pGoogle1 =
        Pattern.compile("(?:http://)?www\\.google\\.com/.*hl=([a-z]{2}).*");
    Pattern pGoogle2 = Pattern.compile("(?:http://)?" +
        "www\\.google(?:\\.com|\\.co)?\\.([a-z]{2})/.*");
    Pattern pYahoo = Pattern.compile("(?:http://)?" +
        "([a-z]{2})\\.search\\.yahoo\\.com/.*");

    public String guessCountryCode(String referrer) {
        Matcher m = pGoogle1.matcher(referrer);
        if (!m.matches()) {
            m = pGoogle2.matcher(referrer);
            if (!m.matches()) {
                m = pYahoo.matcher(referrer);
                if (!m.matches()) {
                    return null;
                }
            }
        }
        String code = m.group(1).toUpperCase();
        if ("UK".equals(code)) {
            code = "GB";
        }
        return code;
    }
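The nested if statements above gain a level for every new pattern. The following is a minimal sketch of a flatter alternative, assuming the same three patterns are held in an array and tried in order. The class name ReferrerCountryGuesser, the static layout and the use of Locale.ROOT for the upper-casing are illustrative choices, not part of the original code:

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class; the name is an assumption for illustration.
class ReferrerCountryGuesser {
    // The three patterns from the example above, tried in this order.
    // In every pattern the country code is capturing group 1.
    private static final Pattern[] PATTERNS = {
        Pattern.compile("(?:http://)?www\\.google\\.com/.*hl=([a-z]{2}).*"),
        Pattern.compile("(?:http://)?www\\.google(?:\\.com|\\.co)?\\.([a-z]{2})/.*"),
        Pattern.compile("(?:http://)?([a-z]{2})\\.search\\.yahoo\\.com/.*")
    };

    static String guessCountryCode(String referrer) {
        for (Pattern p : PATTERNS) {
            // A fresh matcher per pattern, so matches() is called
            // exactly once on each matcher.
            Matcher m = p.matcher(referrer);
            if (m.matches()) {
                String code = m.group(1).toUpperCase(Locale.ROOT);
                // Correct suffixes that are not standard country codes.
                return "UK".equals(code) ? "GB" : code;
            }
        }
        return null; // no pattern matched
    }
}
```

As in the original method, the first pattern that matches wins and a referrer that matches none of them yields null. Adding a further search engine then only means adding one more entry to the array.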
Of course, if we had a large number of Patterns (as may well happen in real life), we would probably want to put them in an array and cycle through them in a loop. Notice that at the end of this method we can put in any corrections necessary to turn the domain suffixes and/or language codes into standard country codes (e.g. the domain suffix uk needs converting because the standard country code for the UK is GB).

Written by Neil Coffey. Copyright © Javamex UK 2008. All rights reserved.