Introduction to networking in Java

In this section, we look at how to perform various networking operations in Java. Many aspects of networked I/O in Java are very similar to Java I/O in general.

Example: how to download data from a URL

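One of the most common networked operations that people want to perform in Java is to download data from a particular URL. This is generally a straightforward task. In its simplest form, the general procedure is to construct a URL object, open a connection to it, get an input stream from the connection, read the data from that stream, and finally close the stream.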

Constructing a URL object

We can construct a URL object simply by passing it the string representation of the URL, as would appear in a browser address bar:

try {
  URL ur = new URL("http://www.mydomain.com/myfile.gif");
  // do something with the URL...
} catch (IOException ioex) {
  ...
}

Notice that we catch IOException. Constructing a URL could throw a type of IOException, specifically MalformedURLException. Since we're likely to use the URL in order to connect to it (an operation that could also throw IOExceptions), it's often simpler to just catch any type of IOException around the whole operation.
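To illustrate the point about exceptions, here is a minimal sketch showing that a string with no protocol is rejected when the URL is constructed, before any connection is attempted, and that the resulting MalformedURLException can be caught as a plain IOException. The class name UrlCheck and the helper isWellFormed() are made up for this example. On recent Java versions the URL(String) constructor is marked as deprecated, although it still works as shown.

```java
import java.io.IOException;
import java.net.URL;

public class UrlCheck {
  // Hypothetical helper: returns true if the string can be parsed as a URL.
  // No network connection is made here; only the syntax and protocol are checked.
  public static boolean isWellFormed(String spec) {
    try {
      new URL(spec);
      return true;
    } catch (IOException ioex) {
      // MalformedURLException is a subclass of IOException, so it lands here
      return false;
    }
  }

  public static void main(String[] args) {
    // Has a protocol ("http"), so construction succeeds
    System.out.println(isWellFormed("http://www.mydomain.com/myfile.gif"));
    // No protocol, so the constructor throws MalformedURLException
    System.out.println(isWellFormed("www.mydomain.com/myfile.gif"));
  }
}
```

Because the check happens in the constructor, a malformed address fails fast, without waiting for any network timeout.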

Reading binary data from a URL

If the URL points to binary data, such as an image, then we essentially want to follow the above pattern, but pull out "raw bytes" from the input stream. If we want to get the bytes into a byte array, then we can use the help of ByteArrayOutputStream. This class lets us feed it successive bytes, then at the end call toByteArray(). So the code could look as follows:

  public static byte[] getBinaryURLContent(URL url) throws IOException {
    URLConnection conn = url.openConnection();
    InputStream in = new BufferedInputStream(conn.getInputStream());
    try {
      ByteArrayOutputStream bout = new ByteArrayOutputStream(10000);
      int b;
      while ((b = in.read()) != -1) {
        bout.write(b);
      }
      return bout.toByteArray();
    } finally {
      in.close();
    }
  }

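Notice that we wrap the connection's stream in a BufferedInputStream, so that the single-byte reads are served from a buffer, and that we close the stream in a finally block, so that it is closed even if an exception occurs while reading.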

Closing the URLConnection?

There's a special contract between the InputStream and the underlying URLConnection that closing one will close the other. So it's sufficient in this case to just close the InputStream.
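As a variation on the method above, the following sketch reads the data in blocks into a byte array rather than one byte at a time, and uses try-with-resources (Java 7 onwards) to close the stream. The class name BlockDownload, the method name readAll() and the 8192-byte buffer size are choices made for this example only. To keep it runnable without a network connection, main() reads from a temporary file through a file: URL; the same method should work for an http: URL.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;

public class BlockDownload {
  // Reads the whole content of the URL into a byte array, a block at a time.
  public static byte[] readAll(URL url) throws IOException {
    URLConnection conn = url.openConnection();
    // try-with-resources closes the stream even if reading fails
    try (InputStream in = conn.getInputStream()) {
      ByteArrayOutputStream bout = new ByteArrayOutputStream(10000);
      byte[] buf = new byte[8192]; // arbitrary block size for this sketch
      int n;
      // read(byte[]) returns the number of bytes read, or -1 at end of stream
      while ((n = in.read(buf)) != -1) {
        bout.write(buf, 0, n);
      }
      return bout.toByteArray();
    }
  }

  public static void main(String[] args) throws IOException {
    // Use a temporary file so that the example runs without network access
    Path tmp = Files.createTempFile("download-demo", ".bin");
    try {
      Files.write(tmp, new byte[] {1, 2, 3, 4, 5});
      byte[] data = readAll(tmp.toUri().toURL());
      System.out.println(data.length + " bytes read");
    } finally {
      Files.deleteIfExists(tmp);
    }
  }
}
```

Since we do our own block reads here, the BufferedInputStream wrapper is no longer needed.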

Reading the contents of a URL as a string (or CharSequence)

Downloading the content of a URL to a string is a common requirement, and is not much different from the binary data case just examined. Essentially, we need to read characters from the URL stream and append them to a string (or in fact, a string buffer of some kind). As of Java 5, we can use a StringBuilder, which is a non-synchronized StringBuffer.

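Apart from the destination of the characters, a key issue is character encoding: that is, the scheme by which bytes are "mapped" to characters. If we're really lucky, the server will tell us which encoding it uses in the charset parameter of the Content-Type header (for example, text/html; charset=UTF-8), which we can read with getContentType(). Note that getContentEncoding() is not the method we want here: it returns the Content-Encoding header, which describes compression such as gzip rather than the character set. We must also be prepared for the possibility that no charset is declared, or that getContentType() just returns null, in which case we need to make an assumption of some kind. For simplicity, we'll just assume a default encoding of ISO-8859-1 (another common encoding scheme being UTF-8):

public static CharSequence getURLContent(URL url) throws IOException {
  URLConnection conn = url.openConnection();
  // Default assumption if the server declares no charset
  String encoding = "ISO-8859-1";
  // The charset, if any, is a parameter of the Content-Type header,
  // e.g. "text/html; charset=UTF-8"
  String contentType = conn.getContentType();
  if (contentType != null) {
    for (String param : contentType.split(";")) {
      param = param.trim();
      if (param.regionMatches(true, 0, "charset=", 0, 8)) {
        // Strip the "charset=" prefix and any surrounding quotes
        encoding = param.substring(8).replace("\"", "").trim();
      }
    }
  }
  BufferedReader br = new BufferedReader(new
      InputStreamReader(conn.getInputStream(), encoding));
  StringBuilder sb = new StringBuilder(16384);
  try {
    String line;
    while ((line = br.readLine()) != null) {
      sb.append(line);
      sb.append('\n');
    }
  } finally {
    br.close();
  }
  return sb;
}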

Note some other points:

In practice, I've also found readLine() to give slightly better performance: possibly because the JVM can compile this whole method and avoid a method call per character.
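For comparison with the readLine() approach, here is a sketch that reads characters in blocks into a char array instead. One difference in behaviour is that the readLine() version above ends every line with '\n', whatever line terminator the server sent, and adds one after the last line, whereas block reads keep the text exactly as decoded. The class name ReadChars, the method name readAll() and the 4096-character buffer are assumptions made for this example. The method takes any Reader, so it could be passed the InputStreamReader built from the connection's input stream; main() uses a StringReader so that it runs without a network connection.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReadChars {
  // Reads everything from the reader into a StringBuilder, a block at a time,
  // then closes the reader.
  public static CharSequence readAll(Reader reader) throws IOException {
    StringBuilder sb = new StringBuilder(16384);
    char[] buf = new char[4096]; // arbitrary block size for this sketch
    try {
      int n;
      // read(char[]) returns the number of chars read, or -1 at end of stream
      while ((n = reader.read(buf)) != -1) {
        sb.append(buf, 0, n);
      }
    } finally {
      reader.close();
    }
    return sb;
  }

  public static void main(String[] args) throws IOException {
    // A StringReader stands in for the reader wrapped around the URL stream
    CharSequence text = readAll(new StringReader("first line\r\nsecond line"));
    System.out.println(text.length() + " characters read");
  }
}
```

Which of the two approaches is faster is likely to depend on the JVM and on the data, so it is worth measuring if performance matters.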