Java Tip 19: Java makes it easy to copy files from a Web site

Take the weight off Web browsing with this simple copy utility

Technology is supposed to make our lives easier. Well, I don't know about you, but I find Web browsing hugely time-consuming. Sure, browsing is much easier now that I have a T1 connection, but raw bandwidth is not the only issue. Many sites that I graze repeatedly are under heavy load and respond quite slowly.

Various commercial entities are touting "offline" Web browsers that will slurp down entire Web sites to your local disk at predetermined times. You can then read those pages without being connected to the 'Net (and without having to pay any connect charges). Being the hacker that I am, instead of using one of these offline browsers, I prefer to learn to do that sort of work myself -- and incorporate it into my bag of tricks.

What does all this have to do with Java? Well, Anil Hemrajani (see Resources) has written a simple copy utility as a Java application that will copy files from a Web site when given a URL. So, while it easily copies files from sites, you can be off doing other things -- like writing your own Java Tip. Here's the source:

////////////////////////////////////////////////////////////////////////////
// Program: copyURL.java
// Author: Anil Hemrajani (anil@patriot.net)
// Purpose: Utility for copying files from the Internet to local disk
// Example: 1. java copyURL http://www.patriot.net/users/anil/resume/resume.gif
//          2. java copyURL http://www.ibm.com/index.html abcd.html
////////////////////////////////////////////////////////////////////////////
import java.net.*;
import java.io.*;
import java.util.Date;
import java.util.StringTokenizer;
class copyURL
{
  public static void main(String args[])
  {
      if (args.length < 1)
      {
          System.err.println
               ("usage: java copyURL URL [LocalFile]");
          System.exit(1);
      }
      try
      {
          URL           url  = new URL(args[0]);
          System.out.println("Opening connection to " + args[0] + "...");
          URLConnection urlC = url.openConnection();
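          // Copy resource to local file; use the remote file name
          // if no local file name is specified. Read from urlC rather
          // than url.openStream(), which would open a second connection.
          InputStream is = urlC.getInputStream();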
          // Print info about resource
          System.out.print("Copying resource (type: " +
                           urlC.getContentType());
          Date date=new Date(urlC.getLastModified());
          // Date.toLocaleString() is deprecated; toString() is used instead
          System.out.println(", modified on: " +
             date.toString() + ")...");
          System.out.flush();
          FileOutputStream fos=null;
          if (args.length < 2)
          {
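              String localFile=null;
              // Get only file name (the last path segment)
              StringTokenizer st=new StringTokenizer(url.getFile(), "/");
              while (st.hasMoreTokens())
                     localFile=st.nextToken();
              // A URL such as http://host/ has no file name to reuse
              if (localFile == null)
                  throw new IOException("no file name in URL; specify LocalFile");
              fos = new FileOutputStream(localFile);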
          }
          else
              fos = new FileOutputStream(args[1]);
          int oneChar, count=0;
          while ((oneChar=is.read()) != -1)
          {
             fos.write(oneChar);
             count++;
          }
          is.close();
          fos.close();
          System.out.println(count + " byte(s) copied");
      }
      catch (MalformedURLException e)
      { System.err.println(e.toString()); }
      catch (IOException e)
      { System.err.println(e.toString()); }
  }
}

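As you can see, the code is pretty straightforward. It builds on the standard URL and URLConnection classes in java.net: the program opens a connection, reports the content type and last-modified date, and copies the resource byte by byte to a local file. Adding more capabilities is left as an exercise for the reader.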
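
As one possible starting point for that exercise, here is a minimal sketch of the same copy done in chunks rather than one byte at a time, with the streams closed automatically even if the copy fails. This is not Anil's code. The class name CopyUrlSketch, the helpers copy and localName, the 8 KB buffer size, and the "index.html" fallback name are all assumptions made for illustration, and the try-with-resources syntax assumes Java 7 or later.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical variant of copyURL; all names here are illustrative.
class CopyUrlSketch
{
  // Copies everything from in to out in chunks; returns the byte count.
  static long copy(InputStream in, OutputStream out) throws IOException
  {
      byte[] buffer = new byte[8192];   // assumed buffer size
      long total = 0;
      int n;
      while ((n = in.read(buffer)) != -1)
      {
          out.write(buffer, 0, n);
          total += n;
      }
      return total;
  }

  // Returns the last segment of a URL path, or fallback if there is none.
  static String localName(String path, String fallback)
  {
      String name = path.substring(path.lastIndexOf('/') + 1);
      return name.isEmpty() ? fallback : name;
  }

  public static void main(String[] args) throws IOException
  {
      if (args.length < 1)
      {
          System.err.println("usage: java CopyUrlSketch URL [LocalFile]");
          System.exit(1);
      }
      URL url = new URL(args[0]);
      URLConnection urlC = url.openConnection();
      // "index.html" is an assumed default for URLs that end in "/"
      String target = (args.length > 1)
          ? args[1]
          : localName(url.getPath(), "index.html");
      // Both streams are closed when the block exits, even on an exception
      try (InputStream is = urlC.getInputStream();
           OutputStream fos = new FileOutputStream(target))
      {
          System.out.println(copy(is, fos) + " byte(s) copied to " + target);
      }
  }
}
```

It is invoked the same way as the original, for example: java CopyUrlSketch http://www.ibm.com/index.html abcd.html. Because copy takes plain streams, it can also be exercised without a network connection.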

Please respect the fact that the content of most Web sites is copyrighted by and was created through the hard work of other people, and act responsibly.

Learn more about this topic

  • Anil Hemrajani's Web site is located at http://adams.patriot.net/~anil/