Various commercial entities are touting "offline" Web browsers that will slurp down entire Web sites to your local disk at predetermined times. You can then read those pages without being connected to the 'Net (and without having to pay any connect charges). Being the hacker that I am, instead of using one of these offline browsers, I prefer to learn to do that sort of work myself -- and incorporate it into my bag of tricks.
What does all this have to do with Java? Well, Anil Hemrajani (see Resources) has written a simple copy utility as a Java application that will copy files from a Web site when given a URL. So, while it easily copies files from sites, you can be off doing other things -- like writing your own Java Tip. Here's the source:
////////////////////////////////////////////////////////////////////////////
// Program: copyURL.java
// Author: Anil Hemrajani (anil@patriot.net)
// Purpose: Utility for copying files from the Internet to local disk
// Example: 1. java copyURL http://www.patriot.net/users/anil/resume/resume.gif
//          2. java copyURL http://www.ibm.com/index.html abcd.html
////////////////////////////////////////////////////////////////////////////
import java.net.*;
import java.io.*;
import java.util.Date;
import java.util.StringTokenizer;

class copyURL
{
    public static void main(String args[])
    {
        if (args.length < 1)
        {
            System.err.println
                ("usage: java copyURL URL [LocalFile]");
            System.exit(1);
        }

        try
        {
            URL url = new URL(args[0]);
            System.out.println("Opening connection to " + args[0] + "...");
            URLConnection urlC = url.openConnection();

            // Read from the connection opened above; calling
            // url.openStream() here would open a second connection
            InputStream is = urlC.getInputStream();

            // Print info about resource
            System.out.print("Copying resource (type: " +
                             urlC.getContentType());
            Date date=new Date(urlC.getLastModified());
            System.out.println(", modified on: " +
                               date.toLocaleString() + ")...");
            System.out.flush();

            // Copy resource to local file, use remote file
            // name if no local file name specified
            FileOutputStream fos=null;
            if (args.length < 2)
            {
                String localFile=null;

                // Get only file name (the last "/"-separated token)
                StringTokenizer st=new StringTokenizer(url.getFile(), "/");
                while (st.hasMoreTokens())
                    localFile=st.nextToken();

                // A URL with no path (e.g. http://host/) yields no token
                if (localFile == null)
                {
                    System.err.println
                        ("No file name in URL; specify LocalFile");
                    System.exit(1);
                }
                fos = new FileOutputStream(localFile);
            }
            else
                fos = new FileOutputStream(args[1]);

            // Copy one byte at a time until end of stream
            int oneChar, count=0;
            while ((oneChar=is.read()) != -1)
            {
                fos.write(oneChar);
                count++;
            }
            is.close();
            fos.close();
            System.out.println(count + " byte(s) copied");
        }
        catch (MalformedURLException e)
        { System.err.println(e.toString()); }
        catch (IOException e)
        { System.err.println(e.toString()); }
    }
}
As you can see, the code is pretty straightforward. Basically, it builds on the capabilities of the standard URL classes. Adding more capabilities is left for the reader to do as an exercise.
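One natural exercise is to copy in blocks instead of one byte at a time, since each single-byte read and write on an unbuffered stream is comparatively expensive. The following is only a sketch of that idea, not part of the original utility: the class name BlockCopySketch, the helpers copy and localNameFor, the 4096-byte buffer, and the "index.html" fallback name are all assumptions made for illustration. The detail that matters is writing only the number of bytes actually read on each pass; writing the whole buffer every time would pad the output with stale data and corrupt binary files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;

// Hypothetical helper class; not part of the original copyURL program.
class BlockCopySketch
{
    // Copies everything from in to out in blocks and returns the byte count.
    static long copy(InputStream in, OutputStream out) throws IOException
    {
        byte[] buf = new byte[4096];   // buffer size is an arbitrary choice
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1)
        {
            // Write only the n bytes just read, never the whole buffer
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    // Returns the last path segment of the URL, or "index.html" (an assumed
    // fallback) when the URL has no file name, as in http://host/
    static String localNameFor(URL url)
    {
        String path = url.getPath();
        String name = path.substring(path.lastIndexOf('/') + 1);
        return name.isEmpty() ? "index.html" : name;
    }

    // Small demonstration that uses in-memory data rather than the network
    public static void main(String[] args) throws IOException
    {
        byte[] data = new byte[10000];
        for (int i = 0; i < data.length; i++)
            data[i] = (byte) i;

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink);
        System.out.println(copied + " byte(s) copied");
        System.out.println(localNameFor(new URL("http://www.ibm.com/index.html")));
    }
}
```

In the utility itself, a loop like this would stand in for the single-byte read/write loop, with the connection's input stream and the FileOutputStream passed as the two arguments.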
Great work. By Anonymous on February 27, 2010, 2:22 pm: To Anonymous on June 16, 2009, 1:59 pm: It's faster indeed, but it returns corrupted data; try copying a PDF file.
Thanks! By Anonymous on November 7, 2009, 12:09 pm: I'm writing some code to grab files from a database index and analyze them for a school project, so this is just the amount of code I needed!
Inefficient. By Anonymous on June 16, 2009, 1:59 pm: Reading/writing one byte at a time is very slow. Blocking will speed things up a lot... byte[] buf = new byte[4096]; int size = 0; while((size = is.read(buf))...
Thanks for the help. By Anonymous on April 27, 2009, 2:23 pm: Thanks for the help; your explanation was very useful to me. I only had to set up the proxy authentication and adjust two more things, and that was it. Excellent.