Write your own Twitter application

Archive your tweets with Apache Commons HttpClient, dom4j, and iText

That buzz you've been hearing is the sound of millions of Twitter updates -- tweets -- careening around cyberspace. Even cooler than the Twitter social-networking service itself is the fact that Twitter data is exposed in an open API that your applications can tap into. With iText creator Bruno Lowagie as your guide, find out how to leverage three open source Java libraries to archive tweets dynamically in a PDF document. You'll write standalone Java code, then integrate it into a servlet so that you can offer the application as a service to other Twitter users.

The Twitter social-networking and micro-blogging service has become immensely popular since it launched in 2006. Twitter users send tweets -- real-time, text-based updates of up to 140 characters -- and read other users' tweets via the Twitter Website, Short Message Service (SMS), Really Simple Syndication (RSS), or Twitter applications. Tweets are displayed on the user's profile page and delivered to other users who have signed up to receive them. In this article you'll learn how to build your own Twitter service: an application that accesses tweets via the Twitter API and archives them in the form of a PDF file.

You'll build your application with the help of three open source Java libraries:

  • HttpClient 3.x from the Apache Commons library. You'll use this API to obtain an XML stream of tweets from the Twitter API. HttpClient in turn needs the Commons Logging and Commons Codec components.
  • dom4j to parse the XML and extract specific data from each tweet.
  • iText to create the PDF document.

This is a hands-on tutorial, so download those libraries if you haven't already done so. I'll remind you which JARs you need from them as we go along. If you don't already have a Twitter account, set one up now and start using the service so that you have some tweets to work with.

You'll begin by writing your Twitter application as a standalone Java program. Eventually you'll integrate the code in a servlet, so that you can offer the application on your site as a service available to other Twitterati.

Getting started

Every Twitter application makes use of the Twitter API, which is well documented on the Twitter API wiki. Let's suppose you're just interested in read-only access for now: you want to retrieve tweets and visualize them in some way. You don't know yet that you're going to produce a PDF document; you only know that you're going to "consume" a number of tweets. That's why you start by writing an interface for such a consumer, as shown in Listing 1.

Listing 1. TweetConsumer.java

import org.dom4j.Element;

public interface TweetConsumer {
    // Handles one tweet element and returns its tweet ID.
    public String tweet(Element element) throws TweetException;
}

The Twitter API can provide tweets in XML, JSON (JavaScript Object Notation), and RSS. For this example you choose to work with XML, and you'll use dom4j to parse the XML. (Don't forget to add the dom4j JAR to your classpath.) You're importing the org.dom4j.Element interface. This XML element will contain plenty of information: a date, some text, a user ID, and also a tweet ID. The tweet ID is the String you'll use as the return value for the tweet() method.
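To make the shape of that data concrete, here is a minimal sketch that pulls the four fields out of a single tweet element. It is not the article's code: it deliberately uses the JDK's built-in DOM parser instead of dom4j so that it runs without any extra JARs. The sample XML is an assumption modeled on the fields listed above (the element names status, created_at, id, text, and user are not taken from this article), and the names StatusFieldsSketch, child, parse, and tweetId are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class StatusFieldsSketch {

    // Hypothetical sample data; the element names are an assumption.
    static final String SAMPLE =
        "<status>"
        + "<created_at>Mon Apr 06 10:15:00 +0000 2009</created_at>"
        + "<id>1462345678</id>"
        + "<text>Archiving my tweets in a PDF</text>"
        + "<user><id>12345</id><screen_name>example</screen_name></user>"
        + "</status>";

    // Returns the first direct child element with the given name, or null.
    // Only direct children are checked because, in this sample, the tweet
    // and the nested user both carry an <id> element.
    static Element child(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    // Parses an XML string and returns its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Mirrors the contract of tweet(): read the element, return the tweet ID.
    static String tweetId(Element status) {
        return child(status, "id").getTextContent();
    }

    public static void main(String[] args) {
        Element status = parse(SAMPLE);
        System.out.println("date:     " + child(status, "created_at").getTextContent());
        System.out.println("text:     " + child(status, "text").getTextContent());
        System.out.println("user ID:  " + child(child(status, "user"), "id").getTextContent());
        System.out.println("tweet ID: " + tweetId(status));
    }
}
```

A dom4j-based implementation of TweetConsumer would follow the same pattern: read whatever fields it needs from the element it is handed, then return the tweet ID to the caller.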

You throw a typed exception when something goes wrong. That's the TweetException class shown in Listing 2.

Listing 2. TweetException.java

public class TweetException extends Exception {
    private static final long serialVersionUID = 7577136074623618615L;
    public TweetException(Exception e) {
        super(e);
    }
}
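
A wrapping exception like this lets callers catch a single type while the original failure stays available as the cause. The sketch below shows one way that might look in practice. It assumes the constructor simply passes the wrapped exception to super(e); the class TweetExceptionSketch and the method fetch are hypothetical stand-ins and not part of the application.

```java
import java.io.IOException;

public class TweetExceptionSketch {

    // Same shape as Listing 2, nested here so the sketch is self-contained.
    // Assumption: the constructor hands the cause to the superclass.
    static class TweetException extends Exception {
        private static final long serialVersionUID = 1L;

        TweetException(Exception e) {
            super(e);
        }
    }

    // Hypothetical stand-in for code that reads tweets and can fail on I/O.
    static String fetch(boolean fail) throws TweetException {
        try {
            if (fail) {
                throw new IOException("connection refused");
            }
            return "ok";
        } catch (IOException e) {
            // Translate the low-level failure into the typed exception.
            throw new TweetException(e);
        }
    }

    public static void main(String[] args) {
        try {
            fetch(true);
        } catch (TweetException e) {
            // The original exception is still reachable through getCause().
            Throwable cause = e.getCause();
            System.out.println("Caught: " + cause.getClass().getSimpleName()
                + " - " + cause.getMessage());
        }
    }
}
```

Run as written, this prints "Caught: IOException - connection refused": the caller deals only with TweetException, yet nothing about the underlying problem is lost.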

This is all standard stuff, but now comes the interesting part: the TweetProducer class.
