Create and print multilingual PDF documents on the client

Use FOP to convert XML to a PDF

The use of intranets has enabled employees operating in offices spread across the globe to access a centralized application at their company headquarters. The ability to create and print documents over their company's intranet would prove to be valuable functionality for such employees.

With such functionality in place, what happens if a user (say, in Japan or India) wants to print hundreds of PDF documents on a default printer from a centralized application that runs on servers in the U.S.? Opening each PDF with Acrobat Reader and printing the documents individually would be both time consuming and tedious. A solution that sends multiple PDFs straight to the default printer, so the user doesn't need to open Acrobat Reader, would prove most helpful.

Formatting Objects Processor (FOP) is an open source Java API that converts XML data directly into reports in PDF format. The software is developed under the Apache XML Graphics Project and is free to use. This article shows you how to convert raw data (from Oracle Database) to XML, which, in turn, is converted to reports in PDF format. It assumes the reader has a basic understanding of technologies like Java, XML, BEA WebLogic Server, Enterprise JavaBeans (EJB), FOP, Extensible Stylesheet Language Transformations (XSLT), and Oracle Database.

The figure below gives an overview of the flow of data. The XML data along with the XSL file forms the input to the FOP processor, which in turn is configured using the userconfig.xml for Unicode fonts.

Overview of Unicode XML to PDF document

Let's assume that the application is an n-tier application that uses a thin client (written with JavaServer Pages (JSP) and Struts), WebLogic, and Oracle Database. The data retrieved from the database is processed on WebLogic by EJB components and servlets, and delivered to the client via JSP pages and Struts.

In our case, we need to process a request from the client and send data back to the client as a PDF file, which, in turn, is queued to the local default printer. In an intranet setup like ours, it's best to use the HTTP protocol and set up communication with a servlet to do the necessary work. A servlet, in general, can return any type of data, not just an HTML page. Since we have a thin client, we must use an applet on the client that has code within it to call this servlet.

Let's give the names PrintApplet and PrintServlet to our applet and servlet, respectively. The applet needs to be signed so that it's trusted and has no problems when users within the intranet access the page containing it. Also, we know that this applet will make HTTP calls to the PrintServlet on the application server.

Let's see what code needs to go within the applet. The flow of the PrintApplet is shown below. Let's assume the user has a search result set displayed on the browser page and wants to print a PDF of one of the records. The user selects a record and clicks the Print button.

PrintApplet

 

PrintApplet.java code:

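/**
 * Import the JDK classes the applet uses, then all FOP classes.
 * Check http://xml.apache.org/fop/ for details.
 */
import java.awt.Graphics;
import java.awt.print.PrinterJob;
import java.io.PrintWriter;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLEncoder;
import java.util.Locale;
import javax.swing.JApplet;

import org.xml.sax.InputSource;
import org.apache.fop.apps.Driver;
import org.apache.fop.apps.XSLTInputHandler;
import org.apache.fop.apps.TraxInputHandler;
import org.apache.fop.messaging.MessageHandler;
import org.apache.fop.render.awt.AWTRenderer;
import org.apache.avalon.framework.logger.Logger;
import org.apache.avalon.framework.logger.ConsoleLogger;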

public class PrintApplet extends JApplet {

    org.apache.avalon.framework.logger.Logger log = null;
    private PrinterJob pj = null;

    // Initialize the applet.
    public void init() {
        pj = PrinterJob.getPrinterJob();
        pj.setCopies(1);
        Locale local = this.getLocale();
    }

    public void paint(Graphics g) { }

    // Get Applet information.
    public String getAppletInfo() {
        return "Applet Information";
    }

    // Get parameter info.
    public String[][] getParameterInfo() {
        return null;
    }

    public void callFromHTML(Object argument1) {
        try {
            // Ask the servlet to create the XML for the selected record.
            URL url = new URL("http://myserver:7011/web/PDFServlet?"
                    + "QString1" + "=" + URLEncoder.encode(String.valueOf(argument1), "UTF-8"));

            URLConnection conn = url.openConnection();
            conn.setDoInput(true);
            conn.setDoOutput(true);
            conn.setUseCaches(false);
            conn.setRequestProperty("Content-type", "application/x-www-form-urlencoded");
            PrintWriter toServlet = new PrintWriter(conn.getOutputStream());
            toServlet.flush();
            toServlet.close();
            // Reading the response makes sure the request has actually been sent
            // and the servlet has finished before FOP fetches the XML.
            conn.getInputStream().close();

            /**
             * In the following 2 code lines, FOP is set to use the
             * userconfig.xml file for its configuration.
             */
            URL userConfig = new URL("http", "myserver", 7011, "/web/userconfig.xml");
            org.apache.fop.apps.Options options =
                    new org.apache.fop.apps.Options(userConfig.openStream());

            org.apache.fop.apps.XSLTInputHandler input =
                    new org.apache.fop.apps.XSLTInputHandler(
                            "http://myserver:7011/web/myXML.xml",
                            "http://myserver:7011/web/myXSL.xsl");

            PrintRenderer renderer = new PrintRenderer(pj);
            Driver driver = new Driver();
            driver.setLogger(log);
            driver.setRenderer(renderer);
            driver.render(input.getParser(), input.getInputSource());
        } catch (Exception e) {
            e.printStackTrace();
        }

        this.getAppletContext().showStatus(" Documents have been printed.");

    }

    /**
     * The PrintRenderer class is obtained from FopPrintServlet.java
     * example in FOP jar file. Get the complete class PrintRenderer here.
     */
    class PrintRenderer extends AWTRenderer {

        . . .

        PrintRenderer(PrinterJob printerJob) {
            super(null);

            this.printerJob = printerJob;
            startNumber = 0;
            endNumber = -1;

            printerJob.setPageable(this);

            mode = EVEN_AND_ALL;
            String str = System.getProperty("even");
            if (str != null) {
                try {
                    mode = Boolean.valueOf(str).booleanValue() ? EVEN : ODD;
                } catch (Exception e) {}
            }
        }

        . . . .

    } // Class PrintRenderer.

}
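The query string that callFromHTML() sends to the servlet has to be URL-encoded, because a record key may contain spaces, slashes, or non-ASCII characters. The following is a minimal sketch of that one step in isolation, using only the JDK. The class name ServletUrlSketch, the method buildServletUrl(), and the sample record key are assumptions made for illustration; they are not part of the original applet.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the original PrintApplet.
class ServletUrlSketch {

    // Builds "<servletUrl>?<paramName>=<value>" with the value
    // percent-encoded as UTF-8.
    static String buildServletUrl(String servletUrl, String paramName, Object recordKey) {
        try {
            return servletUrl + "?" + paramName + "="
                    + URLEncoder.encode(String.valueOf(recordKey), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "INV 001/A" is a made-up record key.
        System.out.println(buildServletUrl(
                "http://myserver:7011/web/PDFServlet", "QString1", "INV 001/A"));
    }
}
```

For the sample key, the space is encoded as + and the slash as %2F, so the servlet receives the key unchanged after decoding.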

The following steps briefly explain the creation of PDF files on the server and then how to print them on the client-side printer.

  1. Note that the applet methods can be called from within an HTML page, which embeds the applet. Once the user hits Print, a JavaScript function is called, which calls the PrintApplet's callFromHTML() method.

    Sample JavaScript that calls the applet method callFromHTML() is shown below:

    function callApplet(recordKey)
    {
       document.applets["PrintApplet"].callFromHTML(recordKey);
    }
    
    
  2. In the PrintApplet code above, when the applet is initialized, it receives a reference to the PrinterJob object in the init() method. This object provides the handle to the default printer set up on the client machine.
  3. In the callFromHTML() method, the first step is to make a URL connection to the PrintServlet, which is passed the record key. The servlet, in turn, gets the record details by talking to the necessary beans and creates the equivalent XML file. Note that the XML is created using weblogic.xml.stream.* classes, since the Web application in this example uses WebLogic. An XMLOutputStream is created, and start, character-data, and end elements are added to the output stream using the ElementFactory class:

    iXMLOutputStream = factory.newOutputStream(new OutputStreamWriter(
            new FileOutputStream(XMLFileName), "UTF-8"));
    iXMLOutputStream.add(ElementFactory.createStartElement(aElement));
    iXMLOutputStream.add(ElementFactory.createCharacterData(aCData));
    iXMLOutputStream.add(ElementFactory.createEndElement(aElement));
    
    

    The output stream is encoded as UTF-8, which can represent the characters of languages such as French and Japanese.

  4. Once the XML document has been created on the server, we must pass this XML to the XSLTInputHandler class to create the XSL-FO document, which is input to the FOP driver. Note that the XSL file can also reside on the server. The XSLTInputHandler can take string URLs as its constructor arguments, so the XML and XSL files on the server can be accessed from within the applet using the HTTP protocol.
  5. The final step is to pass the renderer and the XSL-FO information from XSLTInputHandler to the FOP driver. The renderer, in this case, is the default client printer, and the XSL-FO source is obtained from the XSLTInputHandler class.
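Step 3 relies on the WebLogic-specific weblogic.xml.stream classes. As a rough, server-independent sketch of the same idea (writing XML through a UTF-8 encoded stream so that Japanese text survives), the following uses the JDK's standard StAX API (javax.xml.stream) instead. This is a swapped-in technique, not the code used by the application, and the class name, the element names, and the sample values are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical stand-in for the servlet's XML generation.
class Utf8XmlSketch {

    // Returns a small invoice document encoded as UTF-8 bytes.
    static byte[] toUtf8Xml(String label, String value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            XMLStreamWriter xml =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
            xml.writeStartDocument("UTF-8", "1.0");
            xml.writeStartElement("invoice");   // start element
            xml.writeStartElement("label");
            xml.writeCharacters(label);         // character data
            xml.writeEndElement();              // end element
            xml.writeStartElement("value");
            xml.writeCharacters(value);
            xml.writeEndElement();
            xml.writeEndElement();
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // The label is a Japanese string written as Unicode escapes.
        byte[] bytes = toUtf8Xml("\u9867\u5ba2\u540d", "Sample Customer");
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

The important design point is the same as in the WebLogic version: the encoding is fixed explicitly when the stream is created, rather than left to the platform default.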

Note that the XML file created on the server is basically useless once the PDF has been created or printed. If a number of PDFs are requested by many clients each day, then the XML might take up some storage space on the server, which is undesirable. To avoid this situation, create the XML files in the /tmp directory on the Unix box that runs the server. This /tmp directory is cleaned at the end of each day, so the XML files are periodically deleted. For multiple PDF documents, the applet method must be called multiple times from the JavaScript with unique record keys.
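As a hedged sketch of this housekeeping approach, the following writes each generated XML document to a uniquely named file in the JVM's temporary directory (java.io.tmpdir, which is typically /tmp on Unix). The class and method names and the invoice- file prefix are assumptions; the article does not show how the servlet names its files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for storing short-lived XML files.
class TempXmlSketch {

    // Writes the XML to a new, uniquely named file in java.io.tmpdir
    // and returns its path.
    static Path writeTempXml(String recordKey, String xml) {
        // Keep the record key from introducing path separators.
        String safeKey = recordKey.replaceAll("[^A-Za-z0-9_-]", "_");
        try {
            Path file = Files.createTempFile("invoice-" + safeKey + "-", ".xml");
            Files.write(file, xml.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeTempXml("INV/001", "<invoice/>");
        System.out.println("Wrote " + file);
    }
}
```

Because every call produces a distinct file name, concurrent print requests for the same record do not overwrite each other's XML.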

Internationalizing the PDFs

So far we have discussed how we can create English PDFs on the fly. Let's now add a multilingual dimension to our discussion. Let's assume the users are based in Japan and want to have their invoice PDF labels in Japanese (the same logic can be applied to dynamic data coming from the database). Let's assume for simplicity's sake that the invoice PDF has only two labels: "Customer name" and "Customer phone." Let's say the invoice data table is named "Invoice." First, we need a column in the Invoice table that stores language information for that invoice record. In addition, a table called "Template" stores the Japanese equivalents of the labels.

These steps explain in detail how to handle Unicode fonts so they display correctly on the PDF document:

  1. Before the XML file is created, as described above in Steps 3 and 4, the bean determines the language of the XML data. In our case, it's Japanese, as determined from the language column of the Invoice record.
  2. The Template record for the Japanese language is retrieved, and the XML file is created with Japanese equivalents of the two labels.
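
The label lookup in these two steps can be sketched as a simple language-keyed map. This is only an illustration: the real schema of the Template table is not shown in this article, so the class name, the keys customerName and customerPhone, the fallback to English, and the Japanese translations (顧客名 and 顧客電話番号) are all assumptions. The Japanese strings are written as Unicode escapes so that the source compiles regardless of the file's encoding.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the Template table.
class LabelTemplateSketch {

    private static final Map<String, Map<String, String>> TEMPLATES = new HashMap<>();

    static {
        Map<String, String> en = new HashMap<>();
        en.put("customerName", "Customer name");
        en.put("customerPhone", "Customer phone");
        TEMPLATES.put("en", en);

        Map<String, String> ja = new HashMap<>();
        ja.put("customerName", "\u9867\u5ba2\u540d");
        ja.put("customerPhone", "\u9867\u5ba2\u96fb\u8a71\u756a\u53f7");
        TEMPLATES.put("ja", ja);
    }

    // Returns the label for the invoice's language, falling back to
    // English when the language or the key is unknown.
    static String label(String language, String key) {
        Map<String, String> english = TEMPLATES.get("en");
        Map<String, String> template = TEMPLATES.getOrDefault(language, english);
        return template.getOrDefault(key, english.get(key));
    }

    public static void main(String[] args) {
        System.out.println(label("ja", "customerName"));
        System.out.println(label("fr", "customerPhone"));
    }
}
```

In the real application the map would be replaced by a query against the Template table, keyed by the language column of the Invoice record.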

At this point, we have finished creating our XML file with Japanese characters. Now we need to configure FOP to correctly render the double-byte (Unicode) data. There are basically four steps involved in this task:

  1. Install the Unicode TrueType font: In our example, let's use the MSGothic.ttf font set, which is widely used to display Japanese Kanji characters. On a Windows NT system, go to the C:\WINNT\FONTS directory. From the File menu, select Install New Font to install the MSGothic font. In our example, we have a Unix server running WebLogic, and we place the msgothic.ttf file in the same directory as the msgothic.xml font-metrics file and the userconfig.xml configuration file (both explained in the following steps).
  2. Generate a TrueType font-metrics file: FOP has a built-in program, TTFReader, that can be used to create the font-metrics file. Before you run TTFReader, make sure that the classpath is set correctly. To create the font-metrics file, run TTFReader from the command line prompt as follows: java org.apache.fop.fonts.apps.TTFReader C:\windows\fonts\msgothic.ttf msgothic.xml
  3. Register the Unicode font with FOP: Both the font file and the font-metrics file must be registered with FOP so that FOP is aware of them. This is done in the userconfig.xml file, which provides a template for configuring FOP. Update the <fonts> section of the userconfig.xml file. In the example below, "myserver" is the host name, and 7011 is the listening port. The font-metrics and font files are referenced with HTTP URLs. This is an important point, because the applet loads userconfig.xml on the client, so every path in it is resolved on the client as well. If you hardcode the paths using the file:/// protocol or as a directory like c://mydirectory//msgothic.xml, FOP throws an exception saying "File not found." The files exist only on the server, but with the file protocol FOP looks for them on the client.

     

    <fonts>
      <font metrics-file="http://myserver:7011/web/msgothic.xml"
            embed-file="http://myserver:7011/web/msgothic.ttf"
            kerning="yes">
        <font-triplet name="MSGothic" style="normal" weight="normal"/>
        <font-triplet name="MSGothic" style="normal" weight="bold"/>
      </font>
    </fonts>

  4. Change the XSL file: In the XSL file, use the font-family attribute to specify the MSGothic font at the fo:block level. For example:

     <fo:block font-family="MSGothic"> 
    

In the above code samples, we used a couple of files on the application server that are accessed from the client. We can use the application server virtual-directory-mapping feature available for Web applications to store static files like msgothic.ttf in a different root than the document root of the Web application. In the case of WebLogic running on Unix, we map the files to their roots as follows:

 

<virtual-directory-mapping>
  <local-path>/opt/bea/my_projects/project1</local-path>
  <url-pattern>userconfig.xml</url-pattern>
</virtual-directory-mapping>

<virtual-directory-mapping>
  <local-path>/opt/bea/my_projects/project1</local-path>
  <url-pattern>msgothic.xml</url-pattern>
</virtual-directory-mapping>

<virtual-directory-mapping>
  <local-path>/opt/bea/my_projects/project1</local-path>
  <url-pattern>msgothic.ttf</url-pattern>
</virtual-directory-mapping>

<virtual-directory-mapping>
  <local-path>/tmp</local-path>
  <url-pattern>*.xml</url-pattern>
</virtual-directory-mapping>

<virtual-directory-mapping>
  <local-path>/opt/bea/my_projects/project1</local-path>
  <url-pattern>*.xsl</url-pattern>
</virtual-directory-mapping>

Conclusion

You can use XSL and FOP to create and print multilingual PDF documents on thin clients. This article's solution provides for cases where client machines lack the foreign fonts to print PDF documents on the default printer. Any changes to the PDF format require changes to the XSL file only. No changes are required for the framework itself.

I would like to thank Mark Quinto, Mark Blasko, everyone working on the C2C project at Sony Pictures Entertainment, and everyone on the Apache FOP user mailing list.

For the past seven years, Manoj Nair has been working as a consultant for Sony Pictures Entertainment. During his 13 years as an IT consultant, he has been working with Java and related technologies for the past eight years and specializes in developing n-tier Web applications.
