Secure data files embedded in MIDP applications

Protect your MIDP applications from copyright theft

As OTA (over-the-air) technology continues to mature, a new market has opened to software developers. Today, savvy cell phone users can purchase a J2ME MIDP (Mobile Information Device Profile) application and download it into their phones in seconds. In just a couple of years, the wireless software industry has leaped from ring-tones to games to full-fledged applications.

Many consumers have yet to realize how much they can do with their phones; they often associate J2ME applications with frivolous games. However, as consumers upgrade to new generations of phones, a new breed of knowledge-based MIDP applications with practical uses is getting noticed. The list of Handango's best-selling J2ME software is full of knowledge-based applications, such as multilingual dictionaries, vocabulary trainers, Bibles, trivia, and even cocktail recipes. A dictionary program listed less than a year ago has already received close to 26,000 downloads! Considering that thousands of wireless subscribers are willing to pay for a ring-tone, the market potential for practical J2ME applications is enormous.

Unlike games, the value of knowledge-based applications lies in the data. Because hardware constraints keep a MIDP application's code simple, anyone who obtains the data file can easily build a competing application. Although a client-server model, where the data resides in a backend server, can address this concern, such a model proves impractical for most wireless applications. Until the cost of wireless data comes down and wireless connections become more reliable and ubiquitous, embedding the data file into the MIDP JAR is the only option. Therefore, protecting the data file from copyright theft becomes imperative.

A common solution to this issue is to encrypt the data file. In fact, several Java Specification Requests (JSRs) are underway to provide a standard set of APIs for encryption and decryption. However, cryptographic computation is CPU intensive even for a desktop computer running J2SE. Moreover, it will be a couple of years before all new phones support such APIs, and code based on the new APIs will not be backward compatible with current phones. This creates a gray period where an application may run on some phones but fail on others. An alternative is to use third-party cryptography libraries supporting J2ME. One popular open source implementation is Bouncy Castle's lightweight crypto API. But there is a catch: the library is more than 400 KB. Given that most Nokia Java phones in the market have a 64-KB JAR size limitation, this solution is again not viable.

I propose a simple solution to the problem. It adds one extra line of code, and the JAR size penalty is negligible. It has two phases. First, clean up and compress the data file. Second, obfuscate the compressed file so it can't be easily decompressed.

  • Clean up the data file: Many Windows-based editors end each line with \r\n. This is especially the case if you export the data from Excel. The "\r" is redundant since "\n" alone is sufficient to indicate a new line. Removing "\r" and unused spaces before the line break will save you a couple thousand bytes, depending on how many lines are in the data file.

    Note that not all text editors can detect "\r"; UltraEdit is one editor that works. If the data file is large, break it into smaller files to promote quick searches. But keep in mind that smaller text files don't compress as well as larger ones.

  • Compress the data files: The choice of a compression algorithm is limited by the availability of a J2ME compression implementation. Although the compression API is bundled with J2ME Connected Device Configuration (CDC), which is targeted at devices with at least 2 MB of memory, most current cell phone devices support only Connected Limited Device Configuration (CLDC) and the MIDP standard built on top of it, and CLDC does not include that API. The minimum memory requirement for CLDC is 128 KB. Therefore, developers must find third-party libraries or write their own. I found three:
    • JCraft's compression library uses the zlib compression algorithm. It is open source, well documented, and has a large following; however, the library's size is a bit large.
    • Java4Ever has a gzip (GNU zip) implementation (3.27 KB). It is released under an LGPL license and is also open source.
    • I like TinyLine's GZIPInputStream (5.32 KB) because it extends InputStream and follows the same Decorator pattern as other Java stream classes. It supports skip(), mark(), and other basic I/O functions. The tool's author is very responsive to questions and the library is straightforward to use:

        InputStream in = getClass().getResourceAsStream(db);
        in = new GZIPInputStream(in, 256);
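The cleanup and compression steps happen at build time on a desktop machine, so they can rely on the standard java.util.zip classes in J2SE rather than a MIDP library. The following is only a minimal sketch under a few assumptions: the class name DataFilePacker and its methods are hypothetical, the data file is assumed to be UTF-8 text, and tabs before a line break are treated as unused whitespace along with spaces.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Build-time helper for desktop Java (not MIDP). All names here are hypothetical.
final class DataFilePacker {

    // Removes every '\r' and any spaces or tabs that sit directly
    // before a line break or at the very end of the text.
    static String clean(String text) {
        return text.replace("\r", "")
                   .replaceAll("[ \\t]+(?=\\n|\\z)", "");
    }

    // Compresses the bytes into the gzip format that a gzip input stream can read.
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        }
        return buffer.toByteArray();
    }

    // Cleans the text first, then compresses it (assumes UTF-8 encoding).
    static byte[] pack(String text) throws IOException {
        return gzip(clean(text).getBytes(StandardCharsets.UTF_8));
    }
}
```

The bytes returned by pack() would then be written to the file that ships inside the JAR.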

Compression comes with a tradeoff: decompression inevitably increases the application's memory footprint. TinyLine does not publish the additional heap size required by the library, but in my testing, I had to increase my heap size by 35 to 40 KB. A MIDP application that originally ran on a Palm Zire (2 MB) now requires 4 MB of RAM to run. This increase in memory footprint should not be alarming for most applications and devices, but regression testing is definitely recommended.

Compression is only a first line of defense; a software thief can still easily un-JAR the MIDP application and decompress the data files using the right algorithm. I get around this fear by tweaking the compressed data files. First, I rename the data files to .res (or some other bogus file extension, for example, .xls or .mdb), so the compression algorithm is not easily detectable. Then, I open the compressed data files in hex mode with UltraEdit (or any hex editor) and insert (not overwrite) N bytes (e.g., #!wx) at the beginning of the compressed file.
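The hex-editor step can also be scripted on the desktop. The sketch below is one possible way to do it, not the procedure described above: PrefixObfuscator, prepend(), and obfuscateFile() are hypothetical names, and the prefix bytes are whatever secret marker you choose (the article's example is #!wx, so N = 4).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Build-time helper for desktop Java (not MIDP). All names here are hypothetical.
final class PrefixObfuscator {

    // Returns a new array holding the prefix followed by the compressed bytes.
    // The compressed bytes are shifted, not overwritten.
    static byte[] prepend(byte[] prefix, byte[] compressed) {
        byte[] out = new byte[prefix.length + compressed.length];
        System.arraycopy(prefix, 0, out, 0, prefix.length);
        System.arraycopy(compressed, 0, out, prefix.length, compressed.length);
        return out;
    }

    // Reads a compressed file and writes the prefixed copy, for example to a .res file.
    static void obfuscateFile(Path compressedIn, Path obfuscatedOut, byte[] prefix)
            throws IOException {
        byte[] compressed = Files.readAllBytes(compressedIn);
        Files.write(obfuscatedOut, prepend(prefix, compressed));
    }
}
```

Because the bytes are inserted rather than overwritten, the original compressed stream stays intact and simply starts N bytes later in the file.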

Finally, I change the I/O code to skip N bytes before wrapping the stream in GZIPInputStream:

        InputStream in = getClass().getResourceAsStream(db);
        in.skip(4); // N = 4: skip the inserted bytes (#!wx)
        in = new GZIPInputStream(in, 256);
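One detail worth guarding against is that InputStream.skip() is permitted to skip fewer bytes than requested, in which case the decompressor would start in the wrong place. The sketch below shows a more defensive version of the read path. It is an illustration under stated assumptions: it uses the desktop java.util.zip.GZIPInputStream as a stand-in for TinyLine's class so that it runs on J2SE, and ObfuscatedGzipReader, skipFully(), open(), and readAll() are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream; // stand-in for the TinyLine class on a phone

// All names in this class are hypothetical.
final class ObfuscatedGzipReader {

    // Skips exactly n bytes, looping because skip() may skip fewer than asked.
    static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                // No progress: read one byte to detect the end of the stream.
                if (in.read() < 0) {
                    throw new EOFException("data file is shorter than the prefix");
                }
                skipped = 1;
            }
            n -= skipped;
        }
    }

    // Skips the secret prefix, then wraps the rest in a gzip decompressor.
    static InputStream open(InputStream raw, int prefixLength) throws IOException {
        skipFully(raw, prefixLength);
        return new GZIPInputStream(raw, 256);
    }

    // Drains a stream into a byte array using a small buffer.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

In a MIDlet, the raw stream would come from getClass().getResourceAsStream(db), and prefixLength would be the N chosen when the file was modified.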

The N bytes we skip become the secret code between the application and the data file. Because the gzip header no longer sits at the start of the file, standard decompression tools do not recognize the modified file; only code that knows how many bytes to skip can read it. Of course, a determined hacker can always decompile the class files and paw through the obfuscated code to find the number of bytes skipped. But this safeguard should be enough to deter most theft.

The day when mobile devices are powerful enough to support the full J2SE API stack is not far away. But until that day comes, developers building standalone MIDP applications must constantly battle with JAR size limitations and data file security. This article's solution attempts to ameliorate the problems. I admit it is a bit clumsy and labor intensive; in the future, I hope a tool will be available to automate this data compression and obfuscation process.

Simon Ru is a senior software engineer at eBay. He has more than six years of experience developing Java and J2EE applications. He started working with J2ME technology in 2001 and is the author of six commercial J2ME-based educational software projects. The software has received thousands of downloads on Handango. Ru is a graduate of the University of California, Berkeley and a Sun Certified J2EE Developer.
