Java Tip 17: Integrating Java with C++

Learn how to use C++ code from within a Java application and how to call from C++ to a Java object

In this article, I'll discuss some of the issues involved in integrating C++ code with a Java application. After a word about why one would want to do this and what some of the hurdles are, I'll build up a working Java program that uses objects written in C++. Along the way, I'll discuss some of the implications of doing this (such as interaction with garbage collection), and I'll present a glimpse of what we can expect in this area in the future.

Why integrate C++ and Java?

Why would you want to integrate C++ code into a Java program in the first place? After all, the Java language was created, in part, to address some of the shortcomings of C++. Actually, there are several reasons why you might want to integrate C++ with Java:

  • Performance. Even if you're developing for a platform with a just-in-time (JIT) compiler, odds are that the code generated by the JIT runtime is significantly slower than the equivalent C++ code. As JIT technology improves, this should become less of a factor. (In fact, in the near future, good JIT technology may well mean that Java runs faster than the equivalent C++ code.)
  • For reuse of legacy code and integration into legacy systems.
  • To directly access hardware or do other low-level activities.
  • To leverage tools that are not yet available for Java (mature OODBMSes, ANTLR, and so on).

If you take the plunge and decide to integrate Java and C++, you do give up some of the important advantages of a Java-only application. Here are the downsides:

  • A mixed C++/Java application cannot run as an applet.
  • You give up pointer safety. Your C++ code is free to miscast objects, access a deleted object, or corrupt memory in any of the other ways that are so easy in C++.
  • Your code may not be portable.
  • Your built environment definitely won't be portable -- you'll have to figure out how to put C++ code in a shared library on all platforms of interest.
  • The APIs for integrating C and Java are works in progress and will very likely change with the move from JDK 1.0.2 to JDK 1.1.

As you can see, integrating Java and C++ is not for the faint of heart! However, if you wish to proceed, read on.

We'll start with a simple example showing how to call C++ methods from Java. We'll then extend this example to show how to support the observer pattern. The observer pattern, in addition to being one of the cornerstones of object-oriented programming, serves as a nice example of the more involved aspects of integrating C++ and Java code. We'll then build a small program to test our Java-wrapped C++ object, and we'll end with a discussion of future directions for Java.

Calling C++ from Java

What's so hard about integrating Java and C++, you ask? After all, SunSoft's Java Tutorial has a section on "Integrating Native Methods into Java Programs" (see Resources). As we'll see, this is adequate for calling C++ methods from Java, but it doesn't give us enough to call Java methods from C++. To do that, we'll need to do a little more work.

As an example, we'll take a simple C++ class that we'd like to use from within Java. We'll assume that this class already exists and that we're not allowed to change it. This class is called "C++::NumberList" (for clarity, I'll prefix all C++ class names with "C++::"). This class implements a simple list of numbers, with methods to add a number to the list, query the size of the list, and get an element from the list. We'll make a Java class whose job it is to represent the C++ class. This Java class, which we'll call NumberListProxy, will have the same three methods, but the implementation of these methods will be to call the C++ equivalents. This is pictured in the following object modeling technique (OMT) diagram:

A Java instance of NumberListProxy needs to hold onto a reference to the corresponding C++ instance of NumberList. This is easy enough, if slightly non-portable: If we're on a platform with 32-bit pointers, we can simply store this pointer in an int; if we're on a platform that uses 64-bit pointers (or we think we might be in the near future), we can store it in a long. The actual code for NumberListProxy is straightforward, if somewhat messy. It uses the mechanisms from the "Integrating Native Methods into Java Programs" section of SunSoft's Java Tutorial.

A first cut at the Java class looks like this:

     public class NumberListProxy {
        static {
            System.loadLibrary("NumberList");
        }
        NumberListProxy()  {
            initCppSide();
        }
        public native void addNumber(int n);
        public native int size();
        public native int getNumber(int i);
        private native void initCppSide();
        private int numberListPtr_;
             // NumberList*
    }

The static section is run when the class is loaded. System.loadLibrary() loads the named shared library, which in our case contains the compiled version of C++::NumberList. Under Solaris, it will expect to find the shared library "libNumberList.so" somewhere in the $LD_LIBRARY_PATH. Shared library naming conventions may differ in other operating systems.

Most of the methods in this class are declared as "native." This means that we will provide a C function to implement them. To write the C functions, we run javah twice, first as "javah NumberListProxy," then as "javah -stubs NumberListProxy." This automatically generates some "glue" code needed for the Java runtime (which it puts in NumberListProxy.c) and generates declarations for the C functions that we are to implement (in NumberListProxy.h).

I chose to implement these functions in a file called NumberListProxyImpl.cc. It begins with some typical #include directives:

    //
    //
      NumberListProxyImpl.cc
    //
    //
    //  This file contains the C++ code that implements the stubs generated
    //  by "javah -stubs NumberListProxy".
  cf. NumberListProxy.c.
    #include <StubPreamble.h>
    #include "NumberListProxy.h"
    #include "NumberList.h"

<StubPreamble.h> is part of the JDK, and includes a number of important system declarations. NumberListProxy.h was generated for us by javah, and includes declarations of the C functions we're about to write. NumberList.h contains the declaration of the C++ class NumberList.

In the NumberListProxy constructor, we call the native method initCppSide(). This method must find or create the C++ object we want to represent. For the purposes of this article, I'll just heap-allocate a new C++ object, although in general we might instead want to link our proxy to a C++ object that was created elsewhere. The implementation of our native method looks like this:

    void NumberListProxy_initCppSide(struct HNumberListProxy *javaObj)
    {
      NumberList* list = new NumberList();
      unhand(javaObj)->numberListPtr_ = (long) list;
    }

As described in the Java Tutorial, we're passed a "handle" to the Java NumberListProxy object. Our method creates a new C++ object, then attaches it to the numberListPtr_ data member of the Java object.

Now on to the interesting methods. These methods recover a pointer to the C++ object (from the numberListPtr_ data member), then invoke the desired C++ function:

    void NumberListProxy_addNumber(struct HNumberListProxy* javaObj,long v)
    {
      NumberList* list = (NumberList*) unhand(javaObj)->numberListPtr_;
      list->addNumber(v);
    }
    long NumberListProxy_size(struct HNumberListProxy* javaObj)
    {
      NumberList* list = (NumberList*) unhand(javaObj)->numberListPtr_;
      return list->size();
    }
    long NumberListProxy_getNumber(struct HNumberListProxy* javaObj, long i)
    {      NumberList* list = (NumberList*) unhand(javaObj)->numberListPtr_;
      return list->getNumber(i);
    }

The function names (NumberListProxy_addNumber, and the rest) are determined for us by javah. For more information on this, the types of arguments sent to the function, the unhand() macro, and other details of Java's support for native C functions, please refer to the Java Tutorial.

While this "glue" is somewhat tedious to write, it's fairly straightforward and works well. But what happens when we want to call Java from C++?

Calling Java from C++

Before delving into how to call Java methods from C++, let me explain why this can be necessary. In the diagram I showed earlier, I didn't present the whole story of the C++ class. A more complete picture of the C++ class is shown below:

As you can see, we're dealing with an observable number list. This number list might be modified from many places (from NumberListProxy, or from any C++ object that has a reference to our C++::NumberList object). NumberListProxy is supposed to faithfully represent all of the behavior of C++::NumberList; this should include notifying Java observers when the number list changes. In other words, NumberListProxy needs to be a subclass of java.util.Observable, as pictured here:

It's easy enough to make NumberListProxy a subclass of java.util.Observable, but how does it get notified? Who will call setChanged() and notifyObservers() when C++::NumberList changes? To do this, we'll need a helper class on the C++ side. Luckily, this one helper class will work with any Java observable. This helper class needs to be a subclass of C++::Observer, so it can register with C++::NumberList. When the number list changes, our helper class' update() method will be called. The implementation of our update() method will be to call setChanged() and notifyObservers() on the Java proxy object. This is pictured in OMT:

Before going into the implementation of C++::JavaObservableProxy, let me mention some of the other changes.

NumberListProxy has a new data member: javaProxyPtr_. This is a pointer to the instance of C++JavaObservableProxy. We'll need this later when we discuss object destruction. The only other change to our existing code is a change to our C function NumberListProxy_initCppSide(). It now looks like this:

    void NumberListProxy_initCppSide(struct HNumberListProxy *javaObj)
    {
      NumberList* list = new NumberList();
      struct HObservable* observable = (struct HObservable*) javaObj;
      JavaObservableProxy* proxy = new JavaObservableProxy(observable, list);
      unhand(javaObj)->numberListPtr_ = (long) list;
      unhand(javaObj)->javaProxyPtr_ = (long) proxy;
    }

Note that we cast javaObj to a pointer to an HObservable. This is OK, because we know that NumberListProxy is a subclass of Observable. The only other change is that we now create a C++::JavaObservableProxy instance and maintain a reference to it. C++::JavaObservableProxy will be written so that it notifies any Java Observable when it detects an update, which is why we needed to cast HNumberListProxy* to HObservable*.

Given the background so far, it may seem that we just need to implement C++::JavaObservableProxy:update() such that it notifies a Java observable. That solution seems conceptually simple, but there is a snag: How do we hold onto a reference to a Java object from within a C++ object?

Maintaining a Java reference in a C++ object

It might seem like we could simply store a handle to a Java object within a C++ object. If this were so, we might code C++::JavaObservableProxy like this:

    class JavaObservableProxy public Observer {
    public:
        JavaObservableProxy(struct HObservable* javaObj, Observable* obs) {
            javaObj_ = javaObj;
            observedOne_ = obs;
            observedOne_->addObserver(this);
        }
        ~JavaObservableProxy() {
            observedOne_->deleteObserver(this);
        }
        void update()  {
            execute_java_dynamic_method(0, javaObj_, "setChanged",
"()V");
        }
    private:
        struct HObservable* javaObj_;
        Observable* observedOne_;
    };

Unfortunately, the solution to our dilemma is not so simple. When Java passes you a handle to a Java object, the handle] will remain valid for the duration of the call. It will not necessarily remain valid if you store it on the heap and try to use it later. Why is this so? Because of Java's garbage collection.

First of all, we're trying to maintain a reference to a Java object, but how does the Java runtime know we're maintaining that reference? It doesn't. If no Java object has a reference to the object, the garbage collector might destroy it. In this case, our C++ object would have a dangling reference to an area of memory that used to contain a valid Java object but now might contain something quite different.

Even if we're confident that our Java object won't get garbage collected, we still can't trust a handle to a Java object after a time. The garbage collector might not remove the Java object, but it could very well move it to a different location in memory! The Java spec contains no guarantee against this occurrence. Sun's JDK 1.0.2 (at least under Solaris) won't move Java objects in this way, but there are no guarantees for other runtimes.

What we really need is a way of informing the garbage collector that we plan to maintain a reference to a Java object, and ask for some kind of "global reference" to the Java object that's guaranteed to remain valid. Sadly, JDK 1.0.2 has no such mechanism. (One will probably be available in JDK 1.1; see the end of this article for more information on future directions.) While we're waiting, we can kludge our way around this problem.

To safely keep references to Java objects, we can simply store the references in a Java vector. If we make this vector part of a singleton object that's globally available, the list will never get garbage collected, and we'll be able to get to the list from C++. Within a C++ object, the "reference" to the Java object can actually be an index into the singleton Vector. In my example, I've called the class that maintains this vector JavaObjectHolder. The structure of our entire system is shown below:

JavaObjectHolder is a straightforward Java class. It simply has methods to add an object (returning its index), remove an object, and get an object. For simplicity, I made these methods static. The declarations of the methods look like this:

    class JavaObjectHolder {
        public static int addObject(Object o)  {
            ...        }
        public static void removeObject(int handle) {
            ...
        }
        public static Object getObject(int handle) {
            ...
        }
    }

(The complete source of this class, including exception specifications, can be found at the end of this article.)

Now that we have a way of maintaining references to Java objects, we're in a position to actually implement C++::JavaObservableProxy. The header is straightforward:

#if !defined(JavaObservableProxy_h) #define JavaObservableProxy_h #include "Observer.h" #include "Observable.h" class JavaObservableProxy : public Observer { public: JavaObservableProxy(struct HObservable* javaObj, Observable* obs); ~JavaObservableProxy(); void update(); private: int javaObjectId_; Observable* observedOne_; }; #endif

Within the implementation file (JavaObservableProxy.cc), we first define a convenience function to get us the class descriptor of the JavaObjectHolder class:

    static ClassClass* javaObjectHolder()
        // Give a pointer to the class descriptor for JavaObjectHolder
    {
        static ClassClass* result = 0;
        if (result == 0)  {
            result = FindClass(0, "JavaObjectHolder", TRUE);
            assert(result != 0);
        }
        return result;
    }

This is faster than calling the Java library function FindClass() each time.

Next, we need to write the constructor. The constructor simply calls JavaObjectHolder.addObject to convert the handle into an integer that's safe to store on the heap:

    JavaObservableProxy::JavaObservableProxy(
            struct HObservable* javaObj,
            Observable* obs)
    {
        javaObjectId_ = execute_java_static_method(
                            0, javaObjectHolder(), "addObject",
                            "(Ljava/lang/Object;)I",
                            javaObj);
        observedOne_ = obs;
        observedOne_->addObserver(this);
    }

When the C++::JavaObservableProxy is destroyed, we'll need to have the matching call to javaObjectHolder.removeObject()...

    JavaObservableProxy::~JavaObservableProxy()
    {
        observedOne_->deleteObserver(this);
        execute_java_static_method(
            0, javaObjectHolder(), "removeObject", 
"(I)V", javaObjectId_);
        javaObjectId_ = -1;
    }

Finally, we have the infrastructure we need. All that's left is to implement C++::JavaObservableProxy::update()...

    void JavaObservableProxy::update()
    {
        HObject* javaObj = (HObject*)
            execute_java_static_method(
                0, javaObjectHolder(), "getObject", 
"(I)Ljava/lang/Object;",
                javaObjectId_);
            // If an exception occurred, get back to the Java runtime, because
            // invoking another method would clear the exception flag.
        if (exceptionOccurred(EE()))
            return;
        execute_java_dynamic_method(0, javaObj, "setChanged", 
"()V");
        if (exceptionOccurred(EE()))
            return;
        execute_java_dynamic_method(0, javaObj, "notifyObservers", 
"()V");
    }

This method gets a handle to the Java proxy (by calling JavaObjectHolder.getObject()), then executes setChanged() and notifyObservers() on the Java object.

Garbage collection revisited

In this discussion, we've created a number of instances, but we haven't said anything about how to clean them up. Remember, this is C++, and we need to manually dispose of the objects we create!

An obvious place to dispose of the C++ objects we create would be within the finalize() method of NumberListProxy. Unfortunately, this won't work because there is a circular reference: NumberListProxy maintains a reference to C++::JavaObservableProxy, and C++::JavaObservableProxy maintains a reference to NumberListProxy (by going through the static Vector inside JavaObjectHolder). There is no way for Java's garbage collector to detect this circular reference, so NumberListProxy instance will never be collected.

To get around this, we must resort to manual memory management. We add a method called "detach()" to NumberListProxy. When the Java side is done with a NumberListProxy instance, it must call NumberListProxy.detach(). This method can free all of the C++ instances that are created. (Some languages support a concept called "weak references" that can solve problems of this nature in an automatic fashion. Weak references are not a part of Java, and a discussion of them would be beyond the scope of this article.)

Putting it all together

To demonstrate the system we've just developed, I created a simple application to exercise it. This application creates a number list, establishes an observer, and adds a few numbers to the list. Whenever a number is added, the observer is notified, and it prints a message to stdout. The Java observer is quite simple:

import java.util.*; class NumberListObserver implements Observer { NumberListObserver(NumberListProxy subject) { subject_ = subject; subject.addObserver(this); } /** * Called when the subject changed * @param o not used * @param arg not used **/ public void update(Observable o, Object arg) { synchronized (subject_) { // Don't want size() to change under us! int sz = subject_.size(); System.out.print(" The list now has: "); for (int i = 0; i < sz; i++) { if (i > 0) System.out.print(", "); System.out.print(subject_.getNumber(i)); } } System.out.println(""); } private NumberListProxy subject_; // Thing being observed }

The main program looks like this:

    import java.util.*;
    class TestNumberList {
        public static void main(String args[]) {
            NumberListProxy model = new NumberListProxy();
            NumberListObserver obs = new NumberListObserver(model);
            System.out.println("Adding 3 to the list...");
            model.addNumber(3);
            System.out.println("Adding 42 to the list...");
            model.addNumber(42);
            System.out.println("Adding 666 to the list...");
            model.addNumber(666);
            System.out.println("Adding 7 to the list...");
            model.addNumber(7);
            model.deleteObserver(obs);
            model.detach();
        }
    }

Running the program yields this output:

    billf@pluto:~/javaC++Article/src$ java TestNumberList
    Adding 3 to the list...
        The list now has:  3
    Adding 42 to the list...
        The list now has:  3, 42
    Adding 666 to the list...
        The list now has:  3, 42, 666
    Adding 7 to the list...
        The list now has:  3, 42, 666, 7

JDK 1.1 and beyond

In their recent announcement, JavaSoft informed us that JDK 1.1 will have a "new Java native method interface" (see Resources). Hopefully, this new interface will provide a mechanism for getting a global reference to a Java object.

Netscape has comprehensive documentation on their JRI native method interface (see Resources). JRI ships with Netscape 3.0, and it provides everything we need in the way of registering global references. Unfortunately, it's only implemented by Netscape. Hopefully, JavaSoft will implement something similar, if not exactly the same. (Dare I hope that the JDK 1.1 will have something along the lines of Netscape's JRI?)

ILOG has announced a project called TwinPeaks (being developed with JavaSoft -- see Resources), which promises to "deliver Internet-ready C++ business application components to developers and customers." I expect this means that it will automate the writing of some of the glue code that we wrote by hand in this article. It will probably include other useful tools -- perhaps a debugger that can step from Java into C++.

Getting the source code

If you'd like the complete source code for the program developed in this article (complete with Solaris Makefile), click for javaAndC++.tar.gz, or for javacpp.zip. If you'd just like to browse, here are the source files:

Conclusions

Integrating C++ classes into a Java application is fairly straightforward, although somewhat messy. We expect the standard native method APIs to improve in the near future. Once this happens, the general approach outlined in this article will continue to work, but the implementation will be easier (and performance will be better).

Accessing C++ from Java can be worth the inconvenience. It opens up large bodies of "legacy" C++ code to potential reuse within Java applications. It also lets us exploit the power of C++, where C++ has an advantage -- that is, in performance (at least for now), in directly accessing hardware, in doing other low-level activities, and so on. With this power comes danger, however. We lose the pointer safety that Java provides us, opening ourselves up to memory-corruption bugs. In other words, integrating C++ with Java can be a powerful technique but one to be used with care!

When not planning his next trip to France or raising beagles with foreign names, Bill Foote likes to program in Java, Eiffel, CLOS, Smalltalk and, yes, C++. Current projects include a strange brew of Java, C++, and ANTLR to be used in semiconductor test equipment.

Learn more about this topic

Join the discussion
Be the first to comment on this article. Our Commenting Policies
See more