Page 2 of 7
And so there was a general consensus that we wanted to have garbage collection in there but we didn't want to live with the hiccups that you get when your average garbage collector kicks in. All garbage collectors have the effect that they tend to shut down the system for some interval of time. And some of them you can actually tune the interval.
So, like in HotSpot, there's a limited amount of control that's possible over how long each GC step takes. But if you're writing an application whose timeliness needs to be tighter than what something like the HotSpot VM can do, then you're in trouble.
One of the things that we decided fairly early on was that an easy out on this problem would be to, say, only use "realtime garbage collectors." That's garbage collectors that have essentially zero pause time because they either back out or are doing things truly in parallel.
A couple of people in the group did a literature survey and they concluded that real realtime garbage collectors essentially didn't exist. There were a lot of things which were claiming to be realtime garbage collectors that were close but not good enough. And the usual extent of closeness was that the people were saying, "Well, we're realtime because we never pause more than a millisecond." Or, "We're realtime because only one chance in a billion will we ever actually back off and do a full mark-and-sweep."
Unfortunately, both of those invalidate the GC algorithm in some applications. In the realtime world, people are worried about the worst case. They don't care at all about the average case or the best case, they care about the worst case. And if the worst case is that this algorithm will actually do a full mark-and-sweep -- shut the thing down for seconds -- then it doesn't work, even if it happens once in 10 or 15 days of running. If the thing that's running is the thing that's controlling the flutter of the wingtip of an F-16, after 10 days the airplane turns into dust. And that doesn't work.
For GC algorithms that have fairly small minimum times, often people will claim that they have a realtime algorithm, and then they'll say it's got a maximum pause time of a millisecond. And that works for some applications and doesn't for others. One way to characterize realtime algorithms is based on the maximum latency that they can tolerate. And some of them can live just fine with a millisecond; some need something much finer than that.
So what we ended up doing was defining a way to allow a special kind of thread to run even while the garbage collector is running. And if you know much about garbage collectors then you think about what is it that has to be true about a thread if it's going to run while the garbage collector is running. You end up with a fairly onerous set of restrictions. Namely that whatever the thread does, it does not allow it to touch an object in the garbage collector heap, and it's not allowed to move a pointer to one of these objects around.
And so we've defined two subclasses of thread, one called the realtime thread, which is a thread that has all kinds of extra scheduling parameters besides just the priority. And then there's another subclass of that called the no-heap realtime thread, which is one that also has the ability to run while the garbage collector is running. But those threads are not allowed to access heap memory. And the way that they are able to work is that there's a way to allocate what are called "immortal objects." An immortal object is one that is allocated -- it stays there forever and it never, ever goes away.