Developers can build intelligent robots with Java, as it provides APIs for programming systems that can see, hear, speak, move, and even learn, using neural networks, which are algorithms that mimic our brain. (Please see "Futurama: Using Java Technology to Build Robots That Can See, Hear, Speak, and Move," by Steve Meloan (Sun Developer Network, July 2003).)
This article shows how to develop a robot that can learn by using the backpropagation algorithm, a basic neural network, and implementing it on a Lego Roverbot. Using both the algorithm and Java, the Roverbot—a Lego robot vehicle—can learn some basic rules for moving forward, backward, left, and right.
In this article, we use the Lego Mindstorms Robotics Invention System 2.0 for building the Lego robot; leJOS 2.1.0, a little Java operating system for downloading and running Java programs inside the Roverbot; and J2SE for compiling the Java programs under leJOS.
The Lego Mindstorms Robotics Invention System (RIS) is a kit for building and programming Lego robots. It has 718 Lego bricks including two motors, two touch sensors, one light sensor, an infrared tower, and a robot brain called the RCX.
The RCX is a large brick that contains a microcontroller and an infrared port. You can attach the kit's two motors (as well as a third motor) and three sensors by snapping wire bricks on the RCX. The infrared port allows the RCX to communicate with your desktop computer through the infrared tower.
In this article, we use a Roverbot as it is constructed in the Lego Mindstorms Constructopedia, the guide for constructing robots. This Roverbot, as shown in Figure 1, has been configured to use all three sensors and two motors included in Lego Mindstorms RIS 2.0.
leJOS is a small Java-based operating system for the Lego Mindstorms RCX. Because the RCX contains just 32 KB of RAM, only a small subset of the JVM and APIs can be implemented on the RCX. leJOS includes just a few commonly used Java classes from
java.util, and thus fits well on the RCX.
You must load the RAM with the Lego firmware, or, in our case, with the leJOS firmware, and your programs. The firmware contains a bytecode interpreter, which can run programs downloaded from RCX code.
For setting up your leJOS installation, please take a look at Jonathan Knudsen's article "Imaginations Run Wild with Java Lego Robots," (JavaWorld, February 2001), Programming Lego Mindstorms with Java (Syngress Publishing, 2002), or the leJOS readme file contained in the leJOS zip file, which you can download from the leJOS homepage.
If we want to build intelligent machines, we should model the human brain. Early in the 1940s, the neurophysiologist Warren McCulloch and the mathematician Walter Pitts began working on the idea of building an intelligent machine out of artificial neurons. One of the earliest neural network models was the perceptron, an invention of F. Rosenblat in 1962. A perceptron can learn; it models a neuron by taking a weighted sum of its inputs and sending an output of 1 if the sum is greater than some adjustable threshold value, otherwise it sends 0. If a perceptron can compute, it can learn to compute. Figure 2 shows a neuron and Figure 3 shows a perceptron.
The inputs (x1, x2, x3, ..., xn) and connection weights (w1, w2, w3, ..., wn) in the figure are typically real values. If the presence of some feature xi tends to cause the perceptron to fire, the weight wi will be positive; if the feature xi inhibits the perceptron, the weight wi will be negative. As Elaine Rich and Kevin Knight note in their book Artificial Intelligence (McGraw-Hill, 1990), "the perceptron itself consists of the weights, the summation processor, and the adjustable threshold processor. Learning is a process of modifying the values of the weights and the threshold." The authors recommend implementing the threshold as another weight w0 because this weight can be thought of as the propensity of the perceptron to fire irrespective of its input.
A backpropagation network is a fully connected, layered, and feed-forward neural network (see Figure 4). Network activation flows in one direction only: from the input layer to the output layer, passing through the hidden layer. Each unit in a layer is connected in the forward direction to every unit in the next layer. Weights between units encode the network's knowledge.
A backpropagation network usually starts with a random set of connection weights. The network adjusts its weights based on some learning rules each time it sees a pair of input-output vectors. Each pair of vectors goes through two stages of activation: a forward pass and a backward pass. The forward pass involves presenting a sample input to the network and letting activations flow until they reach the output layer. During the backward pass, the network's actual output (from the forward pass) is compared with the target output, and errors are computed for the output units. The weights connected to the output units can be adjusted to reduce those errors. The error estimates of the output units are then used to derive error estimates for the units in the hidden layers. Finally, errors are propagated back to the connections stemming from the input units.
After each round of forward-backward passes, the system "learns" incrementally from the input-output pair and reduces the difference (error) between the network's predicted output and the actual output. After extensive training, the network will eventually establish the input-output relationships through the adjusted weights on the network.
The backpropagation algorithm
Given a set of input-output vector pairs, you can compute a set of weights for a neural network that maps inputs onto corresponding outputs.
Let A be the number of units in the input layer, as determined by the length of the training input vectors. Let C be the number of units in the output layer. Now choose B, the number of units in the hidden layer. As shown in Figure 4, the input and hidden layers each have an extra unit used for thresholding; therefore, the units in these layers will sometimes be indexed by the ranges (0,...,A) and (0,..., B). We denote the activation levels of the units in the input layer by xj, in the hidden layer by hj, and in the output layer by oj. Weights connecting the input layer to the hidden layer are denoted by w1ij, where the subscript i indexes the input units and j indexes the hidden units. Likewise, weights connecting the hidden layer to the output layer are denoted by w2ij, with i indexing hidden units and j indexing output units.
The backpropagation algorithm has the following steps:
Initialize the network weights. Initially, all connection weights are set randomly to numbers between -0.1 and 0.1:
w1ij = random( -0.1, 0.1 ) for all i = 0, ..., A, j = 1, ..., B
w2ij = random( -0.1, 0.1 ) for all i = 0, ..., B, j = 1, ..., C
Initialize the activations of the threshold units. For each layer, its threshold unit is set to 1 and should never change:
x0 = 1.0
- Choose an input-output pair. Suppose the input vector is xi and the target output vector is yi. Assign activation levels to the input units.
Propagate the activations from the units in the input layer to the units in the hidden layer using the activation function:
Note that i ranges from 0 to A. w1oj is the thresholding weight for hidden unit j. x0 is always 1.0.
Propagate the activations from the units in the hidden layer to the units in the output layer:
Again, the thresholding w2oj for output units j plays a role in the weighted summation. h0 is always 1.0.
Compute the errors of the units in the output layer, denoted δ2j. Errors are based on the network's actual output (oj) and the target output (yj):
δ2j = oj (1 - oj) (yj - oj) for all j = 1, ..., C
Compute the errors of the units in the hidden layer, denoted δ1j:
Adjust the weights between the hidden layer and output layer. The learning rate is denoted η; its function is the same as in perceptron learning. A reasonable value of η is 0.35:
Δw2ij = η δ2j hi for all i = 0, ..., B, j = 1, ..., C
Adjust the weights between the input layer and the hidden layer:
Δw1ij = η δ1j xi for all i = 0, ..., A, j = 1, ..., B
- Go to Step 4 and repeat. When all the input-output pairs have been presented to the network, one epoch has been completed. Repeat Steps 4 to 10 for as many epochs as desired.
The activation function has a sigmoid shape. Since infinite weights would be required for the actual outputs of the network to reach 0.0 and 1.0, binary target outputs (the yj's of Steps 4 and 7 above) are usually given as 0.1 and 0.9 instead. The sigmoid is required by backpropagation because the derivation of the weight update rule requires that the activation function be continuous and differentiable.
The Lego Mindstorms backpropagation network
We want to model a backpropagation network for our Roverbot (Figure 5). The robot has three inputs (two touch sensors and one light sensor) and two outputs (the two motors). So, we can use a three-layer backpropagation network as shown in Figure 6, where we change the unit index to begin with 0; recall the use of index in Java arrays.
Note: The use of a three-layered network and three units in the hidden layer is just an arbitrary decision influenced by teaching purposes.
To define input-output vector pairs for use in the backpropagation network, from the robot input-output (sensor-motor), we must identify what the robot is going to learn. We define four basic behavior rules:
- Moving forward: If Sensor 1 is off, and Sensor 2 is over a white floor, and Sensor 3 is off, then Motor A and Motor C go forward (Roverbot goes forward)
- Moving right: If Sensor 1 is on, then Motor A goes forward, and Motor C goes backward (Roverbot turns right)
- Moving left: If Sensor 3 is on, then Motor A goes backward, and Motor C goes forward (Roverbot turns left)
- Moving backward: If Sensor 2 is over a black floor, then Motor A and Motor C go backward (Roverbot goes backward)
We translate these rules to training examples for the backpropagation network as shown in Figures 7 and 8, where S1 = Sensor 1, M-A = Motor A, and so on.
The input-output vector pairs are the examples we use to train the backpropagation network. So, based on its sensor states, our robot will learn to move forward, right, left, and backward. But what would happen if both touch sensors were on? The robot would not learn that case (rule or example), but the backpropagation network would give it an emergent behavior.
What emergent behavior? Will the robot go forward, backward, left, or right? You will get the answer by compiling the program, downloading it to the RCX, and pressing the Run button to run the program and see the robot behavior.
The Java classes
To implement the backpropagation algorithm, divide the code into two Java classes: