Let's take a look at the real world for a moment. Assume that the *Z* axis is pointing straight out ahead of you. Step forward one meter along this *Z* axis and see what you see. Step back to where you were. I now want you to move the entire world minus one meter along the *Z* axis. You should see exactly the same thing as before. You can, in fact, repeat this for any movement that you make in your viewing position: a world, as seen from location *p*, looks exactly the same as a world translated by *-p* and viewed from location *(0,0,0)*.

Exactly the same applies to your viewing orientation: a world, as seen from a head rotation of *r*, looks exactly the same as a world rotated by *-r* and viewed straight ahead. And if we combine translation and rotation, we find that a world viewed from location *p* at rotation *r* looks exactly the same as a world translated by *-p*, rotated by *-r*, and viewed straight ahead from the origin.

You might wonder why I'm having you act all this out. Well, it turns out that it's much easier for us to draw a scene onto the screen if we are viewing it straight ahead from the origin than if we are positioned arbitrarily. So, to allow us to view our terrain from any position, we will first perform this inverse transformation of the world, and then perform the simplified world-to-screen transformation.

Let's define a few graphics-related buzzwords:

*Worldspace* is the world in its original form -- it is defined with respect to an arbitrary coordinate system that we chose when we defined the terrain.

*Eyespace* is the world as seen from your current viewing position -- after performing the aforementioned inverse transformation. If you take the viewer's location and orientation in worldspace, and then transform the world by the inverse of these values, you have the world in eyespace. The origin is at the eye, the *Z* axis sticks straight out from the eye, *Y* up, *X* to the right.

*Screenspace* is the eyespace world transformed into *(x, y)* pixel locations on the screen.

**Viewer representation**

To perform this worldspace-to-eyespace transformation, we need a way to represent the viewing location and orientation. Representing the location is easy: *p _{you} = (x_{you}, y_{you}, z_{you})*. Representing the viewing orientation is also fairly easy; the only complexity is choosing which one of several possible representations to use.

Unfortunately, we can't simply use a vector to represent the direction in which you are looking, because this does not include information about your current *up* direction.

One option is to use three angles *(roll, pitch, yaw)* to represent the angles of rotation in three dimensions needed to transform from looking straight ahead to your current viewing direction. This is, however, inefficient and awkward to manipulate.

Another option is to use a 3x3 rotation matrix *M*. This matrix essentially consists of the unit vector facing along the direction that you are looking, the unit vector facing directly up, and the unit vector facing to your right. This representation is fairly common, as well as easy to manipulate and understand. It is, however, slow for some operations and subject to numerical error.
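To make the matrix option concrete, here is a minimal sketch (the class and method names are my own, not from this article): the three rows are the right, up, and forward unit vectors, and rotating a point is just three dot products.

```java
// Sketch: a 3x3 view-rotation matrix built from the viewer's basis
// vectors. Row layout (right / up / forward) is one common convention.
public class RotationMatrix {
    private final double[][] m;

    // right, up, forward must be mutually perpendicular unit vectors
    public RotationMatrix (double[] right, double[] up, double[] forward) {
        m = new double[][] { right, up, forward };
    }

    // Rotate a point: each result component is a dot product with one row
    public double[] rotate (double[] p) {
        double[] r = new double[3];
        for (int i = 0; i < 3; ++ i)
            r[i] = m[i][0] * p[0] + m[i][1] * p[1] + m[i][2] * p[2];
        return r;
    }
}
```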

We will instead go with a third option: a rotational *quaternion* -- a four-element data structure *(w, x, y, z)*. The *(x, y, z)* part represents a vector in space, and the *w* part represents a rotation about this vector. *Any* viewing direction can be represented by this compact, efficient structure.

**Quaternions for rotation**

Quaternions can be used for many purposes. However, we will consider them strictly for their application to three-dimensional rotations. Let's start with some general quaternion mathematics.

*q = (w, x, y, z)*

|*q*|^{2} = w^{2} + x^{2} + y^{2} + z^{2}

*q ^{-1} = (w /* |*q*|^{2}*, -x /* |*q*|^{2}*, -y /* |*q*|^{2}*, -z /* |*q*|^{2}*)*

*q _{a}q_{b} = (w_{a}w_{b} - x_{a}x_{b} - y_{a}y_{b} - z_{a}z_{b}, w_{a}x_{b} + x_{a}w_{b} + y_{a}z_{b} - z_{a}y_{b}, w_{a}y_{b} + y_{a}w_{b} + z_{a}x_{b} - x_{a}z_{b}, w_{a}z_{b} + z_{a}w_{b} + x_{a}y_{b} - y_{a}x_{b})*

Okay, we have some quaternion mathematics, but how the heck do we rotate a point using quaternions? Say we have a point *p* and a rotational quaternion *q _{r}*. To rotate the point, we simply convert it to a quaternion with a zero *w* coefficient, perform some multiplications, and convert it back:

*p = (x, y, z)*

*q _{p} = (0.0, x, y, z)*

*q' _{p} = q_{r}q_{p}q_{r}^{-1} = (w', x', y', z')*

*p' = (x', y', z')*

We have taken our rotational quaternion, multiplied it by the point to be transformed, and multiplied this by the inverse of the rotational quaternion. The result is our rotated point.
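As a concrete check (my own numeric example, not from the article): rotating the point *(1, 0, 0)* by 90 degrees about the *Z* axis should yield *(0, 1, 0)*. The sketch below applies the multiplication formula given above, using the fact that a unit quaternion's inverse is simply its conjugate.

```java
public class RotateDemo {
    // Quaternion product (w, x, y, z), per the component formula above
    static double[] mul (double[] a, double[] b) {
        return new double[] {
            a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3],
            a[0] * b[1] + a[1] * b[0] + a[2] * b[3] - a[3] * b[2],
            a[0] * b[2] + a[2] * b[0] + a[3] * b[1] - a[1] * b[3],
            a[0] * b[3] + a[3] * b[0] + a[1] * b[2] - a[2] * b[1]
        };
    }

    // Rotate point p by unit quaternion q via q p q^-1
    static double[] rotate (double[] q, double[] p) {
        double[] qi = { q[0], -q[1], -q[2], -q[3] }; // unit: inverse = conjugate
        double[] r = mul (mul (q, new double[] { 0, p[0], p[1], p[2] }), qi);
        return new double[] { r[1], r[2], r[3] };
    }

    public static void main (String[] args) {
        double h = Math.PI / 4; // half of the 90-degree rotation angle
        double[] q = { Math.cos (h), 0, 0, Math.sin (h) }; // about the Z axis
        double[] p = rotate (q, new double[] { 1, 0, 0 });
        System.out.printf ("(%.3f, %.3f, %.3f)%n", p[0], p[1], p[2]);
    }
}
```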

If speed is of prime importance, there is an alternative: convert from a rotational quaternion to a rotational matrix, and then rotate a series of points with this matrix. But let's not get too sidetracked!
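The article doesn't give that conversion, but for reference, the standard formula for a *unit* quaternion *(w, x, y, z)* is sketched below; once built, rotating each point costs nine multiplies instead of two full quaternion products.

```java
public class QuatToMatrix {
    // Standard conversion of a *unit* quaternion (w, x, y, z)
    // to an equivalent 3x3 rotation matrix
    static double[][] toMatrix (double w, double x, double y, double z) {
        return new double[][] {
            { 1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)     },
            { 2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)     },
            { 2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y) }
        };
    }
}
```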

I've already mentioned that a rotational quaternion is a rotation about a particular vector, or axis. So how do we calculate the quaternion for a particular rotation? Well, if we have a unit-vector *v*, and we want a quaternion to represent a rotation of *w* about this, we compute the quaternion as follows:

*Vector: v = (x, y, z);* |*v*| *= 1.0;*

*Rotation: w*

*Rotational quaternion: (cos(w / 2), x . sin(w / 2), y . sin(w / 2), z . sin(w / 2))*

Next, what if we have one rotation that we wish to concatenate with another? We could either explicitly perform the two rotations in sequence on our points, or we can simply multiply the two quaternions and use the result as the combined rotation operation. Remember, however, that *q _{a}q_{b} != q_{b}q_{a}*.

For example, start by looking straight forward. Then rotate your head down as far as it can go. Then, from this perspective, rotate your head right as far as it can go. You should end up looking to the right. Now, start again by looking straight forward. Rotate your head right as far as it can go. Then, from this perspective, rotate your head down as far as it can go. You should end up looking down. We did the same operations; look down and look right, just in different orders, and we ended up looking in completely different directions.
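The same fact shows up directly in the algebra. As a quick check (my own example, using the product formula from earlier): composing a 90-degree "down" rotation with a 90-degree "right" rotation gives different quaternions depending on the order.

```java
public class OrderMatters {
    // Quaternion product (w, x, y, z), per the component formula above
    static double[] mul (double[] a, double[] b) {
        return new double[] {
            a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3],
            a[0] * b[1] + a[1] * b[0] + a[2] * b[3] - a[3] * b[2],
            a[0] * b[2] + a[2] * b[0] + a[3] * b[1] - a[1] * b[3],
            a[0] * b[3] + a[3] * b[0] + a[1] * b[2] - a[2] * b[1]
        };
    }

    public static void main (String[] args) {
        double h = Math.PI / 4; // half of 90 degrees
        double[] down  = { Math.cos (h), Math.sin (h), 0, 0 }; // about X
        double[] right = { Math.cos (h), 0, Math.sin (h), 0 }; // about Y
        double[] ab = mul (down, right);
        double[] ba = mul (right, down);
        // The z components differ in sign: the combined rotations differ
        System.out.println (ab[3] + " vs " + ba[3]);
    }
}
```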

The following code implements the basic quaternion mathematics that we will need:

```java
public class Quaternion {
    private double w, x, y, z;
    private Quaternion inverse;

    public Quaternion (double w, double x, double y, double z) {
        this.w = w;
        this.x = x;
        this.y = y;
        this.z = z;
        this.inverse = null;
    }

    public Quaternion inverse () {
        double scale = 1.0 / (x * x + y * y + z * z + w * w);
        return new Quaternion (w * scale, - x * scale, - y * scale, - z * scale);
    }

    public Quaternion multiply (Quaternion q) {
        double qx = q.x, qy = q.y, qz = q.z, qw = q.w;
        double rw = w * qw - x * qx - y * qy - z * qz;
        double rx = w * qx + x * qw + y * qz - z * qy;
        double ry = w * qy + y * qw + z * qx - x * qz;
        double rz = w * qz + z * qw + x * qy - y * qx;
        return new Quaternion (rw, rx, ry, rz);
    }

    // Rotate a point: compute q * t * q^-1, returning just the vector part
    public Triple rotate (Triple t) {
        if (inverse == null)
            inverse = inverse ();
        double iw = inverse.w, ix = inverse.x, iy = inverse.y, iz = inverse.z;
        double tx = t.x, ty = t.y, tz = t.z;
        // a = this * (0, tx, ty, tz)
        double aw = - x * tx - y * ty - z * tz;
        double ax = w * tx + y * tz - z * ty;
        double ay = w * ty + z * tx - x * tz;
        double az = w * tz + x * ty - y * tx;
        // b = a * inverse; the w component of b is always zero
        double bx = aw * ix + ax * iw + ay * iz - az * iy;
        double by = aw * iy + ay * iw + az * ix - ax * iz;
        double bz = aw * iz + az * iw + ax * iy - ay * ix;
        return new Triple (bx, by, bz);
    }

    // Build the quaternion for a rotation of r radians about axis (x, y, z)
    public static Quaternion newRotation (double r, double x, double y, double z) {
        double len = Math.sqrt (x * x + y * y + z * z);
        double sin = Math.sin (r / 2.0);
        double cos = Math.cos (r / 2.0);
        double tmp = sin / len;
        return new Quaternion (cos, x * tmp, y * tmp, z * tmp);
    }
}
```

**Worldspace to eyespace**

We now have the basics of the mathematics that we need to make the transformation from worldspace to eyespace. Now, how about performing the computations themselves:

```java
// Transform every terrain vertex from worldspace to eyespace:
// translate by -loc, then apply the inverse viewing rotation
Triple loc = new Triple (0.5, 3.0, -2.0);
Quaternion rot = Quaternion.newRotation (-.82, 1.0, 0.0, 0.0);
Triple[][] eyeMap = new Triple[steps + 1][steps + 1];
for (int i = 0; i <= steps; ++ i) {
    for (int j = 0; j <= steps; ++ j) {
        Triple p = map[i][j];
        Triple t = p.subtract (loc);
        Triple r = rot.rotate (t);
        eyeMap[i][j] = r;
    }
}
```

**Eyespace to screenspace**

We have the world transformed so that we can view it looking straight ahead from the origin. Now we need to transform this new world onto our screen. Recall from the last article that we considered the screen to be a window in space; we fired rays from the eye onto the pixels in this window and onwards into the virtual scene. We determined what each ray intersected and this was how we computed pixel colors.

In this application we're performing the exact same calculation, but in reverse. Again, the screen is a window in space between the viewer and the scene. This time, however, we will draw a line between each vertex of our terrain and the viewer, and then determine where this line intersects the viewing window. By using this approach, we can determine the closest screen pixel that corresponds to each vertex point, and then use these pixel locations to draw the terrain triangles on screen.

This calculation is actually quite straightforward. We must first decide how far away our virtual window is from the origin. (We'll call this distance `hither`.) If any vertex falls on the near side of this plane, we won't transform it -- we're only drawing what we can see *through* the window. Otherwise, we take the vertex *x* and *y* coordinates, divide these by *z*, and then scale the result up to fit on the screen.

The amount we scale the result depends upon how wide a field of view (*fov*) we want. For a horizontal field of view of *fov*, we want a scale of *(width / 2) / tan(fov / 2)*. A commonly used field of view is about 60 degrees (stick your nose up to your monitor to get a somewhat geometrically accurate picture). The appropriate value depends on how much of your visual field is taken up by the window displaying this applet.

```java
double hither = .1;
double fov = Math.PI / 3;
XY[][] scrMap = new XY[steps + 1][steps + 1];
int width = getSize ().width, height = getSize ().height;
double scale = width / 2 / Math.tan (fov / 2);
for (int i = 0; i <= steps; ++ i) {
    for (int j = 0; j <= steps; ++ j) {
        Triple p = eyeMap[i][j];
        double x = p.x, y = p.y, z = p.z;
        if (z >= hither) {
            // perspective divide, then shift to screen center
            double tmp = scale / z;
            scrMap[i][j] = new XY (width / 2 + (int) (x * tmp),
                                   height / 2 - (int) (y * tmp));
        } else {
            // vertex is on the near side of the hither plane: not drawn
            scrMap[i][j] = null;
        }
    }
}
```

So we now have screenspace coordinates of each terrain vertex. This means that we have pixel locations for each triangle vertex of our scene. Now all we need do is draw these triangles!

## Rasterization

Rasterization means painting objects onto the screen. Some people call it rendering; others call it scan conversion. We'll stick with the big word for this section.

The simplest way for us to rasterize our terrain is to call `Graphics.fillPolygon()` for each triangle in the scene. This is not, however, particularly useful. We end up with flat-shaded triangles and, worse still, we end up with some triangles from the back of our terrain being drawn in front of triangles that should be at the front.

```java
public void paint (Graphics g) {
    for (int i = 0; i < numTriangles; ++ i) {
        XY xy0 = scrMap[triangles[i].i[0]][triangles[i].j[0]],
           xy1 = scrMap[triangles[i].i[1]][triangles[i].j[1]],
           xy2 = scrMap[triangles[i].i[2]][triangles[i].j[2]];
        // Backface cull: draw only triangles facing toward the viewer
        double dot = - map[triangles[i].i[0]][triangles[i].j[0]]
            .subtract (loc).normalize ().dot (triangles[i].n);
        if ((dot > 0.0) && (xy0 != null) && (xy1 != null) && (xy2 != null)) {
            int[] x = { xy0.x, xy1.x, xy2.x }, y = { xy0.y, xy1.y, xy2.y };
            g.setColor (triangles[i].color);
            g.fillPolygon (x, y, 3);
        }
    }
}
```

You might think that we could simply sort our triangles and draw the furthest away first. As the following diagram shows, however, this approach doesn't always work. Some situations just can't be resolved with a simple depth sort. The two depicted triangles share a common edge. In each configuration of the following diagram, the third point of the smaller triangle (in relation to the Z axis) is behind this common edge and in front of the third point of the larger triangle -- so a depth sort won't change their ordering. In the configuration on the left, however, the small triangle is in front of its partner and, thus, should be drawn last. In the configuration on the right, it is behind and, thus, should be drawn first. To resolve this type of situation we need a more comprehensive algorithm.

There are a variety of solutions to this problem. One solution is to build span lists describing the portion of each scan line (row of pixels) on the screen that is covered by each triangle, and resolve simple overlaps using this list (Romney, Watkins and Evans, 1969). Instead, however, we will go with the *z-buffer* solution.

When drawing our triangles into an image buffer, in addition to storing the red, green, and blue color coefficients of each pixel, we will also store (in a separate z-buffer) the pixel's depth in the scene. Before we draw any pixel to the screen, we first consult this z-buffer to see if we've already drawn a closer point. If we have, then we don't draw the pixel; otherwise, we store the color and depth information as usual. In this way, we resolve any triangle overlap to pixel precision.
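A minimal sketch of this per-pixel test (the class layout and method names here are my own, not from the article's renderer):

```java
public class ZBuffer {
    private final int width;
    private final int[] color;    // packed RGB per pixel
    private final double[] depth; // eyespace z per pixel

    public ZBuffer (int width, int height) {
        this.width = width;
        color = new int[width * height];
        depth = new double[width * height];
        // every pixel starts infinitely far away
        java.util.Arrays.fill (depth, Double.POSITIVE_INFINITY);
    }

    // Plot a pixel only if it is nearer than what is already there
    public void plot (int x, int y, double z, int rgb) {
        int i = y * width + x;
        if (z < depth[i]) {
            depth[i] = z;
            color[i] = rgb;
        }
    }

    public int colorAt (int x, int y) {
        return color[y * width + x];
    }
}
```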

**Rasterizing a triangle**