Validation with Java and XML Schema, Part 4

Build Java representations of schema constraints and apply them to Java data

Validation. It still makes my stomach queasy to say the word; I often have nightmares of sitting at my desk, drooling on my keyboard as I type in hundreds and thousands of lines of tedious JavaScript to validate my HTML forms. My memory then moves on to more boredom, trying to put up with the lines upon lines of servlet code that handled specific constraints and often completely obscured the true purpose of my servlet code. And unless you're new to Java, you certainly share my pain.

This series of articles, and the code included, seeks to at least limit, if not completely stop, the drool and boredom, the hours spent looking out a window (or a cubicle wall!) wishing that you were outside playing hockey or listening to the Dave Matthews Band in concert. I'm thrilled at the response so far, and am glad you've come along for the ride. Before diving into this final article in the series, let me give the newcomers a little history.

The first trilogy

Well, I'm happy to see you've made it this far. As an author, I feel a certain wariness about taking a series beyond three parts. It's the "trilogy" fear, I suppose; other than Star Wars, there aren't many trilogies that have successful fourth outings. I mean, did any of you really anxiously await Friday the 13th, Part IV? In any case, I am glad that you are still reading.

In Part 1, I talked at a conceptual level about problems common to applications with user input and raised the issue of validation. I discussed using Java properties files as well as hard-coded solutions and the various shortcomings of those approaches. Finally, I suggested that XML Schema might provide a means to represent data constraints easily and completely and previewed the framework that is the focus of this series. In Part 2, I demonstrated how you could use some simple Java classes to represent data constraints defined in an XML Schema. This laid the groundwork for building a set of these Constraint instances and then comparing Java data (Strings, ints, Dates, etc.) to these data constraints. In Part 3, I rounded out the framework, providing the SchemaParser class to parse the XML Schema and build up these constraints. And finally, I showed you the Validator class, which would take data and a constraint name and return whether or not the data was valid with regard to the supplied named constraint. For those of you who haven't read these parts, I strongly recommend that you read the first trilogy of articles before moving on.

So far, we have a pretty good set of classes to help us out in validating data in applications, and particularly in servlets. But there are some things missing in the framework; while validity of data is determined, there is no means to report what problems occur when data is invalid. This will be the focus of this article. Additionally, I've yet to "put it all together" and give you a solid concrete example. I'll also do that in this article and hopefully let you better see validation in action.

Improving error reporting

The framework's biggest problem as it stands so far is the lack of error reporting. If you recall from Part 3, the Validator class has methods to take in a piece of data, as well as a constraint name, and indicate whether that data is valid for the supplied constraint. The method used, isValid, returns a boolean value: true if the data is valid, false if it is not. Simple enough, right? Well, as I mentioned in Part 3, this gives you an answer, but not a complete one. For example, there is no means to report back to the user what problems occurred. Of course, the user needs this exact information to correct problems and resubmit her data. Imagine a complicated form with 20 or 25 fields; the user fills out this form (which could take a while), submits it, and makes a mistake. The user doesn't know what mistake, and the application simply replies, "Validation error. Try again." This hardly makes the user happy; in fact, the user may just close her browser or go elsewhere for her needs. This is a recipe for failure in any line of business! We clearly need to make some improvements in this area. There are two basic approaches that I suggest: the exception approach, where exceptions indicate validation problems, and the message approach, where the validation result is a Java String, which may have a message indicating what validation errors occurred.

The exception approach

The most obvious solution to this problem is to use Java's exception-handling mechanism. In this scenario, the isValid() method would throw an exception when validation problems occur. To paint this more concretely, look at a new class, InvalidDataException, that could be thrown when data is invalid:

package org.enhydra.validation;
/**
 *
 *   The InvalidDataException represents a problem occurring
 *     from data being invalid with regard to a specific data constraint.
 *
 */
public class InvalidDataException extends Exception {
    /**
     *
     *  This will throw an exception that simply states that
     *    some sort of incorrect/invalid data was supplied.
     *
     */
    public InvalidDataException() {
        super("Invalid Data Supplied.");
    }
    /**
     *
     *  This will throw an exception that simply states that
     *    some sort of incorrect/invalid data was supplied, along with a
     *    message indicating the problem.
     *
     *
     * @param message String message indicating what data failed validation.
     */
    public InvalidDataException(String message) {
        super("Invalid Data Supplied: " + message);
    }
}

This exception could be thrown, along with an informative error message, by the isValid() method. Here's the modified method to throw this exception when a problem occurs:

    /**
     *
     *  This will validate a data value (in String format) against a
     *    specific constraint and throw an exception if there is a problem.
     *
     *
     * @param constraintName the identifier in the constraints to validate
     *        this data against.
     * @param data String data to validate.
     */
    public void checkValidity(String constraintName, String data)
        throws InvalidDataException {
        // Validate against the correct constraint
        Object o = constraints.get(constraintName);
        // If no constraint, then everything is valid
        if (o == null) {
            return;
        }
        Constraint constraint = (Constraint)o;
        // Validate data type
        if (!correctDataType(data, constraint.getDataType())) {
            throw new InvalidDataException("Incorrect data type");
        }
        // Validate against allowed values
        if (constraint.hasAllowedValues()) {
            List allowedValues = constraint.getAllowedValues();
            if (!allowedValues.contains(data)) {
                throw new InvalidDataException("Disallowed value");
            }
        }
        // Validate against range specifications
        try {
            double doubleValue = Double.parseDouble(data);
            if (constraint.hasMinExclusive()) {
                if (doubleValue <= constraint.getMinExclusive()) {
                    throw new InvalidDataException("Value is not large enough");
                }
            }
            if (constraint.hasMinInclusive()) {
                if (doubleValue < constraint.getMinInclusive()) {
                    throw new InvalidDataException("Value is not large enough");
                }
            }
            if (constraint.hasMaxExclusive()) {
                if (doubleValue >= constraint.getMaxExclusive()) {
                    throw new InvalidDataException("Value is not small enough");
                }
            }
            if (constraint.hasMaxInclusive()) {
                if (doubleValue > constraint.getMaxInclusive()) {
                    throw new InvalidDataException("Value is not small enough");
                }
            }
        } catch (NumberFormatException e) {
            // If it couldn't be converted to a number, the data type isn't
            //   numeric anyway, as it would have already failed,
            //   so this can be ignored.
        }
        // If we got here, all tests were passed
        // No return value needed
    }

Notice that I also changed the method name to checkValidity() since it no longer returns a boolean value (the isXXX() style method names are generally reserved for methods that return a true/false result). Instead of returning false when problems occur, an InvalidDataException is thrown to the calling program, with an error message indicating the problem. In the sample, I've been fairly terse, but you could make this error message more descriptive, adding the value being tested, the value or values to which it must conform, and other more extensive error details.
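
As a rough sketch of what a more descriptive message might look like, the helper below builds the text from the constraint name, the offending value, and the values it must conform to. The DescriptiveMessages class and both of its methods are hypothetical names used only for illustration; they are not part of the framework, and the exact wording of the messages is an assumption.

```java
import java.util.List;

// Hypothetical helper, not part of the article's framework: it only builds
// message text that checkValidity() could pass to InvalidDataException.
class DescriptiveMessages {

    // Message for a value that is not in the constraint's allowed set.
    static String disallowedValue(String constraintName, String data,
                                  List<String> allowedValues) {
        return "The " + constraintName + " entered, " + data
            + ", is not in the allowed set of values: "
            + String.join(", ", allowedValues) + ".";
    }

    // Message for a value that falls below the constraint's minimum.
    // The minimum is passed as a String so it is printed exactly as given.
    static String belowMinimum(String constraintName, String data,
                               String minValue) {
        return "The " + constraintName + " entered, " + data
            + ", is lower than the minimum allowed value, " + minValue + ".";
    }
}
```

Inside checkValidity(), a call such as throw new InvalidDataException(DescriptiveMessages.disallowedValue(constraintName, data, allowedValues)) could then replace the terse "Disallowed value" text, assuming the allowed values are available as a list of Strings.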

In fact, you could extend this even further by creating additional exceptions, all extending InvalidDataException. For example, consider this exception hierarchy:

  • InvalidDataException
    • IncorrectDataTypeException
    • DisallowedValueException
    • ValueTooLowException
    • ValueTooHighException
    • etc...

Each of these could have convenience constructors. For example:

public class ValueTooLowException extends InvalidDataException {
    public ValueTooLowException(String value, String minValue) {
        super("The value supplied, " + value + ", was lower than " +
              "the minimum required value, " + minValue);
    }
}
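
To illustrate why such a hierarchy can be useful to callers, here is a small self-contained sketch. It nests simplified copies of InvalidDataException and ValueTooLowException inside a demo class so that it compiles on its own; the HierarchyDemo class, the checkAtLeastOne() method, and its fixed minimum of 1 are assumptions made for this example, not part of the framework.

```java
// Self-contained sketch; the nested exceptions are simplified copies of the
// article's classes, and everything else here is hypothetical.
class HierarchyDemo {

    static class InvalidDataException extends Exception {
        InvalidDataException(String message) {
            super("Invalid Data Supplied: " + message);
        }
    }

    static class ValueTooLowException extends InvalidDataException {
        ValueTooLowException(String value, String minValue) {
            super("The value supplied, " + value + ", was lower than "
                  + "the minimum required value, " + minValue);
        }
    }

    // Hypothetical stand-in for a validator check with a minimum of 1.
    static void checkAtLeastOne(String data) throws InvalidDataException {
        double value;
        try {
            value = Double.parseDouble(data);
        } catch (NumberFormatException e) {
            throw new InvalidDataException("Incorrect data type");
        }
        if (value < 1) {
            throw new ValueTooLowException(data, "1");
        }
    }

    // The more specific catch block must come first; the general one
    // handles every other kind of InvalidDataException.
    static String classify(String data) {
        try {
            checkAtLeastOne(data);
            return "valid";
        } catch (ValueTooLowException e) {
            return "too low";
        } catch (InvalidDataException e) {
            return "invalid";
        }
    }
}
```

A caller that only cares whether the data is valid can still catch InvalidDataException alone, while a caller that wants to react differently to a too-low value can catch the subclass first.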

With this approach, you can easily report problems through the various exceptions and provide callers with detailed information about what problems occurred. In fact, this is the approach that most of you suggested to me and supported. However, the approach causes a subtle problem. Consider the client code that would use the Validator class:

    Validator validator = Validator.getInstance(schemaURL);
    // Validate the various pieces of data that we got earlier
    try {
        validator.checkValidity("shoeSize", shoeSize);
        validator.checkValidity("width", width);
        validator.checkValidity("brand", brand);
        validator.checkValidity("numEyelets", numEyelets);
    } catch (InvalidDataException e) {
        errorMessage = e.getMessage();
    }
    // Report back to the client the errorMessage value

This looks pretty good, right? Well, only the first validation problem will be reported. For example, assume that the user enters a shoe size of "11" (which is legal), a width of "F" (which is not legal), a brand of "V-Form" (which is legal), and for the number of eyelets, the user accidentally enters "@2" instead of "22" (which is not legal, of course). The checkValidity() method returns normally from the call on the shoe size, and then an exception results from the call on the width (because "F" is not legal). The program flow moves to the exception block, and the only message reported is "The width entered, F, is not in the allowed set of values: A, B, C, D, or DD. Please enter one of these values." However, the user is not informed that her entry for the number of eyelets is invalid. So she corrects her error, resubmits the form, and gets another error on the number of eyelets. This is typically frustrating for the user and causes confusion. Why couldn't both errors be reported at the same time?
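
The behavior described above can be reproduced with a small self-contained sketch. The FirstErrorOnlyDemo class, its simplified checkValidity() stand-in, and the hard-coded width and eyelet rules are assumptions made so the example runs without the real Validator and schema; the messages are kept terse on purpose.

```java
import java.util.Arrays;
import java.util.List;

// Self-contained sketch; nothing here is the framework's real Validator.
class FirstErrorOnlyDemo {

    static class InvalidDataException extends Exception {
        InvalidDataException(String message) {
            super("Invalid Data Supplied: " + message);
        }
    }

    // Hypothetical allowed widths, taken from the example in the text.
    static final List<String> WIDTHS = Arrays.asList("A", "B", "C", "D", "DD");

    // Hypothetical stand-in for Validator.checkValidity(): it only knows
    // about the "width" and "numEyelets" constraints.
    static void checkValidity(String constraintName, String data)
        throws InvalidDataException {
        if (constraintName.equals("width") && !WIDTHS.contains(data)) {
            throw new InvalidDataException("Disallowed value for width: " + data);
        }
        if (constraintName.equals("numEyelets")) {
            try {
                Integer.parseInt(data);
            } catch (NumberFormatException e) {
                throw new InvalidDataException(
                    "Incorrect data type for numEyelets: " + data);
            }
        }
    }

    // Mirrors the client code with a single try block. Returns null when
    // every field is valid, otherwise the first error message only.
    static String validateForm(String shoeSize, String width,
                               String brand, String numEyelets) {
        try {
            checkValidity("shoeSize", shoeSize);
            checkValidity("width", width);
            checkValidity("brand", brand);
            checkValidity("numEyelets", numEyelets);
            return null;
        } catch (InvalidDataException e) {
            // Control jumps here at the first failure; later checks never run.
            return e.getMessage();
        }
    }
}
```

Calling validateForm("11", "F", "V-Form", "@2") reports only the width problem. The eyelet problem surfaces only after the width is corrected and the form is validated again.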

To prevent this problem, the single try block in the client code has to be split into one try block per field:

    Validator validator = Validator.getInstance(schemaURL);
    StringBuffer errorBuffer = new StringBuffer();
    // Validate the various pieces of data that we got earlier
    try {
        validator.checkValidity("shoeSize", shoeSize);
    } catch (InvalidDataException e) {
        errorBuffer.append(e.getMessage());
    }
    try {
        validator.checkValidity("width", width);
    } catch (InvalidDataException e) {
        errorBuffer.append(e.getMessage());
    }
    try {
        validator.checkValidity("brand", brand);
    } catch (InvalidDataException e) {
        errorBuffer.append(e.getMessage());
    }
    try {
        validator.checkValidity("numEyelets", numEyelets);
    } catch (InvalidDataException e) {
        errorBuffer.append(e.getMessage());
    }
    // Convert buffer to error message
    String errorMessage = errorBuffer.toString();
    // Report back to the client the errorMessage value

Suddenly, the code's elegance in the first snippet is completely lost! We're back to the terrible spaghetti code of Part 1, which we were trying to avoid -- all because we want to get all error messages, not just the first one that occurs. So as you can see, using exceptions doesn't work quite as well as it might seem at first. For that reason, it makes sense to move on to a better approach, where you can achieve clean coding with the same result.
