Newsletter sign-up
View all newsletters

Enterprise Java Newsletter
Stay up to date on the latest tutorials and Java community news posted on JavaWorld

Sponsored Links

Optimize with a SATA RAID Storage Solution
Range of capacities as low as $1250 per TB. Ideal if you currently rely on servers/disks/JBODs

Lexical analysis and Java: Part 1

Learn how to convert human readable text into machine readable data using the StringTokenizer and StreamTokenizer classes

  • Print
  • Feedback

Page 5 of 5

  • Click on the custom syntax button. In the left-hand White Space Character box, type the letter e and then click Add Blank Chars. Now type in the phrase "this test is eerie," and click the Tokenize! button.

  • Click Reset Syntax! and then type in "this is /* a comment */ a test," and tokenize that. Now change the comment character to c and click the Tokenize! button. Now clear the comment character and click the box labelled "/* Comments," and tokenize it again.


Wrapping up

The two classes -- StringTokenizer and StreamTokenizer -- are very useful for parsing information that is in textual form. Once you get to know them they may become a standard part of your programming technique. Next month I will walk through the design of a simple application that uses a StreamTokenizer class in its operation.

About the author

Chuck McManis is currently the director of system software at FreeGate Corp. FreeGate is a venture-funded start-up that is exploring opportunities in the Internet marketplace. Before joining FreeGate, McManis was a member of the Java group. He joined the Java group just after the formation of FirstPerson Inc. and was a member of the portable OS group (the group responsible for the OS portion of Java). Later, when FirstPerson was dissolved, he stayed with the group through the development of the alpha and beta versions of the Java platform. He created the first "all Java" home page on the Internet when he did the programming for the Java version of the Sun home page in May 1995. He also developed a cryptographic library for Java and versions of the Java class loader that could screen classes based on digital signatures. Before joining FirstPerson, Chuck worked in the operating systems area of SunSoft developing networking applications, where he did the initial design of NIS+.
  • Print
  • Feedback

Resources
  • The two applets included above were designed an implemented using Visual Cafe PR2m which is available on Windows 95 for free at:
    http://cafe.symantec.com.
  • A good discussion of lexical analysis can be found in the book CompilersPrinciples, Techniques and Tools, by Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman, 1986, ISBN 0-201-10088-6:
    http://www.awl.com
  • The source for past Java In Depth columns can be found at:
    http://www.mcmanis.com/~cmcmanis/java/javaworld.