Javalution

Play around with Snobol and Infiqs

Page 2 of 3

Strings are stored in variables, which must exist before they are accessed. Each variable's name consists of alphanumeric characters and periods. Unlike many other languages, a variable name can begin with a digit; it can begin with a period as well. Because Snobol3 is a case-sensitive language, two variables with the same name but different cases (x versus X) are considered distinct.
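For readers who think in Java, the naming rule above can be approximated with a regular expression. This is only a sketch: the class and method names are hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits, which the interpreter may or may not enforce in exactly this way.

```java
import java.util.regex.Pattern;

final class VarNameSketch {
    // Assumption: a name is one or more ASCII letters, digits, or periods,
    // with no restriction on the first character.
    private static final Pattern NAME = Pattern.compile("[A-Za-z0-9.]+");

    static boolean isValidName(String name) {
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        // x and X are both valid, and they are different names.
        for (String n : new String[] {"x", "X", "2nd", ".tmp", "my-var"}) {
            System.out.println(n + " -> " + isValidName(n));
        }
    }
}
```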

Strings are assigned to variables via the varname = string syntax, where = is Snobol3's assignment operator. Concatenation has no operator symbol of its own: placing whitespace characters between two strings, each expressed as a literal or as a variable name, joins them into a single string:

 name = 'Jeff'
 Name = 'Java'
 SYSPOT = 'Hello, ' name '. How are you?'
 SYSPOT = 'Hello, ' Name '. How are you?'

Although Snobol3 has no numeric data types, this language specifies a few operators that perform arithmetic on literal strings and variables that contain integers. These operators include addition (+), subtraction (-), multiplication (*), division (/), and negation (-). Arithmetic operators have the usual precedence; parentheses can be used to change precedence:

 

 i = '100'
 SYSPOT = i + '8'
 SYSPOT = i - '8'
 SYSPOT = i * '8'
 SYSPOT = i / '8'
 SYSPOT = '-8'
 SYSPOT = (i + '3') * '2'
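One way to picture arithmetic in a language without numeric types is to parse each operand, compute, and convert the result back to a string. The Java sketch below does exactly that. The helper names are hypothetical, and the truncating integer division is an assumption borrowed from Java rather than something taken from the Snobol3 manual.

```java
final class StringArithmeticSketch {
    static String add(String a, String b) {
        return Long.toString(Long.parseLong(a) + Long.parseLong(b));
    }

    static String subtract(String a, String b) {
        return Long.toString(Long.parseLong(a) - Long.parseLong(b));
    }

    static String multiply(String a, String b) {
        return Long.toString(Long.parseLong(a) * Long.parseLong(b));
    }

    // Assumption: division truncates, as Java integer division does.
    static String divide(String a, String b) {
        return Long.toString(Long.parseLong(a) / Long.parseLong(b));
    }

    static String negate(String a) {
        return Long.toString(-Long.parseLong(a));
    }

    public static void main(String[] args) {
        String i = "100";
        System.out.println(add(i, "8"));
        System.out.println(subtract(i, "8"));
        System.out.println(multiply(i, "8"));
        System.out.println(divide(i, "8"));
        System.out.println(negate("8"));
        // Parentheses become nested calls.
        System.out.println(multiply(add(i, "3"), "2"));
    }
}
```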

A Snobol3 program consists of statements—one statement per line. Each statement is specified according to the [label] statement body/branch syntax. If the label field is present, it must begin in Column 1. If the branch field is present, it must begin with a forward slash. Even the statement body field is optional.

A label is a sequence of letters and digits that identifies the destination of a branch operation. By convention, I capitalize labels. A branch is a go-to that transfers execution to another part of a Snobol3 program. (Unfortunately, Snobol3 doesn't offer any other control-flow mechanisms.) There are three kinds of branches:

  • Branch on failure: The /F(label) or /f(label) syntax transfers execution to label if the statement body evaluates to failure.
  • Branch on success: The /S(label) or /s(label) syntax transfers execution to label if the statement body evaluates to success.
  • Unconditional branch: Occasionally, you'll want to transfer execution without regard for a statement body's success/failure. The /(label) syntax unconditionally transfers execution to label.

Success and failure are important to Snobol3's branch on failure and branch on success. They are signals that indicate whether a statement body succeeded or failed. For example, an attempt to read from the standard input device when there is no more input results in failure. It's helpful to think of success as a Boolean true value and failure as a Boolean false value, although they are never stored in variables.
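The three branch kinds can be modeled in a few lines of Java. In this sketch the statement body is a BooleanSupplier that reports success (true) or failure (false), and the method picks the label to execute next. All names are hypothetical, and this illustrates the rules described above rather than how Heimbigner's interpreter is written.

```java
import java.util.function.BooleanSupplier;

final class BranchSketch {
    /**
     * Runs the statement body and returns the label to jump to.
     * A null argument means that kind of branch is absent; a null
     * result means execution falls through to the next statement.
     */
    static String nextLabel(BooleanSupplier body, String onSuccess,
                            String onFailure, String unconditional) {
        boolean succeeded = body.getAsBoolean();
        if (unconditional != null) {
            return unconditional;          // the /(label) form
        }
        if (succeeded && onSuccess != null) {
            return onSuccess;              // the /S(label) form
        }
        if (!succeeded && onFailure != null) {
            return onFailure;              // the /F(label) form
        }
        return null;
    }

    public static void main(String[] args) {
        // A failing body with only a branch on failure.
        System.out.println(nextLabel(() -> false, null, "DONE", null));
        // A succeeding body with only a branch on failure falls through.
        System.out.println(nextLabel(() -> true, null, "DONE", null));
    }
}
```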

Snobol3 associates input and output operations with the predefined SYSPOT and SYSPPT variables. When a string is assigned to SYSPOT, the string is printed to the standard output device on its own line. Similarly, each time SYSPPT is accessed, a line of text (less the trailing new line) is read from the standard input device. SYSPOT and SYSPPT must be uppercase.

Heimbigner has extended Snobol3 with predefined stdin, stdout, and stderr variables, which must be lowercase. The stdin variable is an alias for SYSPPT. Similarly, stdout aliases SYSPOT. (There is no variable for stderr to alias.) Listing 4 demonstrates this alias usage in a program that copies standard input to standard output.

Listing 4. copy.sno

 

* copy.sno
* Copy standard input to standard output.

      n = '0'
COPY  stdout = SYSPPT /F(DONE)
      n = n + '1' /(COPY)
DONE  stderr = n ' lines were copied.'

Listing 4 tracks the number of lines copied from standard input to standard output. As long as SYSPPT returns a line, the line is tallied and an unconditional branch takes execution back to the COPY-labeled statement. But once SYSPPT indicates no more lines (a failure), the branch on failure transfers execution to the DONE-labeled statement.
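For comparison, here is roughly the same loop in Java. A null return from readLine() plays the role of SYSPPT's failure signal, and the while loop replaces the two branches. The class and method names are hypothetical; this is a sketch for comparison, not a translation produced by any tool.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;

final class CopySketch {
    /** Copies every line from in to out and returns the number of lines copied. */
    static int copy(BufferedReader in, PrintWriter out) {
        int n = 0;
        try {
            String line;
            // readLine() returns null at end of input: the "failure" case.
            while ((line = in.readLine()) != null) {
                out.println(line);
                n++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        out.flush();
        return n;
    }

    public static void main(String[] args) {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        int n = copy(in, new PrintWriter(System.out));
        System.err.println(n + " lines were copied.");
    }
}
```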

Snobol3 comes with several built-in functions. For example, EQUALS(s, t) compares two strings for equality, in the Java s.equals(t) sense, and signals success if both strings are equal. Similarly, UNEQL(s, t) compares two strings for inequality, in the Java !s.equals(t) sense; success is signaled if both strings are not equal. Listing 5 demonstrates EQUALS(s, t).

Listing 5. cmp.sno

 

* cmp.sno
* Compare text entered via the standard input device against a password.

         stdout = 'Enter your password'
         word = stdin
         EQUALS(word, 'password') /S(SUCCESS)
         stdout = 'Password not valid' /(FINISH)
SUCCESS
         stdout = 'Password successfully entered'
FINISH

Although passwords shouldn't be specified in source code, Listing 5 nicely shows off EQUALS(s, t), which signals success and enables the branch to SUCCESS if the entered and hard-coded passwords are equal. Don't place whitespace characters between EQUALS (or any other built-in function) and its open ( character: an error occurs. Also, most function names can be specified in lowercase.

In addition to using built-in functions, you can define your own functions with Snobol3's built-in DEFINE(s,t,u...) function: s names the function and lists its parameters, t specifies the label that identifies the function's first statement, and u lists variables to be given local scope within the function. Variables outside of functions have global scope.

Invoking a user-defined function causes its arguments to be evaluated and assigned to corresponding parameters, which are treated as variables with local scope; listed local variables are initialized to empty strings. The function returns a value by assigning this value to a variable with the same name as the function: this pseudo-variable has local scope. Listing 6 demonstrates function definition, invocation, and return.

Listing 6. fact.sno

 

* fact.sno
* Compute a list of factorials from 0 through 10 inclusive.

       define('fact(n)', 'FACT') /(MAIN)
FACT   fact = .le(n, '1') '1' /S(RETURN)
       fact = n * fact(n - '1') /(RETURN)
MAIN   i = '0'
MAIN1  stdout = i ' ' fact(i)
       i = i + '1'
       .LE(i, '10') /S(MAIN1)

Listing 6 uses define('fact(n)', 'FACT') /(MAIN) to define a recursive factorial function. The /(MAIN) unconditional branch is specified to transfer execution around the function body; otherwise the interpreter outputs an error. The first statement in the function body is identified by the FACT label.

The FACT-labeled fact = .le(n, '1') '1' /S(RETURN) statement determines whether recursion continues by comparing n with 1. If n is less than or equal to 1, the built-in .LE(i, j) function signals success, 1 is assigned to the pseudo-variable fact, and /S(RETURN) returns execution from the function.

If .LE(i, j) signals failure, fact = n * fact(n - '1') /(RETURN) executes. The /(RETURN) unconditional branch is necessary to ensure that recursion takes place: assignment to a pseudo-variable requires a return from a recursive call. To see the results of this recursion, examine the following output:

0 1
1 1
2 2
3 6
4 24
5 120
6 720
7 5040
8 40320
9 362880
10 3628800
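The same recursion is easier to follow for Java programmers when written in Java. In this sketch, each return statement corresponds to one of the two statements in the Snobol3 function body. The class name is hypothetical, and long stands in for Snobol3's integer strings.

```java
final class FactSketch {
    static long fact(long n) {
        if (n <= 1) {
            return 1;               // fact = .le(n, '1') '1' /S(RETURN)
        }
        return n * fact(n - 1);     // fact = n * fact(n - '1') /(RETURN)
    }

    public static void main(String[] args) {
        // Mirrors the MAIN1 loop: one "i fact(i)" pair per line, for 0 through 10.
        for (long i = 0; i <= 10; i++) {
            System.out.println(i + " " + fact(i));
        }
    }
}
```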

Snobol3's greatest strength lies in its support for pattern matching, the process of examining a subject string for a certain combination of characters. The subject is the first statement element after the (optional) label. This element is then followed by one or more whitespace characters, and the pattern—I show this syntactically below:

 [label] subject pattern

The pattern match succeeds if the pattern is found in the subject; otherwise the match fails. To determine what statement to execute following a success or a failure, a branch field follows the pattern. Although you can specify both success and failure branches, as shown syntactically below, I haven't found a good reason to do this:

 [label] subject pattern /S(label1) F(label2)

Pattern matching is based on pattern elements, special character sequences that perform different kinds of pattern matches. Three examples are StringMatch (match entire pattern), Arb (match arbitrary characters), and Len (match a fixed-length string). Listing 7 demonstrates these pattern elements; it also shows you how to replace the portion of a subject that matches a pattern.

Listing 7. pm.sno

 

* pm.sno
* Pattern matching.

         subject = 'The quick brown fox jumped over the lazy ox.'
         stdout = 'Subject: The quick brown fox jumped over the lazy ox.'
         stdout = 'Enter a pattern'
         pattern = stdin
         stdout = ''

* StringMatch: Match entire pattern
         subject pattern /F(NOMATCH)
         stdout = pattern ' found in ' subject /(NEXT)
NOMATCH  stdout = pattern ' not found in ' subject
NEXT
         x = '?'
         text = 'Mountain'

* Arb: Match arbitrary characters
         text 'o' *x* 'a'
         stdout = x

* Len: Match fixed-length string
         text *x/'4'*
         stdout = x

* Len with replacement
         text *x/'4'* = 'Foun'
         stdout = 'x = ' x ' text = ' text

Listing 7's text 'o' *x* 'a' statement demonstrates an Arb match against the text subject variable. This match stores in variable x all characters between o and a. The text *x/'4'* statement demonstrates a Len match that stores the first four characters in x, and the text *x/'4'* = 'Foun' statement replaces these characters with Foun.
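Java programmers can get a rough feel for these three pattern elements from String and java.util.regex. The sketch below is an approximation under stated assumptions, not Snobol3's matching algorithm: the Arb-like helper uses a reluctant quantifier to capture the shortest run between two literals, and the Len-like helpers assume the match is taken from the start of the subject. All names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PatternSketch {
    /** StringMatch-like: succeeds if the literal occurs anywhere in the subject. */
    static boolean stringMatch(String subject, String literal) {
        return subject.contains(literal);
    }

    /** Arb-like: the characters between the first left and the next right, or null on failure. */
    static String between(String subject, String left, String right) {
        Pattern p = Pattern.compile(
                Pattern.quote(left) + "(.*?)" + Pattern.quote(right), Pattern.DOTALL);
        Matcher m = p.matcher(subject);
        return m.find() ? m.group(1) : null;
    }

    /** Len-like: the first n characters, or null if the subject is too short. */
    static String firstN(String subject, int n) {
        return subject.length() >= n ? subject.substring(0, n) : null;
    }

    /** Len with replacement: swaps the first n characters for the replacement. */
    static String replaceFirstN(String subject, int n, String replacement) {
        return subject.length() >= n ? replacement + subject.substring(n) : subject;
    }

    public static void main(String[] args) {
        String text = "Mountain";
        System.out.println(between(text, "o", "a"));
        System.out.println(firstN(text, 4));
        System.out.println(replaceFirstN(text, 4, "Foun"));
    }
}
```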

Options fix

The s3-1.0.jar distribution file's s3-1.0\doc directory includes a reference manual: refman.html. This manual offers detailed information about the Snobol3 interpreter. One section lists and describes various command line options that you can pass to the interpreter. Examples include -debug, -stacktrace, -lint, and -dquotes.

Unfortunately, not everything in this manual is correct. For example, if you specify -dquotes (which lets you quote strings within double quotes), the interpreter spits out an error message referring to -dquotes as an unknown option. The same is true of the -escapes option, which allows strings to include standard escape sequences. Fortunately, you can fix these problems.

To fix the -dquotes problem, first change to the s3-1.0\src directory, which contains the Snobol3.java source file. Load this file into a text editor. Near the top of the source code, you'll find a static String [] formals array specification. Append "dquotes?" to the array, close the editor, move up one directory level, and invoke Ant on build.xml.

After Ant announces a successful build, create a simple test.sno file containing this line: stdout = "double quotes supported". (Make sure stdout does not begin in Column 1.) Then invoke java -jar s3.jar -dquotes test.sno. If all goes well, you should see the message double quotes supported instead of an error message.

From Snobol3 to Snobol4

This completes my coverage of Snobol3. To learn more about this language, you'll need to download various source files from the Internet and interpret them with the Snobol3 language interpreter. Unfortunately, many of these files require Snobol4. Because Heimbigner hasn't released a Java-based Snobol4 interpreter, you'll have to implement this interpreter yourself. The following two resources can help you with this task:

  • JPattern: In late 2005, Heimbigner released JPattern, a Java-based product that offers Snobol4-style pattern matching.
  • "A Snobol4 Tutorial": This tutorial will help you learn those Snobol4 features you need to implement. (See Resources for links to these Websites.)

Heimbigner's Snobol3 reference manual is helpful for adding built-in functions to Snobol3. It helped me add two Snobol4 features to the Snobol3 language interpreter: the DATE() function (which returns the current date/time) and the &TRIM keyword, recast as a TRIM(s) function (which removes leading/trailing spaces from string s). Adding them required several changes to s3-1.0\src\Primitive.java:

  1. I placed an import java.util.Date; statement in the imports section, which I needed to access the Date class.
  2. I next appended the following function definition method calls to the static void definePrimitives() method:

     

    fcnDef("date",(p=new $Date()));
    fcnDef("DATE",p);
    fcnDef("trim",(p=new $Trim()));
    fcnDef("TRIM",p);

    The fcnDef() method calls store function definitions in a java.util.HashMap called functions, located in the Snobol3 class. In keeping with Snobol3's policy of letting you call a function by specifying its name entirely in uppercase or in lowercase, I stored the same function definition with both versions of the function's name.

  3. I lastly introduced the following $Date and $Trim classes at the end of the source file:

     

    class $Date extends Primitive {
        public $Date() { super(); }
        public int nargs() { return 0; }
        public void execute(VM vm, PrimFunction fcn) throws Failure {
            vm.cc = false;
            setReturn(vm, new Date().toString());
        }
    }

    class $Trim extends Primitive {
        public $Trim() { super(); }
        public int nargs() { return 1; }
        public void execute(VM vm, PrimFunction fcn) throws Failure {
            String s = (String) vm.pop();
            s = s.trim();
            vm.cc = false;
            setReturn(vm, s);
        }
    }

    The Primitive class's public int nargs() method returns the number of arguments that can be passed to a function. Because many Snobol3 built-in functions require two arguments, I overrode this method in the $Date subclass (DATE() takes no arguments) and in the $Trim subclass (TRIM(s) takes one). Both execute() methods also rely on the interpreter's VM-related members: vm.pop(), vm.cc, and setReturn().
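The registration idea behind these three steps can be tried outside the interpreter. The self-contained sketch below stores each function under both its lowercase and uppercase name and checks the argument count before calling it. Everything here is a hypothetical stand-in: Prim is not the interpreter's Primitive class, and the default of two arguments is an assumption based on the remark about nargs() above.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class RegistrySketch {
    /** Hypothetical stand-in for a built-in function. */
    abstract static class Prim {
        int nargs() { return 2; }               // assumed default
        abstract String apply(String[] args);
    }

    static final Map<String, Prim> FUNCTIONS = new HashMap<>();

    /** Stores one definition under the all-lowercase and all-uppercase names. */
    static void define(String name, Prim p) {
        FUNCTIONS.put(name.toLowerCase(Locale.ROOT), p);
        FUNCTIONS.put(name.toUpperCase(Locale.ROOT), p);
    }

    static {
        define("date", new Prim() {
            @Override int nargs() { return 0; }
            @Override String apply(String[] args) { return new Date().toString(); }
        });
        define("trim", new Prim() {
            @Override int nargs() { return 1; }
            @Override String apply(String[] args) { return args[0].trim(); }
        });
    }

    static String call(String name, String... args) {
        Prim p = FUNCTIONS.get(name);
        if (p == null) {
            throw new IllegalArgumentException("unknown function: " + name);
        }
        if (args.length != p.nargs()) {
            throw new IllegalArgumentException(name + " expects " + p.nargs() + " argument(s)");
        }
        return p.apply(args);
    }

    public static void main(String[] args) {
        System.out.println(call("DATE"));
        System.out.println("[" + call("trim", "   ABC   ") + "]");
    }
}
```

Because only the two spellings are registered, a mixed-case name such as Trim is not found in this sketch.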

After completing the necessary additions to Primitive.java, I used Ant to rebuild s3.jar. Following a successful build, I created a simple source file to test the new DATE() and TRIM(s) functions—and Snobol3's built-in SIZE(s) function, which returns a string's length. This source file's content, followed by the appropriate output, appears below:

 

 stdout = DATE()
 str = '   ABC   '
 stdout = 'str = [' str '], size = ' size(str)
 str = TRIM(str)
 stdout = 'str = [' str '], size = ' size(str)

Wed Jul 12 16:00:39 CDT 2006
str = [   ABC   ], size = 9
str = [ABC], size = 3
