Excerpt from Scripting in Java: Languages, Frameworks, and Patterns.
By Dejan Bosanac
Published by Addison Wesley Professional
Introduction to Scripting in Java: Languages, Frameworks, and Patterns
The main topic of this book is the synergy of scripting technologies and the Java platform. I describe projects Java developers can use to create a more powerful development environment, and some of the practices that make scripting useful.
Before I start to discuss the application of scripting in the Java world, I summarize some of the theory behind scripting in general and its use in information technology infrastructure. This is the topic of the first two chapters of the book, and it gives us a better perspective of scripting technology as well as how this technology can be useful within the Java platform.
To begin, we must define what scripting languages are and describe their characteristics, which largely determine the roles in which they can (and should) be used. In this chapter, I explain what the term scripting language means and discuss the basic characteristics of such languages.
At the end of this chapter, I discuss the differences between scripting and system-programming languages and how these differences make them suitable for certain roles in development.
The definition of a scripting language is fuzzy and sometimes inconsistent with how scripting languages are used in the real world, so it is a good idea to summarize some of the basic concepts about programming and computing in general. This summary provides a foundation necessary to define scripting languages and discuss their characteristics.
Let's start from the beginning. Processors execute machine instructions, which operate on data either in the processors' registers or in external memory. Put simply, a machine instruction consists of a sequence of binary digits (0s and 1s) and is specific to the particular processor on which it runs. Machine instructions consist of the operation code telling the processor what operation it should perform, and operands representing the data on which the operation should be performed.
For example, consider the simple operation of adding a value contained in one register to the value contained in another. Now let's imagine a simple processor with 11-bit instructions, where the first 5 bits represent the operation code (say, 00111 for register value addition), and each of the two registers is addressed by a 3-bit pattern. We can write this simple example as follows:
00111 001 010
In this example, I used 001 and 010 to address registers number one and two (R1 and R2, respectively) of the processor.
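To make the bit layout concrete, here is a minimal sketch in Java that splits the pattern above into its three fields (5 + 3 + 3 bits). The class name, the method name, and the textual labels it prints are hypothetical and exist only for illustration; the imaginary processor defines nothing beyond the bit pattern itself.

```java
// Hypothetical decoder for the imaginary processor described above.
final class InstructionDecoder {
    // Operation code taken from the text: 00111 means register value addition.
    static final int OP_ADD = 0b00111;

    /** Splits an 11-bit instruction into a 5-bit opcode and two 3-bit register numbers. */
    static String decode(int instruction) {
        int opcode = (instruction >> 6) & 0b11111; // top 5 bits
        int first = (instruction >> 3) & 0b111;    // middle 3 bits
        int second = instruction & 0b111;          // low 3 bits
        // "ADD" and "UNKNOWN" are just readable labels chosen for this sketch.
        String label = (opcode == OP_ADD) ? "ADD" : "UNKNOWN";
        return label + " R" + first + ", R" + second;
    }

    public static void main(String[] args) {
        int instruction = Integer.parseInt("00111001010", 2);
        System.out.println(decode(instruction)); // prints "ADD R1, R2"
    }
}
```

Shifting and masking is all the "decoding" there is: the processor sees only the bits, and any meaning attached to them is a convention of the instruction format.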
This basic method of computing has been well known for decades, and I'm sure you are familiar with it. Various kinds of processors have different strategies regarding how their instruction sets should look (RISC or CISC architecture), but from the software developer's point of view, the only important fact is that the processor is capable of executing only binary instructions. No matter what programming language is used, the resulting application is a sequence of machine instructions executed by the processor.
What has been changing over time is how people create the order in which the machine instructions are executed. This ordered sequence of machine instructions is called a computer program. As hardware becomes more affordable and more powerful, users' expectations rise. The whole purpose of software development as a scientific discipline is to provide mechanisms enabling developers to craft more complex applications with the same (or even less) effort as before.
A specific processor's instruction set is called its machine language. Machine languages are classified as first-generation programming languages. Programs written in this way are usually very fast because they are optimized for the particular processor's architecture. But despite this benefit, it is hard (if not impossible) for humans to write large and secure applications in machine languages because humans are not good at dealing with large sequences of 0s and 1s.
In an attempt to solve this problem, developers began creating symbols for certain binary patterns, and with this, assembly languages were introduced. Assembly languages are second-generation programming languages. The instructions in assembly languages are just one level above machine instructions, in that they replace binary digits with easy-to-remember keywords such as ADD, SUB and so on. As such, you can rewrite the preceding simple instruction example in assembly language as follows:
ADD R1, R2
In this example, the ADD keyword represents the operation code of the instruction, and R1 and R2 define the registers involved in the operation. Even from this simple example, it is obvious that assembly languages made programs easier for humans to read and thus enabled the creation of more complex applications.
Although they are more human-oriented, second-generation languages do not extend processor capabilities in any way.
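The one-to-one relationship between an assembly statement and a machine instruction can be sketched in a few lines of Java. This is only an illustration for the imaginary processor used in this chapter: the class and method names are hypothetical, the only opcode it knows is the ADD pattern given earlier, and the accepted syntax (a mnemonic followed by two registers R0 to R7) is an assumption.

```java
import java.util.Map;

// Hypothetical assembler for the imaginary processor used in this chapter.
final class TinyAssembler {
    // Only ADD -> 00111 comes from the text; no other opcodes are defined here.
    private static final Map<String, String> OPCODES = Map.of("ADD", "00111");

    /** Translates one statement such as "ADD R1, R2" into its binary pattern. */
    static String assemble(String line) {
        String[] parts = line.trim().split("[\\s,]+"); // "ADD R1, R2" -> [ADD, R1, R2]
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected: MNEMONIC Rx, Ry");
        }
        String opcode = OPCODES.get(parts[0]);
        if (opcode == null) {
            throw new IllegalArgumentException("Unknown mnemonic: " + parts[0]);
        }
        return opcode + " " + register(parts[1]) + " " + register(parts[2]);
    }

    /** Converts a register name (R0 to R7) into its 3-bit address. */
    private static String register(String name) {
        if (!name.matches("R[0-7]")) {
            throw new IllegalArgumentException("Unknown register: " + name);
        }
        String bits = Integer.toBinaryString(name.charAt(1) - '0');
        return "000".substring(bits.length()) + bits; // left-pad to 3 bits
    }

    public static void main(String[] args) {
        System.out.println(assemble("ADD R1, R2")); // prints "00111 001 010"
    }
}
```

The translation is a plain lookup and substitution. Nothing is added to what the processor can do, which is why assembly languages improve readability without extending processor capabilities.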
Enter high-level languages, which allow developers to express themselves in higher-level, semantic forms. As you might have guessed, these languages are referred to as third-generation programming languages. High-level languages provide various powerful loops, data structures, objects, and so on, making it much easier to craft many applications with them.
Over time, a diverse array of high-level programming languages was introduced, and their characteristics varied a great deal. Some of these characteristics categorize programming languages as scripting (or dynamic) languages, as we will see in the coming sections.
Also, there is a difference in how programming languages are executed on the host machine. Usually, compilers translate high-level language constructs into machine instructions that reside in memory. Although programs written in this way initially were slightly less efficient than programs written in assembly language because of early compilers' inability to use system resources efficiently, as time passed compilers and machines improved, making system-programming languages superior to assembly languages. Eventually, high-level languages became popular in a wide range of development areas, from business applications and games to communications software and operating system implementations.
But there is another way to transform high-level semantic constructs into machine instructions, and that is to interpret them as they are executed. This way, your applications reside in scripts, in their original form, and the constructs are transformed at runtime by a program called an interpreter. Basically, you are executing the interpreter, which reads the statements of your application and then executes them. Languages executed in this way are called scripting or dynamic languages. They offer an even higher level of abstraction than that offered by system-programming languages, and we discuss them in detail later in this chapter.
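The idea of an interpreter that reads statements and executes them one by one can be sketched in Java as follows. The three-statement mini-language (set, add, print) is invented purely for this illustration and is not a real scripting language; the class and method names are likewise hypothetical, and malformed statements are not handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interpreter for an invented three-statement mini-language.
final class TinyInterpreter {
    private final Map<String, Integer> variables = new HashMap<>();
    private final List<String> output = new ArrayList<>();

    /** Reads the script line by line and executes each statement immediately. */
    List<String> run(String script) {
        for (String line : script.split("\\R")) {
            String statement = line.trim();
            if (!statement.isEmpty()) {
                execute(statement.split("\\s+"));
            }
        }
        return output;
    }

    private void execute(String[] words) {
        switch (words[0]) {
            case "set":   // set <name> <value>
                variables.put(words[1], Integer.parseInt(words[2]));
                break;
            case "add":   // add <name> <value>
                variables.merge(words[1], Integer.parseInt(words[2]), Integer::sum);
                break;
            case "print": // print <name>
                output.add(words[1] + " = " + variables.get(words[1]));
                break;
            default:
                throw new IllegalArgumentException("Unknown statement: " + words[0]);
        }
    }

    public static void main(String[] args) {
        String script = "set x 5\nadd x 3\nprint x";
        new TinyInterpreter().run(script).forEach(System.out::println); // prints "x = 8"
    }
}
```

Note that the script stays in its original textual form until the moment it runs. The program actually being executed by the processor is the interpreter, not the script.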
Languages with these characteristics are a natural fit for certain tasks, such as process automation, system administration, and gluing existing software components together; in short, anywhere the strict syntax and constraints introduced by system-programming languages get in the way of developers doing their jobs. A description of the usual roles of scripting languages is the focus of Chapter 2, "Appropriate Applications for Scripting Languages."
But what does all this have to do with you as a Java developer? To answer this question, let's first briefly summarize the history of the Java platform. As platforms became more diverse, it became increasingly difficult for developers to write software that can run on the majority of available systems. This is when Sun developed Java, which offers "write once, run anywhere" simplicity.
The main idea behind the Java platform was to implement a virtual processor as a software component, called a virtual machine. When we have such a virtual machine, we can write and compile the code for that processor, instead of for a specific hardware platform or operating system. The output of this compilation process is called bytecode, and it is, in effect, the machine code of the targeted virtual machine. When the application is executed, the virtual machine is started, and the bytecode is interpreted. An application developed in this way can therefore run on any platform with an appropriate virtual machine installed. This approach to software development found many interesting uses.
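The notion of a virtual processor implemented in software can be illustrated with a very small stack machine written in Java. The three opcodes below (PUSH, ADD, HALT) and their numeric values are invented for this sketch; they are not actual JVM bytecodes, and a real virtual machine is vastly more elaborate.

```java
// Hypothetical stack-based virtual machine with an invented instruction set.
final class TinyVm {
    static final int PUSH = 1; // push the next value in the code onto the stack
    static final int ADD = 2;  // pop two values and push their sum
    static final int HALT = 3; // stop and return the value on top of the stack

    /** Interprets the given "bytecode" and returns the final top-of-stack value. */
    static int run(int[] code) {
        int[] stack = new int[16]; // fixed-size stack, enough for this sketch
        int sp = 0;                // stack pointer: index of the next free slot
        int pc = 0;                // program counter: index of the next instruction
        while (true) {
            int op = code[pc++];
            switch (op) {
                case PUSH:
                    stack[sp++] = code[pc++];
                    break;
                case ADD: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case HALT:
                    return stack[sp - 1];
                default:
                    throw new IllegalStateException("Unknown opcode: " + op);
            }
        }
    }

    public static void main(String[] args) {
        int[] program = {PUSH, 2, PUSH, 3, ADD, HALT};
        System.out.println(run(program)); // prints 5
    }
}
```

The array of integers plays the role of bytecode: it targets the software machine rather than any physical processor, so the same array produces the same result wherever the machine itself can run.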
The main motivation for the invention of the Java platform was to create an environment for the development of easy, portable, network-aware client software. But mostly because of performance penalties introduced by the virtual machine, Java is now best suited to server software development. As personal computers increase in speed, however, more desktop applications are being written in Java, and this trend continues.
One of the basic requirements of a scripting language is to have an interpreter or some kind of virtual machine. The Java platform comes with the Java Virtual Machine (JVM), which enables it to be a host to various scripting languages. There is a growing interest in this area today in the Java community. A few projects exist that are trying to provide Java developers with the same power that developers using traditional scripting languages have. Also, there is a way to execute your existing application written in a dynamic language such as Python inside the JVM and integrate it with another Java application or module.
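As a small, hedged illustration of the JVM acting as a host, the sketch below uses the standard javax.script API (not mentioned in this excerpt, so treat its use here as an assumption about tooling) to list whatever script engines happen to be available on the current classpath. The class and method names are hypothetical, and the list may well be empty on a JVM that ships without a bundled engine.

```java
import java.util.ArrayList;
import java.util.List;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

// Hypothetical helper that reports the script engines visible to this JVM.
final class ScriptEngineLister {
    /** Returns one "engine (language)" entry per discovered engine factory. */
    static List<String> engineNames() {
        List<String> names = new ArrayList<>();
        for (ScriptEngineFactory factory : new ScriptEngineManager().getEngineFactories()) {
            names.add(factory.getEngineName() + " (" + factory.getLanguageName() + ")");
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = engineNames();
        if (names.isEmpty()) {
            System.out.println("No script engines found on this JVM.");
        } else {
            names.forEach(System.out::println);
        }
    }
}
```

Which engines appear depends entirely on the Java version and on the libraries placed on the classpath, so no particular output should be expected.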
This is what we discuss in this book. We take a scripting approach to programming, while discussing all the strengths and weaknesses of this approach, how to best use scripts in an application architecture, and what tools are available today inside the JVM.