Larry Wall, (in)famous creator of that (in)famous Perl language, has contributed
a few cents' worth to the debate over "scripting" languages:
I think, to most people, scripting is a lot like obscenity. I can't define it, but
I'll know it when I see it.
Aside from the fact that the original quote reads "pornography" instead of "obscenity",
I get what he's talking about. Finding a good definition for scripting is like trying
to find a good definition for "object-oriented" or "service-oriented" or... come to
think of it, like a lot of the terms that we tend to use on a daily basis. So I'm
right there along with him, assuming that his goal here is to call out a workable
definition for "scripting" languages.
Here are some common memes floating around:
Simple language "Everything is a string" Rapid prototyping Glue language Process control Compact/concise Worse-is-better Domain specific "Batteries included"...I don't see any real center here, at least in terms of technology. If I had to
pick one metaphor, it'd be easy onramps. And a slow lane. Maybe even with some optional
fast lanes.
I'm not sure where some of these memes come from. Some of them I recognize (simple language, rapid prototyping, glue language, compact/concise), some of them are new to me ("everything is a string", process control), and for some of them I seriously question the sanity of anybody suggesting them (worse-is-better, domain specific, "batteries included"). Fortunately he didn't include the "dynamically typed" or "loosely coupled" memes, which I hear tagged onto scripting languages all the time.
But basically, scripting is not a technical term. When we call something a scripting
language, we're primarily making a linguistic and cultural judgment, not a technical
judgment. I see scripting as one of the humanities. It's our linguistic roots showing
through.
I can definitely see "scripting" being used as a value judgement, but I'm not sure I buy the idea that scripting languages somehow demonstrate our linguistic roots.
We then are treated to one-sentence reviews of every language Larry ever programmed
in, starting from his earliest days in BASIC, with some interesting one-liners scattered
in there every so often:
On Ruby: "... a great deal of Ruby's syntax is borrowed from Perl, layered
over Smalltalk semantics."On Lisp: "Is LISP a candidate for a scripting language? While you can certainly
write things rapidly in it, I cannot in good conscience call LISP a scripting language.
By policy, LISP has never really catered to mere mortals. And, of course, mere mortals
have never really forgiven LISP for not catering to them."On JavaScript: "Then there's JavaScript, a nice clean design. It has some
issues, but in the long run JavaScript might actually turn out to be a decent platform
for running Perl 6 on. Pugs already has part of a backend for JavaScript, though sadly
that has suffered some bitrot in the last year. I think when the new JavaScript engines
come out we'll probably see renewed interest in a JavaScript backend." Presumably
he means a new JavaScript backend for Perl 6. Or maybe a new Perl 6 backend for JavaScript.On scripting langauges as a whole: "When I look at the present situation,
what I see is the various scripting communities behaving a lot like neighboring tribes
in the jungle, sometimes trading, sometimes warring, but by and large just keeping
out of each other's way in complacent isolation."
Like the prize at the bottom of the cereal box, though, if you can labor through all of this you get treated to one of the most amazingly succinct discussions/point-lists of language design and implementation I've seen in a long while. I've copied that section over verbatim, annotating it with my own comments in [brackets]:
early binding / late binding
Binding in this context is about exactly when you decide which routine you're going
to call for a given routine name. In the early days of computing, most binding was
done fairly early for efficiency reasons, either at compile time, or at the latest,
at link time. You still tend to see this approach in statically typed languages. With
languages like Smalltalk, however, we began to see a different trend, and these days
most scripting languages are trending towards later binding. That's because scripting
languages are trying to be dwimmy (Do What I Mean), and the dwimmiest decision is
usually a late decision because you then have more available semantic and even pragmatic
context to work with. Otherwise you have to predict the future, which is hard.

So scripting languages naturally tend to move toward an object-oriented point of view,
where the binding doesn't happen 'til method dispatch time. You can still see the
scars of conflict in languages like C++ and Java though. C++ makes the default method
type non-virtual, so you have to say virtual explicitly to get late binding. Java
has the notion of final classes, which force calls to the class to be bound at compile
time, essentially. I think both of those approaches are big mistakes. Perl 6 will
make different mistakes. In Perl 6 all methods are virtual by default, and only the
application as a whole can tell the optimizer to finalize classes, presumably only
after you know how all the classes are going to be used by all the other modules in
the program.

[Frankly, I think he leaves out a whole class of binding ideas here, that being
the "VM-bound" notion that both the JVM and the CLR make use of. In other words, the
Java language is early-bound, but the actual linking doesn't take place until runtime
(or link time, as it were). The CLR takes this one step further with its delegates
design, essentially allowing developers to load a metadata token describing a function
and construct a delegate object--a functor, as it were--around that. This is, in some
ways, a highly useful marriage of both early and late binding.]

[I'm also a little disturbed by his comments "only the application as a whole can tell the optimizer to finalize classes, presumably only after you know how all the classes are going to be used by all the other modules in the program." Since when can programmers reasonably state that they know how classes are going to be used by all the other modules in the program? This seems like a horrible set-you-up-for-failure point to me.]
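[A quick Python sketch of my own to make the binding point concrete: which speak runs is decided at dispatch time, from the receiver's runtime type -- exactly what C++ calls a virtual call and what Perl 6, per the quote above, promises by default:]

    # Late binding: the method is chosen at call time from the runtime type.
    class Animal:
        def speak(self):
            return "..."

    class Dog(Animal):
        def speak(self):       # overriding methods are "virtual" by default
            return "Woof"

    class Cat(Animal):
        def speak(self):
            return "Meow"

    def greet(animal):
        return animal.speak()  # the binding happens here, not at compile time

    for pet in (Dog(), Cat(), Animal()):
        print(greet(pet))      # Woof, Meow, ...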
single dispatch / multiple dispatch
In a sense, multiple dispatch is a way to delay binding even longer. You not only
have to delay binding 'til you know the type of the object, but you also have to know
the types of all rest of the arguments before you can pick a routine to call. Python
and Ruby always do single dispatch, while Dylan does multiple dispatch. Here is one
dimension in which Perl 6 forces the caller to be explicit for clarity. I
think it's an important distinction for the programmer to bear in mind, because single
dispatch and multiple dispatch are philosophically very different ideas, based on
different metaphors.

With single-dispatch languages, you are basically sending a message to an object,
and the object decides what to do with that message. With multiple dispatch languages,
however, there is no privileged object. All the objects involved in the call have
equal weight. So one way to look at multiple dispatch is that the objects are completely
passive. But if the objects aren't deciding how to bind, who is?

Well, it's sort of a democratic thing. All the routines of a given name get together
and hold a political conference. (Well, not really, but this is how the metaphor works.)
Each of the routines is a delegate to the convention. All the potential candidates
put their names in the hat. Then all the routines vote on who the best candidate is,
and the next best, and the next best after that. And eventually the routines themselves
decide what the best routine to call is.

So basically, multiple dispatch is like democracy. It's the worst way to do late binding, except for all the others.

But I really do think that's true, and likely to become truer as time goes on. I'm
spending a lot of time on this multiple dispatch issue because I think programming
in the large is mutating away from the command-and-control model implicit in single
dispatch. I think the field of computation as a whole is moving more toward the kinds
of decisions that are better made by swarms of insects or schools of fish, where no
single individual is in control, but the swarm as a whole has emergent behaviors that
are somehow much smarter than any of the individual components.

[I think it's a pretty long stretch to go from "multiple dispatch", where the call is dispatched based not just on the actual type of the recipient but on the types of the other arguments as well, to suggesting that whole "swarms" of objects are going to influence where the call comes out. People criticized AOP for creating systems where developers couldn't predict, a priori, where a call would end up; how will they react to systems where nondeterminism--having no real idea at the source level which objects are "voting", to use his metaphor--is the norm, not the exception?]
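[For the curious, here's a toy multiple-dispatch registry in Python--my own illustration, with made-up names like multi and collide, and exact-type matching only, none of Larry's candidate-ranking "voting":]

    # Routines are chosen by the types of *all* arguments, not just the receiver.
    _registry = {}

    def multi(*types):
        def register(fn):
            _registry[(fn.__name__,) + types] = fn
            def dispatch(*args):
                key = (fn.__name__,) + tuple(type(a) for a in args)
                return _registry[key](*args)   # exact-type match only
            return dispatch
        return register

    class Asteroid: pass
    class Ship: pass

    @multi(Asteroid, Ship)
    def collide(a, b):
        return "ship is destroyed"

    @multi(Ship, Ship)
    def collide(a, b):
        return "ships bounce off each other"

    print(collide(Ship(), Ship()))       # ships bounce off each other
    print(collide(Asteroid(), Ship()))   # ship is destroyed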
eager evaluation / lazy evaluation
Most languages evaluate eagerly, including Perl 5. Some languages evaluate all expressions
as lazily as possible. Haskell is a good example of that. It doesn't compute anything
until it is forced to. This has the advantage that you can do lots of cool things
with infinite lists without running out of memory. Well, at least until someone asks
the program to calculate the whole list. Then you're pretty much hosed in any language,
unless you have a real Turing machine.

So anyway, in Perl 6 we're experimenting with a mixture of eager and lazy. Interestingly,
the distinction maps very nicely onto Perl 5's concept of scalar context vs. list
context. So in Perl 6, scalar context is eager and list context is lazy. By default,
of course. You can always force a scalar to be lazy or a list to be eager if you like.
But you can say things like for 1..Inf as long as your loop exits some other way a little bit before you run into infinity.

[This distinction is, I think, becoming one of a continuum rather than a binary
choice; LINQ, for example, makes use of deferred execution, which is fundamentally
a lazy operation, yet C# itself as a whole generally prefers eager evaluation where
and when it can... except in certain decisions where the CLR will make the call, such
as with the aforementioned delegates scenario. See what I mean?]
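[Python's generators make a nice concrete illustration of the lazy end of this knob--my sketch, not from the talk:]

    import itertools

    # Nothing is computed until a consumer asks for it, so an "infinite
    # list" is fine as long as you stop before you run into infinity.
    def naturals():
        n = 1
        while True:
            yield n
            n += 1

    squares = (n * n for n in naturals())        # lazy: no work done yet
    print(list(itertools.islice(squares, 5)))    # [1, 4, 9, 16, 25]

    # list(naturals()) would be the "pretty much hosed" case: an eager
    # consumer trying to realize the whole infinite list.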
eager typology / lazy typology
Usually known as static vs. dynamic, but again there are various positions for the
adjustment knob. I rather like the gradual typing approach for a number of reasons.
Efficiency is one reason. People usually think of strong typing as a reason, but the
main reason to put types into Perl 6 turns out not to be strong typing, but rather
multiple dispatch. Remember our political convention metaphor? When the various candidates
put their names in the hat, what distinguishes them? Well, each candidate has a political
platform. The planks in those political platforms are the types of arguments they
want to respond to. We all know politicians are only good at responding to the types
of arguments they want to have...[OK, Larry, enough with the delegates and the voting thing. It just doesn't work.
I know it's an election year, and everybody wants to get in on the whole "I picked
the right candidate" thing, but seriously, this metaphor is getting pretty tortured
by this point.]

There's another way in which Perl 6 is slightly more lazy than Perl 5. We still have
the notion of contexts, but exactly when the contexts are decided has changed. In
Perl 5, the compiler usually knows at compile time which arguments will be in scalar
context, and which arguments will be in list context. But Perl 6 delays that decision
until method binding time, which is conceptually at run time, not at compile time.
This might seem like an odd thing to you, but it actually fixes a great number of
things that are suboptimal in the design of Perl 5. Prototypes, for instance. And
the need for explicit references. And other annoying little things like that, many
of which end up as frequently asked questions.

[Again, this is a scenario where smarter virtual machines and execution engines can help--in Java, for example, the JVM can make some amazing optimizations in its runtime compiler (a.k.a. the JIT compiler) that a normal ahead-of-time compiler simply can't make, such as inlining monomorphic interface calls. One area I think he's hinting at here, though, and an interesting area of research and extension, is being able to access the context in which a call is being made, a la the .NET context architecture, which had some limited counterpart in the EJB space as well. This would also be a good "middle ground" for multiple dispatch, since the actual dispatch could be done on the basis of the context itself, which could be known, rather than on random groups of objects that Larry's gathered together for an open conference on dispatching the method call.... I kid, I kid.]
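[Gradual typing is easy to picture with Python's optional annotations--again my sketch, not Larry's: a checker such as mypy enforces the annotated parts, while unannotated code stays fully dynamic:]

    def total(prices: list[float], tax: float) -> float:
        # annotated: a static checker can verify every call site
        return sum(prices) * (1 + tax)

    def whatever(x):
        # unannotated: stays fully dynamic, checked only at run time
        return x + x

    print(total([1.0, 2.5], 0.08))
    print(whatever("ab"))     # "abab" -- no type declared, none checked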
limited structures / rich structures
Awk, Lua, and PHP all limit their composite structures to associative arrays. That
has both pluses and minuses, but the fact that awk did it that way is one of the reasons
that Perl does it differently, and differentiates ordered arrays from unordered hashes.
I just think about them differently, and I think a lot of other people do too.

[Frankly, none of the "popular" languages really has a good first-class set concept, whereas many of the functional languages do, and thanks to things like LINQ, I think the larger programming world is beginning to see the power in sets and set projections. So let's not limit the discussion to associative arrays; yes, they're useful, but in five years they'll be useful in the same way that line-numbered BASIC and the goto keyword can still be useful.]
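[The ordered-versus-keyed distinction Larry is drawing maps directly onto Python's two workhorse structures--a tiny illustration of mine:]

    langs = ["awk", "lua", "php"]        # ordered array: position matters
    born = {"awk": 1977, "lua": 1993}    # hash: lookup by key, not position

    print(langs[0])      # awk
    print(born["lua"])   # 1993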
symbolic / wordy
Arguably APL is also a kind of scripting language, largely symbolic. At the other
extreme we have languages that eschew punctuation in favor of words, such as AppleScript
and COBOL, and to a lesser extent all the Algolish languages that use words to indicate
blocks where the C-derived languages use curlies. I prefer a balanced approach here,
where symbols and identifiers are each doing what they're best at. I like it when
most of the actual words are those chosen by the programmer to represent the problem
at hand. I don't like to see words used for mere syntax. Such syntactic functors merely
obscure the real words. That's one thing I learned when I switched from Pascal to
C. Braces for blocks. It's just right visually.

[Sez you, though I have to admit my own biases agree. As with all things, though, this can get out of hand pretty quickly if you're not careful. The prosecution presents People's Exhibit 1, Your Honor: the Perl programming language.]

Actually, there are languages that do it even worse than COBOL. I remember one Pascal
variant that required your keywords to be capitalized so that they would stand out.
No, no, no, no, no! You don't want your functors to stand out. It's shouting the wrong
words: IF! foo THEN! bar ELSE! baz END! END! END! END!

[Oh, now, that's just silly.]
Anyway, in Perl 6 we're raising the standard for where we use punctuation, and where
we don't. We're getting rid of some of our punctuation that isn't really pulling its
weight, such as parentheses around conditional expressions, and most of the punctuational
variables. And we're making all the remaining punctuation work harder. Each symbol
has to justify its existence according to Huffman coding.

Oddly, there's one spot where we're introducing new punctuation. After your sigil
you can add a twigil, or secondary sigil. Just as a sigil tells you the basic structure
of an object, a twigil tells you that a particular variable has a weird scope. This
is basically an idea stolen from Ruby, which uses sigils to indicate weird scoping.
But by hiding our twigils after our sigils, we get the best of both worlds, plus an
extensible twigil system for weird scopes we haven't thought of yet.

[Did he just say "twigil"? As in, this is intended to be a serious term? As in,
Perl wasn't symbol-heavy enough, so now they're adding twigils that will hide after
sigils, with maybe forgils and fivegils to come in Perl 7 and 8, respectively?]

We think about extensibility a lot. We think about languages we don't know how to
think about yet. But leaving spaces in the grammar for new languages is kind of like
reserving some of our land for national parks and national forests. Or like an archaeologist
not digging up half the archaeological site because we know our descendants will have
even better analytical tools than we have.

[Or it's just YAGNI, Larry. Look, if your language wants to have syntactic macros--which is really the only way to have language extensibility without having to rewrite your parser and lexer and AST code every n years--then build in syntactic macros; but really, now you're just emulating LISP, that same language you said wasn't for mere mortals, waaaay back there up at the top.]

Really designing a language for the future involves a great deal of humility. As with
science, you have to assume that, over the long term, a great deal of what you think
is true will turn out not to be quite the case. On the other hand, if you don't make
your best guess now, you're not really doing science either. In retrospect, we know
APL had too many strange symbols. But we wouldn't be as sure about that if APL hadn't
tried it first.

[So go experiment with something that doesn't have billions of lines of code scattered
all across the planet. That's what everybody else does. Witness Gregor Kiczales' efforts
with AspectJ: he didn't go and modify Java proper, he experimented with a new language
to see what AOP constructs would fit. And he never proposed AspectJ as a JSR to modify
core Java. Not because he didn't want to, mind you--I know that this was actively
discussed. But I also know that he was waiting to see what a large-scale AOP system
looked like, so we could find the warts and fix them. The fact that he never opened
an AspectJ JSR suggests to me that said large-scale AOP system never materialized.]

compile time / run time
Many dynamic languages can eval code at run time. Perl also takes it the other direction
and runs a lot of code at compile time. This can get messy with operational definitions.
You don't want to be doing much file I/O in your BEGIN blocks, for instance.
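[Python blurs the same line, for what it's worth--a sketch of mine: module-level code runs at import time, much like a BEGIN block, and eval/exec compile new code at run time:]

    print("this runs at import time, BEGIN-style")

    source = "x * 2"
    code = compile(source, "<generated>", "eval")  # compiled at run time
    print(eval(code, {"x": 21}))                   # 42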
But that leads us to another distinction:

declarational / operational
Most scripting languages are way over there on the operational side. I thought Perl
5 had an oversimplified object system till I saw Lua. In Lua, an object is just a
hash, and there's a bit of syntactic sugar to call a hash element if it happens to
contain code. That's all there is. [Dude, it's the same with JavaScript/ECMAScript. And a few other languages, besides.] They don't even have classes. Anything resembling
inheritance has to be handled by explicit delegation. That's a choice the designers
of Lua made to keep the language very small and embeddable. For them, maybe it's the
right choice.
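[Since it's the same trick in JavaScript, here's that whole "object system" sketched in a dozen lines of Python--my names (send, __parent__) are invented for illustration:]

    # An "object" is a dict, a "method" is a dict entry holding code,
    # and inheritance is explicit delegation to another dict.
    def send(obj, name, *args):
        target = obj
        while target is not None:
            if name in target:
                return target[name](obj, *args)   # receiver stays obj
            target = target.get("__parent__")     # explicit delegation
        raise AttributeError(name)

    base = {"greet": lambda self, who: "hello, " + who}
    child = {"__parent__": base}

    print(send(child, "greet", "world"))  # hello, world -- found via delegation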
Perl 5 has always been a bit more declarational than either Python or Ruby. I've always felt strongly that implicit scoping was just asking for trouble, and that scoped variable
declarations should be very easy to recognize visually. That's why we have my. It's short because I knew we'd use it frequently. Huffman coding. Keep common things short, but not too short. In this case, 0 is too short.

Perl 6 has more different kinds of scopes, so we'll have more declarators like my and our.
But appearances can be deceiving. While the language looks more declarative on the
surface, we make most of the declarations operationally hookable underneath to retain
flexibility. When you declare the type of a variable, for instance, you're really
just doing a kind of tie, in Perl 5 terms. The main difference is that you're tying
the implementation to the variable at compile time rather than run time, which makes
things more efficient, or at least potentially optimizable.

[The whole declarational vs. operational point here seems more about type systems
than the style of code; in a classless system, a la JavaScript/ECMAScript, objects
are just objects, and you can mess with them at runtime as much as you wish. How you
define the statements that use them, on the other hand, is another axis of interest
entirely. For example, SQL is a declarational language, really more functional in
nature (since functional languages tend to be declarational as well), since the interpreter
is free to tackle the statement in any sub-clause it wishes, rather than having to start from the beginning and parse rightward. There are definitely greater distinctions waiting to be made here, IMHO, since there's still a lot of fuzziness in the taxonomy.]

immutable classes / mutable classes
Classes in Java are closed, which is one of the reasons Java can run pretty fast.
In contrast, Ruby's classes are open, which means you can add new things to them at
any time. Keeping that option open is perhaps one of the reasons Ruby runs so slow.
But that flexibility is also why Ruby has Rails. [Except that Ruby now compiles
to the JVM, and fully supports open classes there, and runs a lot faster than the
traditional Ruby interpreter, which means that either the mutability of classes has
nothing to do with the performance of a virtual machine, or else the guys working
on the traditional Ruby interpreter are just morons compared to the guys working on
Java. Since I don't believe the latter, I believe that the JVM has some intrinsic
engineering in it that the Ruby interpreter could have--given enough time and effort--but
simply doesn't have yet. Frankly, from having spelunked the CLR, there's really nothing
structurally restricting the CLR from having open classes, either, so long as the
semantics of modifying a class structure in memory were well understood: concurrency
issues, outstanding objects, changes in method execution semantics, and so on.]
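[Open classes are trivially demonstrable in Python, which has no structural bar to them either--my sketch:]

    class Invoice:
        def __init__(self, total):
            self.total = total

    def with_tax(self, rate):
        return self.total * (1 + rate)

    Invoice.with_tax = with_tax        # the class reopens happily at run time

    print(Invoice(100).with_tax(0.25)) # 125.0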
Perl 6 will have an interesting mix of immutable generics and mutable classes here, and interesting policies on who is allowed to close classes when. Classes are never
allowed to close or finalize themselves, for instance. Sorry, for some reason I keep
talking about Perl 6. It could have something to do with the fact that we've had to
think about all of these dimensions in designing Perl 6.

class-based / prototype-based
Here's another dimension that can open up to allow both approaches. Some of you may
be familiar with classless languages like Self or JavaScript. Instead of classes,
objects just clone from their ancestors or delegate to other objects. For many kinds
of modeling, it's actually closer to the way the real world works. Real organisms
just copy their DNA when they reproduce. They don't have some DNA of their own, and
an @ISA array telling you which parent objects contain the rest of their DNA.

[I get nervous whenever people start drawing analogies and then pursuing them
too strongly. Yes, this is how living organisms replicate... but we're not designing
living organisms. A model is just supposed to represent a part of reality, not try
to recreate reality itself. Having said that, though, there's definitely a lot to
be said for classless languages (which don't necessarily have to be prototype-based,
by the way, though it makes sense for them to be). Again, what I think makes the most
sense here is a middle-of-the-road scenario combined with open classes. Objects belong
to classes, but fully support runtime reification of types.]
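[A prototype-flavored sketch in Python, mine again: no classes, just objects cloning other objects, which copies the "DNA" wholesale instead of recording an @ISA-style pointer to a parent:]

    import copy

    def clone(proto, **overrides):
        child = copy.deepcopy(proto)   # copy the DNA...
        child.update(overrides)        # ...then mutate it
        return child

    animal = {"legs": 4, "noise": "..."}
    dog = clone(animal, noise="Woof")
    puppy = clone(dog)

    print(dog["noise"], puppy["noise"])  # Woof Woof -- copied, not inherited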
The meta-object protocol for Perl 6 defaults to class-based, but is flexible enough to set up prototype-based objects as well. Some of you have played around with Moose in
Perl 5. Moose is essentially a prototype of Perl 6's object model. On a semantic level,
anyway. The syntax is a little different. Hopefully a little more natural in Perl
6.

passive data, global consistency / active data, local consistency
Your view of data and control will vary with how functional or object-oriented your
brain is. People just think differently. Some people think mathematically, in terms
of provable universal truths. Functional programmers don't much care if they strew
implicit computation state throughout the stack and heap, as long as everything looks pure
and free from side-effects.

Other people think socially, in terms of cooperating entities that each have their
own free will. And it's pretty important to them that the state of the computation
be stored with each individual object, not off in some heap of continuations somewhere.

Of course, some of us can't make up our minds whether we'd rather emulate the logical
Sherlock Holmes or sociable Dr. Watson. Fortunately, scripting is not incompatible
with either of these approaches, because both approaches can be made more approachable
to normal folk.[Or, don't choose at all, but combine as you need to, a la Scala or F#. By the
way, objects are not "free willed" entities--they are intrinsically passive entities,
waiting to be called, unless you bind a thread into their execution model, which makes them "active objects", sometimes called "actors" (not to be confused with the Actors concurrency model, such as Scala uses). So let's not get too hog-wild with that "individual object/live free or die" meme, not unless you're going to differentiate between active objects and passive objects. Which, I think, is a valuable thing to differentiate on, FWIW.]
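[To be concrete about that active/passive distinction, here's a minimal active object in Python--my sketch, using nothing but a thread and a mailbox queue:]

    import queue
    import threading

    class Counter:
        # An active object: it owns a thread and drains a mailbox,
        # instead of passively waiting to be called.
        def __init__(self):
            self.mailbox = queue.Queue()
            self.count = 0
            self.thread = threading.Thread(target=self._run)
            self.thread.start()

        def _run(self):
            while True:
                if self.mailbox.get() == "stop":
                    return
                self.count += 1        # state lives with the object

    counter = Counter()
    for _ in range(3):
        counter.mailbox.put("bump")
    counter.mailbox.put("stop")
    counter.thread.join()              # wait for the actor to finish
    print(counter.count)               # 3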
info hiding / scoping / attachment
And finally, if you're designing a computer language, there are a couple bazillion
ways to encapsulate data. You have to decide which ones are important. What's the
best way to let the programmer achieve separation of concerns?

object / class / aspect / closure / module / template / trait
You can use any of these various traditional encapsulation mechanisms.
transaction / reaction / dynamic scope
Or you can isolate information to various time-based domains.
process / thread / device / environment
You can attach info to various OS concepts.
screen / window / panel / menu / icon
You can hide info various places in your GUI. Yeah, yeah, I know, everything is an
object. But some objects are more equal than others. [NO. Down this road lies
madness, at least at the language level. A given application might choose to, for
reasons of efficiency... but doing so is a local optimization, not something to consider
at the language level itself.]

syntactic scope / semantic scope / pragmatic scope
Information can attach to various abstractions of your program, including, bizarrely,
lexical scopes. Though if you think about it hard enough, you realize lexical scopes
are also a funny kind of dynamic scope, or recursion wouldn't work right. A state variable is actually more purely lexical than a my variable, because it's shared by all calls to that lexical scope. But even state variables get cloned with closures. Only global variables can be truly lexical, as long as you refer to them only in a given lexical scope. Go figure.
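[A loose Python analogy for that state-vs-my distinction--mine, not his: the captured count below persists across calls to one closure but is cloned per closure, while step is rebound fresh on every call:]

    def make_counter():
        count = 0                  # state-like: shared by all calls
        def bump():
            nonlocal count
            step = 1               # my-like: fresh on every call
            count += step
            return count
        return bump

    a = make_counter()
    b = make_counter()             # each closure gets its own count
    print(a(), a(), b())           # 1 2 1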
So really, most of our scopes are semantic scopes that happen to be attached to a particular syntactic scope.

[Or maybe scope is just scope.]
You may be wondering what I mean by a pragmatic scope. That's the scope of
what the user of the program is storing in their brain, or in some surrogate for their
brain, such as a game cartridge. In a sense, most of the web pages out there on the
Internet are part of the pragmatic scope. As is most of the data in databases. The
hallmark of the pragmatic scope is that you really don't know the lifetime of the
container. It's just out there somewhere, and will eventually be collected by that
Great Garbage Collector that collects all information that anyone forgets to remember.
The Google cache can only last so long. Eventually we will forget the meaning of every
URL. But we must not forget the principle of the URL. [This is weirdly
Zen, and either makes no sense at all, or has a scope (pardon the pun) far outside
of that of programming languages and is therefore rendered meaningless for this discussion,
or he means something entirely different from what I'm reading.] That leads us
to our next degree of freedom.

use Lingua::Perligata;
If you allow a language to mutate its own grammar within a lexical scope, how do you
keep track of that cleanly? Perl 5 discovered one really bad way to do it, namely
source filters, but even so we ended up with Perl dialects such as Perligata and Klingon.
What would it be like if we actually did it right?

[Can it even be done right? Lisp had a lot of success here with syntactic macros,
but I don't think they had scope attached to them the way Larry is trying to apply it here. Frankly, what comes to mind most of all here is the C/C++ preprocessor, and multiple nested definitions of macros. Yes, it can be done. It is incredibly ugly. Do not ask me to remember it again.]

Doing it right involves treating the evolution of the language as a pragmatic scope,
or as a set of pragmatic scopes. You have to be able to name your dialect, kind of
like a URL, so there needs to be a universal root language, and ways of warping that
universal root language into whatever dialect you like. This is actually near the
heart of the vision for Perl 6. We don't see Perl 6 as a single language, but as the
root for a family of related languages. As a family, there are shared cultural values
that can be passed back and forth among sibling languages as well as to the descendants.

I hope you're all scared stiff by all these degrees of freedom. I'm sure there are
other dimensions that are even scarier.

But... I think it's a manageable problem. I think it's possible to still think of Perl 6 as a scripting language, with easy onramps.

And the reason I think it's manageable is because, for each of these dimensions, it's
not just a binary decision, but a knob that can be positioned at design time, compile
time, or even run time. For a given dimension X, different scripting languages make
different choices, set the knob at different locations.
Somewhere in the universe, a budding programming language designer reads that last
paragraph, thinks to himself, I know! I'll create a language where the programmer
can set that knob wherever they want, even at runtime! Sort of like an "Option open_classes
on; Option dispatch single; Option meta-object-programming off;" thing....
And with any luck, somebody will kill him before he unleashes it on us all.
Meanwhile, I just sit back and wonder: All this from the guy who proudly claimed
that Perl never had a formal design to it whatsoever?