Smalltalk is an object-oriented, dynamically typed, reflective programming language. Smalltalk was created as the language to underpin the "new world" of computing exemplified by "human–computer symbiosis." It was designed and created in part for educational use, more so for constructionist learning, at the Learning Research Group (LRG) of Xerox PARC by Alan Kay, Dan Ingalls, Adele Goldberg, Ted Kaehler, Scott Wallace, and others during the 1970s.
The language was first generally released as Smalltalk-80. Smalltalk-like languages are in continuing active development and have gathered loyal communities of users around them. ANSI Smalltalk was ratified in 1998 and represents the standard version of Smalltalk.
As in other object-oriented languages, the central concept in Smalltalk-80 (but not in Smalltalk-72) is that of an object. An object is always an instance of a class. Classes are "blueprints" that describe the properties and behavior of their instances. For example, a GUI's window class might declare that windows have properties such as the label, the position and whether the window is visible or not. The class might also declare that instances support operations such as opening, closing, moving and hiding. Each particular window object would have its own values of those properties, and each of them would be able to perform operations defined by its class.
A Smalltalk object can do exactly three things:
- Hold state (references to other objects).
- Receive a message from itself or another object.
- In the course of processing a message, send messages to itself or another object.
The state an object holds is always private to that object. Other objects can query or change that state only by sending requests (messages) to the object to do so. Any message can be sent to any object: when a message is received, the receiver determines whether that message is appropriate. Alan Kay has commented that despite the attention given to objects, messaging is the most important concept in Smalltalk: "The big idea is 'messaging'—that is what the kernel of Smalltalk/Squeak is all about (and it's something that was never quite completed in our Xerox PARC phase)."
Smalltalk is a "pure" object-oriented programming language, meaning that, unlike Java and C++, there is no difference between values which are objects and values which are primitive types. In Smalltalk, primitive values such as integers, booleans and characters are also objects, in the sense that they are instances of corresponding classes, and operations on them are invoked by sending messages. A programmer can change or extend (through subclassing) the classes that implement primitive values, so that new behavior can be defined for their instances—for example, to implement new control structures—or even so that their existing behavior will be changed. This fact is summarized in the commonly heard phrase "In Smalltalk everything is an object", which may be more accurately expressed as "all values are objects", as variables are not.
Since all values are objects, classes themselves are also objects. Each class is an instance of the metaclass of that class. Metaclasses in turn are also objects, and are all instances of a class called Metaclass. Code blocks—Smalltalk's way of expressing anonymous functions—are also objects.
Smalltalk-80 syntax is rather minimalist, based on only a handful of declarations and reserved words. In fact, only six "keywords" are reserved in Smalltalk: true, false, nil, self, super, and thisContext. These are actually called pseudo-variables, identifiers that follow the rules for variable identifiers but denote bindings that the programmer cannot change. The true, false, and nil pseudo-variables are singleton instances. self and super refer to the receiver of a message within a method activated in response to that message, but sends to super are looked up in the superclass of the method's defining class rather than the class of the receiver, which allows methods in subclasses to invoke methods of the same name in superclasses. thisContext refers to the current activation record. The only built-in language constructs are message sends, assignment, method return and literal syntax for some objects. From its origins as a language for children of all ages, standard Smalltalk syntax uses punctuation in a manner more like English than mainstream coding languages. The remainder of the language, including control structures for conditional evaluation and iteration, is implemented on top of the built-in constructs by the standard Smalltalk class library. (For performance reasons, implementations may recognize and treat as special some of those messages; however, this is only an optimization and is not hardwired into the language syntax.)
The adage that "Smalltalk syntax fits on a postcard" refers to a code snippet by Ralph Johnson, demonstrating all the basic standard syntactic elements of methods:
exampleWithNumber: x | y | true & false not & (nil isNil) ifFalse: [self halt]. y := self size + super size. #($a #a "a" 1 1.0) do: [ :each | Transcript show: (each class name); show: ' ']. ^x < y
The following examples illustrate the most common objects which can be written as literal values in Smalltalk-80 methods.
The following list illustrates some of the possibilities.
42 -42 123.45 1.2345e2 2r10010010 16rA000
The last two entries are a binary and a hexadecimal number, respectively. The number before the 'r' is the radix or base. The base does not have to be a power of two; for example 36rSMALLTALK is a valid number equal to 80738163270632 decimal.
Characters are written by preceding them with a dollar sign:
Strings are sequences of characters enclosed in single quotes:
To include a quote in a string, escape it using a second quote:
'I said, ''Hello, world!'' to them.'
Double quotes do not need escaping, since single quotes delimit a string:
'I said, "Hello, world!" to them.'
Two equal strings (strings are equal if they contain all the same characters) can be different objects residing in different places in memory.
In addition to strings, Smalltalk has a class of character sequence objects called Symbol. Symbols are guaranteed to be unique—there can be no two equal symbols which are different objects. Because of that, symbols are very cheap to compare and are often used for language artifacts such as message selectors (see below).
Symbols are written as # followed by a string literal. For example:
If the sequence does not include whitespace or punctuation characters, this can also be written as:
#(1 2 3 4)
defines an array of four integers.
Many implementations support the following literal syntax for ByteArrays:
#[1 2 3 4]
defines a ByteArray of four integers.
And last but not least, blocks (anonymous function literals)
[... Some smalltalk code...]
A block of code (an anonymous function) can be expressed as a literal value (which is an object, since all values are objects.) This is achieved with square brackets:
[ :params | <message-expressions> ]
Where :params is the list of parameters the code can take. This means that the Smalltalk code:
[:x | x + 1]
[:x | x + 1] value: 3
can be evaluated as
f(3) = 3 + 1
The two kinds of variables commonly used in Smalltalk are instance variables and temporary variables. Other variables and related terminology depend on the particular implementation. For example, VisualWorks has class shared variables and namespace shared variables, while Squeak and many other implementations have class variables, pool variables and global variables.
Temporary variable declarations in Smalltalk are variables declared inside a method (see below). They are declared at the top of the method as names separated by spaces and enclosed by vertical bars. For example:
| index |
declares a temporary variable named index. Multiple variables may be declared within one set of bars:
| index vowels |
declares two variables: index and vowels.
A variable is assigned a value via the ':=' syntax. So:
vowels := 'aeiou'
Assigns the string 'aeiou' to the previously declared vowels variable. The string is an object (a sequence of characters between single quotes is the syntax for literal strings), created by the compiler at compile time.
In the original Parc Place image, the glyph of the underscore character (_) appeared as a left-facing arrow (like in the 1963 version of the ASCII code). Smalltalk originally accepted this left-arrow as the only assignment operator. Some modern code still contains what appear to be underscores acting as assignments, hearkening back to this original usage. Most modern Smalltalk implementations accept either the underscore or the colon-equals syntax.
The message is the most fundamental language construct in Smalltalk. Even control structures are implemented as message sends. Smalltalk adopts by default a synchronous, single dynamic message dispatch strategy (as contrasted to the asynchronous, multiple dispatch strategy adopted by some other object-oriented languages).
The following example sends the message 'factorial' to number 42:
In this situation 42 is called the message receiver, while 'factorial' is the message selector. The receiver responds to the message by returning a value (presumably in this case the factorial of 42). Among other things, the result of the message can be assigned to a variable:
aRatherBigNumber := 42 factorial
"factorial" above is what is called a unary message because only one object, the receiver, is involved. Messages can carry additional objects as arguments, as follows:
2 raisedTo: 4
In this expression two objects are involved: 2 as the receiver and 4 as the message argument. The message result, or in Smalltalk parlance, the answer is supposed to be 16. Such messages are called keyword messages. A message can have more arguments, using the following syntax:
'hello world' indexOf: $o startingAt: 6
which answers the index of character 'o' in the receiver string, starting the search from index 6. The selector of this message is "indexOf:startingAt:", consisting of two pieces, or keywords.
Such interleaving of keywords and arguments is meant to improve readability of code, since arguments are explained by their preceding keywords. For example, an expression to create a rectangle using a C++ or Java-like syntax might be written as:
new Rectangle(100, 200);
It's unclear which argument is which. By contrast, in Smalltalk, this code would be written as:
Rectangle width: 100 height: 200
The receiver in this case is "Rectangle", a class, and the answer will be a new instance of the class with the specified width and height.
Finally, most of the special (non-alphabetic) characters can be used as what are called binary messages. These allow mathematical and logical operators to be written in their traditional form:
3 + 4
which sends the message "+" to the receiver 3 with 4 passed as the argument (the answer of which will be 7). Similarly,
3 > 4
is the message ">" sent to 3 with argument 4 (the answer of which will be false).
Notice, that the Smalltalk-80 language itself does not imply the meaning of those operators. The outcome of the above is only defined by how the receiver of the message (in this case a Number instance) responds to messages "+" and ">".
A side effect of this mechanism is operator overloading. A message ">" can also be understood by other objects, allowing the use of expressions of the form "a > b" to compare them. Expressions
An expression can include multiple message sends. In this case expressions are parsed according to a simple order of precedence. Unary messages have the highest precedence, followed by binary messages, followed by keyword messages. For example:
3 factorial + 4 factorial between: 10 and: 100
is evaluated as follows:
3 receives the message "factorial" and answers 6 4 receives the message "factorial" and answers 24 6 receives the message "+" with 24 as the argument and answers 30 30 receives the message "between:and:" with 10 and 100 as arguments and answers true
The answer of the last message sent is the result of the entire expression.
Parentheses can alter the order of evaluation when needed. For example,
(3 factorial + 4) factorial between: 10 and: 100
will change the meaning so that the expression first computes "3 factorial + 4" yielding 10. That 10 then receives the second "factorial" message, yielding 3628800. 3628800 then receives "between:and:", answering false.
Note that because the meaning of binary messages is not hardwired into Smalltalk-80 syntax, all of them are considered to have equal precedence and are evaluated simply from left to right. Because of this, the meaning of Smalltalk expressions using binary messages can be different from their "traditional" interpretation:
3 + 4 * 5
is evaluated as "(3 + 4) * 5", producing 35. To obtain the expected answer of 23, parentheses must be used to explicitly define the order of operations:
3 + (4 * 5)
Unary messages can be chained by writing them one after another:
3 factorial factorial log
which sends "factorial" to 3, then "factorial" to the result (6), then "log" to the result (720), producing the result 2.85733.
A series of expressions can be written as in the following (hypothetical) example, each separated by a period. This example first creates a new instance of class Window, stores it in a variable, and then sends two messages to it.
| window | window := Window new. window label: 'Hello'. window open
If a series of messages are sent to the same receiver as in the example above, they can also be written as a cascade with individual messages separated by semicolons:
Window new label: 'Hello'; open
This rewrite of the earlier example as a single expression avoids the need to store the new window in a temporary variable. According to the usual precedence rules, the unary message "new" is sent first, and then "label:" and "open" are sent to the answer of "new".