Concurrent with, but mainly independent of, the design of the SML Basis Library has been work by the authors of the SML language to revise the language definition [CITE]. In addition to simplifying and clarifying certain aspects of the original definition, the revision includes modest changes to the language that affect the programmer's use of the language and that address issues raised during the design of the library. For example, the revised language supports character literals, which greatly extend the expressiveness of the library's character types.
This chapter discusses the most significant of these changes, at least from the library's viewpoint. In addition, it describes in passing the changes concerning imperative/weak types and structure sharing, and notes incompatibilities between the current library proposal and the initial basis described in the original definition. A complete and authoritative discussion of the language changes is given, of course, in the revised definition [CITE].
The new character type and the possibility of multiple implementations of the numeric types require addressing the issue of literals.
The revised definition extends the allowed escape sequences for characters to include:
\a      Alert (ASCII 0x07)
\b      Backspace (ASCII 0x08)
\v      Vertical tab (ASCII 0x0B)
\f      Form feed (ASCII 0x0C)
\r      Carriage return (ASCII 0x0D)
\uxxxx  The character whose encoding is the number xxxx consisting of four hexadecimal digits.
There is additional notation for character literals:
    #"c"
where c is any legal string representing a single character. This notation has the advantage that existing legal SML code will not be affected.
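As a small illustration of the character notation and the added escape sequences, the following sketch binds a few character literals and checks their ASCII codes. The names letter, newline, codes, bell and sameChar are hypothetical, chosen only for this example; Char.ord and String.sub are the Basis Library functions.

```sml
(* Character literals use the #"c" notation and have type char. *)
val letter  : char = #"a"
val newline : char = #"\n"

(* The added escapes denote the ASCII codes listed in the text. *)
val codes = map Char.ord [#"\a", #"\b", #"\v", #"\f", #"\r"]
(* codes = [7, 8, 11, 12, 13] *)

(* The same escapes may be used inside string literals. *)
val bell : string = "\a"
val sameChar = (String.sub (bell, 0) = #"\a")   (* true *)
```

The \uxxxx form is left out of the sketch because the range of encodings an implementation accepts depends on the size of its char type.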
Hexadecimal integer constants are part of the revised language. Hexadecimal literals are written with a ``0x'' prefix followed by one or more hexadecimal digits, as in 0xFF.
The language supports word types, i.e., nonnegative integers with modular arithmetic corresponding to machine words. The revised definition provides decimal and hexadecimal word literals. Word literals will have a ``0w'' prefix; for example, 0wxFF. Word literals do not have a sign.
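The following sketch shows hexadecimal and word literals side by side. It assumes the ``0x'' prefix for hexadecimal integer literals and uses the Basis structure Word8 to make the modular arithmetic visible at a small word size; all value names are hypothetical.

```sml
val h : int  = 0xFF      (* hexadecimal integer literal, i.e. 255 *)
val w : word = 0wxFF     (* hexadecimal word literal *)
val d : word = 0w255     (* decimal word literal with the same value *)
val sameWord = (w = d)   (* true *)

(* Word arithmetic is modular: at 8 bits, 0wxFF + 0w1 wraps around to 0w0.
   The literals are resolved to Word8.word from the context. *)
val wrapped : Word8.word = Word8.+ (0wxFF, 0w1)
val wrappedIsZero = (wrapped = 0w0)   (* true *)
```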
The specification of real literals has been relaxed to allow either `E' or `e' for the exponent.
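As a brief sketch of the relaxed notation, the two bindings below write the same real number with an upper-case and a lower-case exponent marker. The names upper, lower and sameReal are hypothetical; Real.== is used for the comparison because real is not an equality type in the revised language.

```sml
val upper = 1.5E3     (* exponent written with E *)
val lower = 1.5e3     (* exponent written with e *)

(* Both literals denote 1500.0. *)
val sameReal = Real.== (upper, lower)
```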
With the possibility of multiple representations of the basic types in a given implementation (e.g., LargeInt), it is convenient to be able to resolve literals to various specific types without the programmer having to supply specific type information. The revised definition specifies that literals are viewed as overloaded symbols that, in the absence of additional type information, are given a default representation. Thus, the top-level binding
    val x = 1
would give x the type int, while the bindings
    val x = (1 : LargeInt.int)
    val x : LargeInt.int = 1
would both give x the type LargeInt.int. In addition, if f has type LargeInt.int -> unit, the expression
    f 1
would typecheck.
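The bindings discussed above can be collected into one runnable sketch. The names y, z, sum and the body of f are hypothetical additions for illustration; LargeInt is the Basis Library structure.

```sml
val x = 1                          (* no other constraint: x has type int *)
val y = (1 : LargeInt.int)         (* constraint on the expression *)
val z : LargeInt.int = 1           (* constraint on the pattern *)

(* A function whose argument type fixes how the literal is resolved. *)
fun f (n : LargeInt.int) : unit = ()
val () = f 1                       (* 1 is resolved to LargeInt.int *)

(* The literal 99 takes its type from y. *)
val sum = y + z + 99
```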
In general, without additional implicit or explicit type constraints, integer literals default to type int, word literals become type word, real literals become type real, string literals become type string, and character literals become type char.
Note that, after overload resolution has determined a specific representation, literals out of range of that representation should be detected at compile time.
In addition to overloaded literals, the revised language continues to allow overloading on a restricted set of identifiers. These identifiers include the standard arithmetic and relational operators. A complete list is given in Chapter 3. As with literals, the value identifiers have a default type that is adopted in lieu of any type information supplied by the surrounding context. All overloaded value identifiers default to an int-based type except for the operator /, whose default type is real * real -> real. Thus, the following code would typecheck:
    fun f(x,y) = x <= y
    val x = (1 : LargeInt.int)
    val y = x + 1
    fun g x = x + x before ignore (x + 0w0)
with f, y and g having types int * int -> bool, LargeInt.int and word -> word, respectively.
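A further sketch of the defaulting rules for overloaded identifiers follows. The function names double, ratio, doubleW and less are hypothetical; the type annotations on the val bindings only confirm the types that defaulting produces.

```sml
fun double x = x + x              (* defaults to int -> int *)
fun ratio (x, y) = x / y          (* / defaults to real * real -> real *)
fun doubleW (x : word) = x + x    (* the annotation selects word -> word *)
fun less (a, b) = a < b           (* defaults to int * int -> bool *)

val four    : int  = double 2
val quarter : real = ratio (1.0, 4.0)
val eight   : word = doubleW 0w4
val yes     : bool = less (1, 2)
```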
As is well known, imperative features, such as ref, and polymorphism cannot be combined naively without compromising type safety. Attempts to deal with this problem, using imperative type variables or weak types, have proven unsatisfactory, both because they are complex and unintuitive, and because they violate abstraction by exposing the pure or imperative nature of a computation in its type.
The revised definition of SML adopts value polymorphism to solve this problem. Specifically, in the expression
    let val x = e in e' end
x is given a polymorphic type only if e is a syntactic value, i.e., e is a constant, a variable, a lambda expression, or a record, tuple or non-ref datatype value whose component parts are all syntactic values. This solution is not upward-compatible, in that certain expressions that are valid in SML will no longer type check. However, there is evidence that this solution is quite viable in practice. Most SML programs already restrict polymorphism to values and in most cases where non-value polymorphism is used, value polymorphism can be introduced by a small syntactic change. Given the enormous simplification this change effects, value polymorphism seems like the right solution.
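The sketch below illustrates the value restriction and one common form of the small syntactic change mentioned above, namely eta-expansion. The names id, used, cell, mapId and both are hypothetical.

```sml
(* A lambda expression is a syntactic value, so id is polymorphic. *)
val id = fn x => x
val used = (id 1, id "one")

(* The application ref [] is not a syntactic value, so its type is not
   generalized; here an annotation fixes a single monomorphic type. *)
val cell : int list ref = ref []
val () = cell := [1, 2, 3]

(* List.map id is also an application and hence not a value. Writing it
   as a function (eta-expansion) makes the binding a syntactic value
   again, so mapId is polymorphic. *)
fun mapId xs = List.map id xs
val both = (mapId [1, 2], mapId ["a"])
```

Writing the last binding as val mapId = List.map id would leave its type ungeneralized, so it could not be used at both int list and string list.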
The original definition specified a very restrictive meaning to structure sharing. While retaining the original definition of type sharing, the revised definition reinterprets structure sharing as an abbreviation for a collection of type sharing specifications on the common type names among the specified structures.
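As a sketch of the reinterpreted structure sharing, the signature TWO below states sharing A = B, which under the revised definition stands for the type sharing specification sharing type A.t = B.t. All signature, structure and functor names are hypothetical.

```sml
signature ELEM = sig
  type t
  val show : t -> string
end

signature TWO = sig
  structure A : ELEM
  structure B : ELEM
  sharing A = B        (* abbreviates: sharing type A.t = B.t *)
end

structure IntElem : ELEM = struct
  type t = int
  val show = Int.toString
end

structure Both : TWO = struct
  structure A = IntElem
  structure B = IntElem
end

(* Inside the functor, X.B.show may be applied to a value of type X.A.t
   only because the sharing specification equates the two types. *)
functor Mix (X : TWO) = struct
  fun showBoth (v : X.A.t) = X.A.show v ^ "/" ^ X.B.show v
end

structure M = Mix (Both)
val s = M.showBoth 3
```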
Previously, types could occur in signatures only as a simple name or as a datatype definition. Although there are technical reasons for this decision, in practice this is too restrictive. In the future, type abbreviations can occur in signatures as well as structures. There is also a where type notation, which allows a programmer to extend a signature by adding definitions for its type components.
To increase abstraction, it will be possible to match structures against signatures such that, unless the signature specifies the definition of a type as a datatype or a type abbreviation, the representation of the type is hidden outside of the structure.
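The signature features described above can be sketched as follows: a type abbreviation inside a signature, a where type clause that adds a definition for a type component, and a signature match that hides a type's representation, which SML'97 writes with :>. All signature, structure and value names here are hypothetical.

```sml
(* A type abbreviation may appear directly in a signature. *)
signature POINT = sig
  type point = int * int
  val origin : point
end

structure Point : POINT = struct
  type point = int * int
  val origin = (0, 0)
end

signature COUNTER = sig
  type t
  val zero  : t
  val next  : t -> t
  val toInt : t -> int
end

structure CounterImpl = struct
  type t = int
  val zero = 0
  fun next n = n + 1
  fun toInt n = n
end

(* The signature does not define t, so its representation is hidden. *)
structure Counter :> COUNTER = CounterImpl

(* where type adds the definition t = int, so it remains visible. *)
structure OpenCounter :> COUNTER where type t = int = CounterImpl

val two   = Counter.toInt (Counter.next (Counter.next Counter.zero))
val three = OpenCounter.next 2    (* accepted because OpenCounter.t = int *)
```

By contrast, Counter.next 2 would be rejected, since outside the structure Counter.t is not known to be int.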
The boolean constructors true and false, the list constructors nil and ::, and the reference constructor ref are treated specially. They are bound at top-level in the initial environment as datatype constructors, and cannot be rebound. Effectively, this makes them additional keywords, though technically they could be used as names for types, signatures, structures or functors. Note, in addition, that the bool and list types are defined at top-level and not in any module.
The SML Basis Library is largely a conservative extension of the basis described in the original definition, but there are a few points of incompatibility worth noting:
Further information on the differences between the two bases can be found in the SML90 structure.
Last Modified February 20, 1997
Comments to John Reppy.
Copyright © 1997 Bell Labs, Lucent Technologies