
The ML Kit with Regions
Version 3
This is: http://www.itu.dk/research/mlkit/kit3/readme.html
The ML Kit with Regions is a compiler for the programming
language Standard ML. The ML Kit with Regions
is available in two variants:
one that generates native code for the
HP PA-RISC architecture and one that
generates ANSI C code.
Main novelties:

New Version: This page describes The ML Kit with Regions Version 3, henceforth referred to as the Kit.

SML 97: The Kit now covers all of Standard ML, as defined in the 1997 edition of the Definition of Standard ML.

Modules: In particular, the Kit is now able to compile Modules, using a new compilation scheme, which we call static interpretation. Static interpretation comes with a system for "smart recompilation".

Basis Library: The Kit now also supports (most of) the Standard ML Basis Library.

Compiles Large Programs: The largest program compiled with this version is the ML Kit itself (around 80,000 lines of SML, plus the Basis Library). The second largest is Hafnium's AnnoDomini, which is around 58,000 lines of SML.

Other features of the ML Kit are:

Region-Based Memory Management: All memory allocation directives (both allocation and de-allocation) are inferred by the compiler, which uses a number of program analyses concerning lifetimes and storage layout. There is no pointer-tracing garbage collector in the Kit. (The ML Kit is unique among ML implementations in this respect.)

Memory Safety: Safety of de-allocation of memory is ensured by the compiler. Region Inference is sometimes able to de-allocate memory into which there are still pointers but only in cases where these pointers will never be dereferenced.

Static Detection of Space Leaks: The Kit issues warnings when Region Inference reveals a possible space leak.

Region Profiling: The system includes a graphical region profiler, which helps gain detailed control over memory use.

Good for Real-Time: Programmers who are interested in real-time programming can exploit the absence of garbage collection: there are no interruptions of unbounded duration at runtime.

Interface to C: ML Kit applications can call C functions using standard C calling conventions; the region scheme can even take care of allocating and de-allocating regions used by C functions thus invoked.

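To give a rough feel for what region-based allocation means at runtime, here is a minimal C sketch of a stack of regions. It is an illustration only and not the Kit's runtime system: the names Region, Page, region_push, region_alloc, region_pop and region_demo, as well as the page size, are hypothetical. In the Kit the points where regions are created and released are inferred by the compiler; in this sketch they are written by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch of a region stack; NOT the ML Kit runtime. */

#define PAGE_BYTES 1024  /* assumed page size, chosen for illustration */

typedef struct Page {
    struct Page *next;                                  /* older page of the same region */
    size_t used;                                        /* bytes handed out from data */
    _Alignas(max_align_t) unsigned char data[PAGE_BYTES];
} Page;

typedef struct Region {
    Page *pages;            /* pages owned by this region, newest first */
    struct Region *below;   /* next region down the region stack */
} Region;

static Region *region_stack = NULL;

/* Push a fresh, empty region on top of the region stack. */
Region *region_push(void) {
    Region *r = malloc(sizeof *r);
    assert(r != NULL);
    r->pages = NULL;
    r->below = region_stack;
    region_stack = r;
    return r;
}

/* Allocate n bytes inside region r; objects are never freed one by one. */
void *region_alloc(Region *r, size_t n) {
    size_t align = _Alignof(max_align_t);
    n = (n + align - 1) / align * align;   /* round up to keep objects aligned */
    assert(n <= PAGE_BYTES);
    if (r->pages == NULL || r->pages->used + n > PAGE_BYTES) {
        Page *p = malloc(sizeof *p);       /* current page is full: add a new one */
        assert(p != NULL);
        p->used = 0;
        p->next = r->pages;
        r->pages = p;
    }
    void *obj = r->pages->data + r->pages->used;
    r->pages->used += n;
    return obj;
}

/* Pop the topmost region, releasing all of its pages at once.
   Returns the number of pages released. */
size_t region_pop(void) {
    Region *r = region_stack;
    assert(r != NULL);
    size_t released = 0;
    for (Page *p = r->pages; p != NULL; ) {
        Page *next = p->next;
        free(p);
        p = next;
        released++;
    }
    region_stack = r->below;
    free(r);
    return released;
}

typedef struct Cell { int head; struct Cell *tail; } Cell;

/* Build the list 99, 98, ..., 0 in a local region, sum it, release the region. */
int region_demo(void) {
    Region *r = region_push();
    Cell *list = NULL;
    for (int i = 0; i < 100; i++) {
        Cell *c = region_alloc(r, sizeof *c);
        c->head = i;
        c->tail = list;
        list = c;
    }
    int sum = 0;
    for (Cell *c = list; c != NULL; c = c->tail)
        sum += c->head;
    region_pop();   /* the list is dead from here on; only the integer escapes */
    return sum;
}
```

Calling region_demo() from a main function returns 4950, the sum of 0 through 99. The list cells are never traced or freed individually; they disappear together when the region is popped, which is why release takes time proportional to the number of pages rather than to the number of live objects.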
The ML Kit: History, Goals and Approach
The development of the ML Kit began in 1989 at Edinburgh University.
Originally, the project had two purposes:
To provide an implementation
that is consistent with the language definition;
To provide a service
to the research community by providing a highly modular system, parts
of which can be reused by other compiler writers.
These goals are still intact. Specifically, to facilitate code re-use
and improve the reliability of the front-end of the Kit, we have
rewritten the match compiler (using the same approach as Moscow ML)
and cleaned up the implementation of the static semantics of the
Core language (the StatObject module).
Since the work on region-based memory management
started in the Kit (in 1994), goals specific to region-based memory management
have been added:
To provide an implementation that could provide ML
programmers with a high degree of control over
memory resources;
To push the compilation
technology for regions to the point where programs could be compiled
at a tolerable speed and result in fast-running programs.
Version 2 of the Kit did provide good control over memory resources
for Core ML programs, but it did not compile all of Standard ML and
the compilation was very slow. Version 3 compiles all of SML and
considerable effort has been devoted to tuning the system. Compiling
AnnoDomini currently takes about 1.5 hours on an HP 9000 S700, so we
cannot boast of fast compilation. However, the general approach we
take in the Kit is to try to get the functionality right first and
then gradually replace inefficient data structures and algorithms with
better ones. A nice side-effect of this strategy is that the Kit
contains more and more re-usable modules that implement classical
data structures and algorithms from the literature. Examples include
sorting, union-find, Patricia trees and directed graphs (strongly
connected components, etc.).
Current work involves experimentation with combinations of garbage
collection and region inference, plus the development of a native code
backend for Intel's x86 architecture.
What does the Distribution Contain?
Revised and extended User's Guide: Tofte,
Birkedal, Elsman, Hallenberg, Højfeld Olesen, Sestoft and
Bertelsen: Programming with Regions in the ML Kit.
Source code
to the Kit, including runtime system, profiling tools, and smart
recompilation of modules.
Installation instructions
Support and Bugs
If you find a bug, please report it via email
to Niels Hallenberg
or Mads Tofte, including a (preferably
small) test program which demonstrates the problem.
Acknowledgments
The ML Kit with Regions is a software deliverable of the DART research project, which is sponsored by the Danish Research Council for Natural Sciences.

Maintained by tofte@itu.dk / Last revised: December 14, 1999.