ML Kit

The ML Kit with Regions

Version 3


This is: http://www.itu.dk/research/mlkit/kit3/readme.html


The ML Kit with Regions, Version 3

The ML Kit with Regions is a compiler for the programming language Standard ML. It is available in two variants: one that generates native code for the HP PA-RISC architecture and one that generates ANSI C code.

Main novelties:

New Version

This page describes the ML Kit with Regions, Version 3, henceforth referred to as the Kit.

SML 97

The Kit now covers all of Standard ML, as defined in the 1997 edition of the Definition of Standard ML.

Modules

In particular, the Kit is now able to compile Modules, using a new compilation scheme that we call static interpretation. Static interpretation comes with a system for "smart recompilation".

Basis Library

Also, the Kit now supports (most of) the Standard ML Basis Library.

Compiles Large Programs

The largest program compiled with this version is the ML Kit itself (around 80,000 lines of SML, plus the Basis Library). The second largest is Hafnium's AnnoDomini, which is around 58,000 lines of SML.

Other features of the ML Kit are:

Region-Based Memory Management

All memory allocation directives (both allocation and de-allocation) are inferred by the compiler, which uses a number of program analyses concerning lifetimes and storage layout. There is no pointer-tracing garbage collector in the Kit. (The ML Kit is unique among ML implementations in this respect.)
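To give a feel for what region-based allocation means at runtime, here is a minimal sketch in C. It is not the Kit's runtime system: the names `Region`, `Page`, `region_alloc` and `region_free`, and the page size, are assumptions made for illustration only. The point is that values are bump-allocated into a region, and that the whole region is released in one step at the point the compiler has inferred, with no tracing of pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch, NOT the Kit's actual runtime. A region is a linked
   list of fixed-size pages; values are bump-allocated into the newest page. */
#define PAGE_WORDS 256

typedef struct Page {
    struct Page *next;        /* older page of the same region */
    size_t used;              /* words already handed out from this page */
    long words[PAGE_WORDS];
} Page;

typedef struct Region {
    Page *pages;              /* newest page first */
    size_t page_count;
} Region;

void region_init(Region *r) {
    r->pages = NULL;
    r->page_count = 0;
}

/* Allocate n words in region r; returns NULL if the system is out of memory. */
long *region_alloc(Region *r, size_t n) {
    assert(n > 0 && n <= PAGE_WORDS);
    if (r->pages == NULL || r->pages->used + n > PAGE_WORDS) {
        Page *p = malloc(sizeof *p);
        if (p == NULL) return NULL;
        p->next = r->pages;
        p->used = 0;
        r->pages = p;
        r->page_count++;
    }
    long *cell = &r->pages->words[r->pages->used];
    r->pages->used += n;
    return cell;
}

/* Release every value in the region at once; nothing is traced or scanned. */
void region_free(Region *r) {
    while (r->pages != NULL) {
        Page *p = r->pages;
        r->pages = p->next;
        free(p);
    }
    r->page_count = 0;
}
```

In this sketch, compiled code would call `region_init` where a region comes into scope, `region_alloc` for each value placed in it, and `region_free` where the region goes out of scope. The cost of de-allocation depends on the number of pages, not on the number of live values.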

Memory Safety

Safety of de-allocation of memory is ensured by the compiler. Region Inference is sometimes able to de-allocate memory into which there are still pointers, but only in cases where these pointers will never be dereferenced.

Static detection of space leaks

The Kit issues warnings when Region Inference reveals a possible space leak.

Region Profiling

The system includes a graphical region profiler, which helps gain detailed control over memory use.

Good for Real-Time

Programmers who are interested in real-time programming can exploit the absence of garbage collection: there are no interruptions of unbounded duration at runtime.

Interface to C

ML Kit applications can call C functions using standard C calling conventions; the region scheme can even take care of allocating and de-allocating regions used by C functions thus invoked.
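As a rough illustration of the division of labour described above, the following C function fills in a pair whose storage is supplied by the caller. The function name `make_pair` and the two-word layout are hypothetical and are not taken from the Kit's C interface; the User's Guide documents the real conventions. The sketch only assumes that the calling code has already allocated the cell in a region, so that the C function neither allocates nor frees anything itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example, NOT part of the Kit's runtime or C interface.
   `cell` stands in for two words that the calling (compiled ML) code has
   already allocated in some region; the C function only fills them in. */
long *make_pair(long *cell, long fst, long snd) {
    assert(cell != NULL);
    cell[0] = fst;
    cell[1] = snd;
    return cell;  /* the result lives exactly as long as the caller's region */
}
```

Because the storage belongs to the caller's region, the result is reclaimed together with that region and the C code needs no `free`.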

The ML Kit: History, Goals and Approach

The development of the ML Kit began in 1989 at Edinburgh University. Originally, the project had two purposes:

  To provide an implementation that is consistent with the language definition;

  To provide a service to the research community by providing a highly modular system, parts of which can be reused by other compiler writers.

These goals are still intact. Specifically, in order to facilitate code re-use and reliability of the front-end of the Kit, we have

  re-written the Match compiler (using the same approach as Moscow ML uses); and

  cleaned up the implementation of the static semantics of the Core language (the StatObject module).

Since the work on region-based memory management started in the Kit (in 1994), goals specific to region-based memory management have been added:

  To provide an implementation that could provide ML programmers with a high degree of control over memory resources;

  To push the compilation technology for regions to the point where programs could be compiled at a tolerable speed and result in fast-running programs.

Version 2 of the Kit did provide good control over memory resources for Core ML programs, but it did not compile all of Standard ML and compilation was very slow. Version 3 compiles all of SML, and considerable effort has been devoted to tuning the system. Compiling AnnoDomini currently takes about 1.5 hours on an HP 9000 S700, so we cannot boast of fast compilation. However, the general approach we take in the Kit is to get the functionality right first and then gradually replace inefficient data structures and algorithms with better ones. A nice side effect of this strategy is that the Kit contains more and more re-usable modules that implement classical data structures and algorithms from the literature. Examples include sorting, union-find, Patricia trees and directed graphs (strongly connected components, etc.).

Current work involves experimentation with combinations of garbage collection and region inference, plus the development of a native code backend for Intel's x86 architecture.

What does the Distribution Contain?

  Revised and extended User's Guide: Tofte, Birkedal, Elsman, Hallenberg, Højfeld Olesen, Sestoft and Bertelsen: Programming with Regions in the ML Kit.

  Source code to the Kit, including runtime system, profiling tools, and smart recompilation of modules.

  Installation instructions.

Download Distribution

Support and Bugs

If you find a bug, please report it via email to Niels Hallenberg or Mads Tofte, including a (preferably small) test program which demonstrates the problem.

Acknowledgments

The ML Kit with Regions is a software deliverable of the DART research project, which is sponsored by the Danish Research Council for Natural Sciences.

Maintained by tofte@itu.dk / Last revised: December 14, 1999.