- Chapter 2: pairwise alignment
- Chapter 3: hidden Markov models
- Chapter 7: phylogenetic trees
- Chapter 9: transformational grammars

The Dina Research School workshop pages: alignment algorithms and phylogeny using Phylip.

- A Perl implementation of algorithms for pairwise alignment with graphic display of the alignment (local/global and linear/affine gap costs).
- The algorithms for pairwise alignment as an applet that you can run in your local browser.
- The source code for the applet alignment algoritms (in Java).
- The additional applet source code for the alignment applet
- Some seminar slides presenting the four basic algorithms for pairwise alignment

- NWSimple
- Global alignment with simple linear gap costs, using the Needleman-Wunsch algorithm.
- SWSimple
- Local alignment with simple linear gap costs, using the Smith-Waterman algorithm.
- RMSimple
- Repeated matches with simple linear gap costs.
- OMSimple
- Overlap matches with simple linear gap costs.
- NWAffine
- Global alignment with affine gap costs, using the Needleman-Wunsch algorithm.
- NWSmart
- Global alignment with simple linear gap costs. Where all the
above implementations require space proportional to the
*product*of the sequence lengths, this one requires only space proportional to the*sum*of the sequence lengths. It reconstructs the full alignment using a recursive divide-and-conquer algorithm, calling NWSimple to solve the base cases (when one of the sequences have length 1). - SWSmart
- Local alignment with simple linear gap costs, using the smart linear space algorithm (as in NWSmart).
- AlignSmartAffine
- Global alignment with affine gap costs, using the smart linear space algorithm (as in NWAffine and NWSmart).
- SWSmartAffine
- Local alignment with affine gap costs, using the smart linear space algorithm (as in AlignSmartAffine and SWSmart).

The implementation is modular so that new scoring matrices can be used (currently it uses the BLOSUM50 scoring matrix) and new alignment algorithms can be developed with little effort.

Unless you are familiar with object-oriented programming languages, the modularity is likely to make the details of the program hard to understand. The use of a higher-order functional language, such as Standard ML, might have given a clearer and more concise program. Such languages provide better procedural abstraction than do object-oriented languages.

The `simple' and `affine' algorithms painstakingly build an array of Traceback objects while computing the optimal score. This provides for algorithmic clarity and simplifies the reconstruction of the optimal match. It may seem rather wasteful in terms of space, but eventually the `simple' and `affine' algorithms will be used only to handle the base cases when computing the traceback in a `smart' algorithm. This means that the total size of the traceback arrays allocated will be only linear in the sum of the sizes of the two sequences.

- The source code for the hidden Markov model (HMM) algorithms

- HMM
- The class of hidden Markov models. An HMM object includes state names, a matrix of transition probabilities, emission names, and a matrix of emission probabilities
- Viterbi
- Uses Viterbi decoding to compute the most probable path through an HMM for a for a given observed sequence.
- Forward
- Uses the so-called Forward algorithm on an HMM to compute the likelihood of a given observed sequence, and to build the forward table.
- Backward
- Uses the so-called Backward algorithm on an HMM to compute the likelihood of a given observed sequence, and to build the backward table.
- PosteriorProb
- Computes the posterior probability of states in a HMM from given Forward and Backward tables. As in part 3 of the casino example we computed the posterior probability of the die being fair: Click to see the graph
- baumwelch
- Uses the Baum-Welch algorithm to estimate the parameters of an HMM when given its structure (number of states and emissions) and given an observed sequence.

- The algorithms for phylogenetic trees as an applet that you can run in your local browser
- The source code for generating and drawing phylogenetic trees
- The additional applet source code for a graphical user interface to the above program

- UPCluster
- The class of rooted trees as constructed by the UPGMA algorithm
- UPGMA
- The UPGMA algorithm for constructing rooted phylogenetic trees. The algorithm takes as input the lower triangle of a symmetric distance matrix and contructs a rooted tree in the form of a UPCluster object. The algorithm does not check that the distance matrix satisfies the triangle inequality (i.e., defines a metric), nor that it is an ultrametric (which is a requirement for proper application of the UPGMA algorithm) or additive.
- NJCluster
- The class of rooted trees as constructed by the neighbour joining algorithm.
- NJ
- The neighbour joining algorithm for constructing phylogenetic trees. The algorithm takes as input the lower triangle of a symmetric distance matrix and contructs a rooted tree in the form of a NJCluster object. Really, the algorithm should construct an unrooted tree, but this version arbitrarily adds a root node with appropriate edges to the last two active leaves in the construction. The algorithm does not check that the distance matrix satisfies the triangle inequality (i.e., defines a metric), nor that it is an ultrametric or additive (which is a requirement for proper application of the neighbour joining algorithm).
- The demo applet
- This applet permits interactive experimentation with distance matrices and the resulting UPGMA and neighbour joining trees. The applet starts up with the example in Figure 7.7 on page 170 of Durbin et al. (for which the tree reconstructed by UPGMA is wrong, whereas that reconstructed by neighbour joining is correct). The applet permits interactive editing of the distance matrix, which may have arbitrary dimension. Pressing a button may fill the distance matrix with a random data set. The data set satisfies the triangle inequality -- in fact, is it a two-dimensional Euclidean distance -- but it may not define an ultrametric nor be additive.

- Some seminar slides presenting parts of the chapter on transformational grammars

Peter Sestoft (sestoft@itu.dk) 1999-10-01, 1999-12-19