IT University of Copenhagen

Project and thesis proposals in algorithms

This page contains some proposals for projects and theses in the area of algorithms and data structures. The projects range from 4-week projects that are suitable for students who have taken an introductory programming course to thesis projects for students with at least one advanced algorithms course, including Advanced Database Technology.

For most proposals it is possible to emphasize either theoretical or practical aspects. The proposals are only sketched - contact Rasmus Pagh (pagh@it-c.dk) or Anna ÷stlin Pagh (annao@itu.dk) for more information, or the person(s) responsible for the project. Variants of the projects or other proposals are welcome.

We recommend project groups of at least two, preferably three or four, members, as the amount of supervision allocated by ITU to singleton groups is rather small. Thesis projects, though, are often singleton groups.

4-week projects offered in December 2005

Search Engine Project

7.5 ECTS project on implementing a high-performance search-engine (think Google).

The project is suitable for students at almost any level and of almost any ambition; the only requirement is basic programming skills, as acquired in one of the courses GP, IPBR, OOP and/or IADS. Simply by choosing the right ambitions, the beginning programmer will find an opportunity to implement a manageable yet real program; whereas the advanced student will find use for all of his/her knowledge and ingenuity.

If you are interested in this project, contact the teachers Esben Rune Hansen esben@itu.dk or Milan Ruzic milan@itu.dk, or have a look at the project homepage where much more information can be found: http://www.itu.dk/research/theory/SearchEngine/E2005/

Hex game project

The game Hex is a board game for two players, who take turns to place a red, respectively, a blue piece on a board consisting of hexagonals. The goal for player red is to build a red path from left to right, and the goal for player blue is to build a blue path from top to bottom.

The goal of this project is to implement this game, using tools from Introduction to algorithms and data structures (IADS). In this project there will be possibilities to use, e.g., search algorithms and graph algorithms.

Contact Esben Rune Hansen esben@itu.dk or Milan Ruzic milan@itu.dk if you are interested in this project.

Fun with asymptotics through hashing

The aim of this project is to implement several simple hashing algorithms, some of which have been proposed in very recent research, and make an experimental evaluation. In particular, the experiments should shed light on the asymptotic behavior when the hash table begins to fill up.

Contact Rasmus Pagh (pagh@itu.dk) if you are interested in this project.

Thesis project proposals

A thesis project in algorithms can range from theoretical to experimental, and in many cases it is both. A typical project is to look at a piece of new algorithmic research, understand it, implement (part of) it, and value its practical performance, e.g., by comparing to previous soultions.

To do an algorithmic thesis project an introductory course in algorithms and data structures (e.g. IADS) is required and an advanced course in algorithms is recommended.

Below you find a list of suggestions for thesis projects. You are also wellcome to come up with your own suggestions. Contact one of the avaliable supervisors and we can meet and talk about it.

Supervisors

Computing joins in databases

The join operation of relational algebra is a cornerstone of relational database systems. While computing the join of two relations is well-understood, there are no good methods in general for joining many relations. This project aims at investigating the weaknesses of current methods (exhibiting bad cases for current DBMSs), and experimenting with a new approach to multiple joins due to Pagh and Pagh.

Buffered indexes

This project investigates the possibilities for augmenting existing types of database indexes with buffers to speed up updates. Depending on the complexity of the index type chosen, emphasis could be on theory or implementation and experiments. Two concrete possibilities are to work with buffered B-trees or a new hashing technique by Jensen and Pagh.

Filtering - with applications in databases and distributed systems

Suppose you want to be able to look up whether one of your friends has a particular file on her computer, but you don't want to store the list of all her files. What do you do? The answer is that you can store a filter, i.e., some information (much smaller than the file list) that allows you to find the right answer with probability, say, 99%.

This project investigates and compares methods for constructing filters, including a new method proposed by Pagh, Pagh, and Rao, which has the potential of outperforming existing methods. Filters have several applications in distributed systems and databases, and a thesis project could also look at one or more of those.

Reference:
An Optimal Bloom Filter Replacement, by Pagh, Pagh, and Rao.

Adaptive sorting - with applications in databases

Often there will be some structure in the order of relations that may be exploited when sorting according to one or more attributes. This project considers such adaptive external sorting algorithms that perform better on inputs that are "partly sorted", and their application to sort join.

Reference:
On Adaptive Integer Sorting, by Pagh, Pagh, and Thorup.

Using multiple disks efficiently

This project considers recent randomized algorithms for using multiple disks very efficiently - combining the throughput and space efficiency of striping with the flexibility of mirroring.

References:
Fast Concurrent Access to Parallel Disks, by Sanders, Egner, and Korst.
Reconciling Simplicity and Realism in Parallel Disk Models, by Sanders.