|
|
|
|
|||
Introduction to Algorithms and Data Structures, Fall 2005, Episode 2 | |||
|
|
|
||
|
|
Lecture, September 7Subject : Growth rates of functions. Theta-, Big-Oh-, and Omega-notation. Divide & Conquer technique and MergeSort, Heaps, Max-Heaps, and Heap-Sort, Priority Queues.Text : Chapter 3. Chapter 2.3 (skip 2.3.2 you will get a note on this later), page 123-126, and Chapter 6. Prerequisites : Basic time analysis. Basic concepts - algorithm, data structure, pseudocode etc. from episode 1. Comments : Focus in chapter 3 should be on the Big-Oh-notation. For Merge-Sort, it is important that you understand how this algorithm can be defined recursively. Think about why the parent, left and right functions of heaps implemented with arrays on page 128 is correct. On page 132 skip the text about "master theorem". Instead express the complexity using the height h. Section 6.3 is not essential for understanding the following chapters. This section includes some math that will probably make more sense after the lecture. If you do not have a lot of math experience you should save 6.3 for later. Time : Wednesday, September 7, 9:00–12:00 Room : 4A14 Tutorial : 13:00–16:00, 2A14 Assignments
The mandatory assigments (M) should be placed in your teaching assistent box or be sent by e-mail to Maria Mochnacs, makmo@itu.dk. Programming assignmentsAssignment 2.1This assignment deals with the performance of Heap-Sort. We should implement the algorithm first, then evaluate its performance experimentally. Start with implementing Heapify and Build-Heap and test these procedures on data before
implementing Heap-Sort. You can use this code as a starting point if you like.
Warning: In Java the elements in a table Use your program to measure the performance of sorting 1000, 2000, 5000, and 10000 integers. Compare the results with the results from the Insertion-Sort program from last weeks' assignment.
Assignment 2.2One of the major features of Merge-Sort is that it manipulates objects close to each other in sequences, without moving them back and forth. This feature makes Merge-Sort suitable for sorting data on external memory (for example hard disks), where access to closely located elements in file order is significantly faster. In this assignment we use Merge-Sort to sort data, which would not easily fit into memory of most of PCs. File 1GB.txt.zip contains 10 000 000
random character strings, each 100 characters long (Warning the
file has 611 MB). The file format is as follows. The first line
contains a single number, stating how many objects follow (hence it
says "10000000" in our file). Then each line contains a random
sequence of 100 characters. Note that in Java you can use
Implement a function It is convenient to test the implementation on smaller sets, so you should parameterize the program with the chunk size. You can experiment with various chunk sizes, to increase efficiency. Also you can use the following smaller files: 1K.txt.zip (10 entries), 10K.txt.zip (100 entries) and 100K.txt.zip (1000 entries). Rationale: This assignment is "realistic"—it has been based on one of the previous teachers' experience from a software development project in the electronic data interchange industry. In this project the chunks of data were messages coming in large numbers from the network of banking systems. Note: In case you lack experience, Peter Sestoft's note Text Files in Java introduces handling of text files in Java. Alternatively see section 21.7 in Peter's book Java Precisely. Note: You do not really need gigabytes of disk space to solve this exercise. It is sufficient (and more feasible) to work on smaller files. Although bear in mind that the implementation should be designed in the way that it should be able to handle huge files, given you had enough disk resources. At the same time it should operate using a relatively modest amount of volatile memory. Note: The example files have been generated by the generate
program (statically linked Linux binary, source in Standard ML). Type
|