Attempt at solution to the SSDBS 1999 interval exercise (see mail from Carsten 13 December 1999 at http://www.itu.dk/people/sestoft/ssdbs1999.txt) Problem: Given a table IT of intervals denoting use of some resource, find all unoccupied intervals within a given range [b,e]. First truncate all the given intervals (table IT) using this query: DELETE FROM IT1; INSERT INTO IT1 SELECT GREATEST(b, x.b) AS b, LEAST(e, x.e) AS e FROM IT x WHERE b <= x.e AND x.b <= e; Example with b=1500, e=7000: 1. Initialization from raw table IT: DELETE FROM IT1; INSERT INTO IT1 SELECT GREATEST(1500, x.b) AS b, LEAST(7000, x.e) AS e FROM IT x WHERE 1500 <= x.e AND x.b <= 7000; 2. Repeat the operations below until stabilization: DELETE FROM IT2; INSERT INTO IT2 SELECT x.b AS b, MAX(x.e) AS e FROM IT1 x GROUP BY x.b; DELETE FROM IT3; INSERT INTO IT3 SELECT MIN(x.b) AS b, x.e AS e FROM IT2 x GROUP BY x.e; DELETE FROM IT4; INSERT INTO IT4 SELECT x.b AS b, MAX(y.e) FROM IT3 x, IT3 y WHERE x.b <= y.b AND y.b <= x.e GROUP BY x.b; DELETE FROM IT1; INSERT INTO IT1 SELECT MIN(x.b) AS b, y.e FROM IT4 x, IT4 y WHERE y.b <= x.e AND x.e <= y.e GROUP BY y.e; 3. Finally, extract free periods as interval complement: SELECT x.e AS b, MIN(y.b) AS e FROM IT3 x, IT3 y WHERE x.e < y.b GROUP BY x.e; SELECT b AS b, MIN(x.b) AS e FROM IT3 x WHERE b < (SELECT MIN(b) FROM IT3); SELECT MAX(x.e) AS b, e AS e FROM IT3 x WHERE (SELECT MAX(e) FROM IT3) < e; The basic idea is to (1) restrict the collection of intervals to their intersection with the query range [b,e], then (2) replace each maximal set of overlapping intervals by its union, which is itself an interval, and (3) compute the complement of the set of unions -- here the intervals near the end of the query require separate queries, which could probably be improved. One could consider this a problem on undirected graphs, in which a node is an interval and there is an edge between two intervals if they overlap. The purpose of step (2) is to compute the symmetric transitive closure of the edge relation. If there are n nodes and the symmetric edge relation is represented by an incidence matrix A, then the transitive closure is the matrix power A^n, which can be computed by log(n) matrix multiplications, each taking time at most O(n^3). Likewise at most log(n) iterations of the SQL queries in step (2) are needed; each of these takes time O(n^2) for the self-join, it would seem -- a bit mysterious why this is faster than the matrix power method? Peter 2013-12-21