Add SQL Server Engine Support for Interval Queries (Intersection/Overlap and other Allen’s interval algebra relations)
3/5/2013 7:07:54 PM
An interval is a set of values between a certain lower value and a certain upper value, where upper >= lower. In reality there are different kinds of intervals that you may need to represent in your database like temporal ones (sessions, appointments, periods of validity), and others.
When in need to represent temporal intervals in SQL Server most use two attributes holding the lower and upper points in time. Then there are operations that you need to apply between intervals. One of the most common operations is to check which intervals overlap with an input interval. For example: “return all contracts that were active during an input period represented by the inputs @lower and @upper.”
Some of the classic querying methods that are used to provide answers to such requests suffer from fundamental optimization problems. The result is that typical interval queries tend to be very slow and inefficient.
An efficient solution that can be implemented today in SQL Server does exist. It is based on academic work that resulted in a model called Relational Interval tree (RI-tree) by Kriegel, Pötke and Seidl, plus further optimizations called static RI-tree by Laurent Martin. However, I believe that the model and its optimizations are a bit complex for people to comprehend.
The resources below describe the existing optimization problems, the solution that can be applied today and its complexities, and the potential for improvements in SQL Server.
Essentially the suggested improvement in SQL Server is to add internal engine support for the RI-tree model. Such support would include a new type of index (call it interval index) and related optimizer improvements. The user would only be required to use basic syntax for creating the interval index. Behind the scenes SQL Server would create indexing based on the RI-tree model. Also, the optimizer would detect classic interval query patterns based on the predicates used in the query filters. And when detected, the optimizer would use the internal indexes to process the queries efficiently. One of the main benefits in this approach is that existing queries don’t need to be changed. They would just run faster.
Note that this proposal is independent of adding declarative SQL support for interval related queries, which would be a good thing as well. The engine part could still be based on the RI-tree model.
• Deck explaining problem and suggested solution: http://tsql.solidq.com/books/source_code/Intervals.pdf
• Source code accompanying deck: http://tsql.solidq.com/books/source_code/Intervals.txt
• Paper describing Relational Interval tree (RI-tree) model – [RI] “Managing Intervals Efficiently in Object-Relational Databases” (Kriegel, Pötke and Seidl of University of Munich): http://www.dbs.ifi.lmu.de/Publikationen/Papers/VLDB2000.pdf
• Articles describing optimizations to original model (static RI-tree) – [SRI1] “A Static Relational Interval Tree” (Laurent Martin): http://www.solidq.com/sqj/Pages/2011-September-Issue/A-Static-Relational-Interval-Tree.aspx and [SRI2] “Advanced interval queries with the Static Relational Interval Tree” (Laurent Martin): https://www.solidq.com/sqj/Pages/Relational/Advanced-interval-queries-with-the-Static-Relational-Interval-Tree.aspx