I’ve been teaching a course in advanced data structures (Stanford’s CS166) for many years now and in the course of doing so have learned some fascinating techniques. One of my favorites is the data structure isometry, where we use the shape of one data structure to guide the design of another. The most prominent example of this is how the red/black tree can be viewed as a representation of a 2-3-4 tree by pulling all the red nodes up into their parents. In this quarter’s offering of CS166, I explicitly taught how to derive red/black tree rotations by thinking about what the equivalent 2-3-4 tree would look like.
In one of the early offerings of CS166 (I think it may have been the inaugural 2014 edition?) one of the project teams presented on a method for making deterministic skiplists. The idea is similar to deriving the red/black tree:
- Find a mechanism for encoding a general multiway tree as a skiplist.
- Take the rules for a 2-3-4 tree and, one at a time, convert them into structural invariants on the skiplist.
- Based on the encoding scheme, map the insertion and deletion algorithms of 2-3-4 trees onto the new skiplist.
This procedure gives you a deterministic skiplist where each operation has (amortized) costs of $$. The actual details aren’t particularly tricky, and I’ve been using this as a problem set exercise in CS166 for the past few years.
The other day, I came across a paper on a data structure called the zip tree that essentially runs this process the other direction – it begins with the randomized skiplist and uses it to build a randomized binary search tree. The idea is surprisingly similar to the one shown above:
- Find a mechanism for encoding a skiplist as a binary search tree.
- Take the rules for a randomized skiplist and, one at a time, convert them into structural invariants on the binary search tree.
- Based on the encoding scheme, map the insertion and deletion algorithms of skiplists onto the new binary search tree.
The resulting tree, the “zip tree,” is closely related to a treap but using top-down rebalancing primitives rather than bottom-up rotation-based restructuring. This top-down restructuring is, I believe, equivalent to the treap’s use of insertions followed by rotations and “rotate downward” deletions, though probably slightly easier to code up. Unless I’m mistaken, the net effect is that you essentially get a treap, except that priorities are geometrically-distributed rather than uniformly-distributed and there’s a new tiebreaking rule to handle the (common) case when ties arise.
I wrote up a post on Stack Overflow explaining what the resulting zip tree looks like, as well as how to derive it. It was a fun exercise to work through, and I’m now wondering whether I can introduce this into CS166! Running this equivalence both ways – first to derive a deterministic skiplist and second to derive the zip tree – might make for a fun exploration of how ideas cross-pollinate one another.