CS166 Archive

This page contains archived versions of the Stanford CS166 (Data Structures) webpage in the quarters I've taught it. All internal links should be valid, though external links may no longer be functional.

2017-2018 Academic Year
Spring 2018

This iteration of CS166 was a blast to teach. After many, many hours of reading papers, reverse-engineering intuitions, and creating monster slide decks, I was able to replace coverage of the DC3 suffix array construction algorithm with the newer, faster, but more complicated SA-IS algorithm, which has great theoretical and practical performance. I also introduced fusion trees to the course, replacing an older lecture on disjoint-set forests, and I'm quite proud of how those lectures turned out. I learned a ton in the course of putting those lectures together and hope that those slides hold up well!

I also made a number of major touch-ups to some of the existing lectures, including a mostly redesigned lecture on x-fast and y-fast tries, better visualizations for the Fischer-Heun RMQ structure, and a mathematically streamlined presentation of the analysis of linear probing.

I made a number of revisions to the problem sets and coding assignments from last quarter. I removed the questions from Spring 2016 that asked students to derive KMP as a special case of Aho-Corasick and Fenwick trees as an optimized version of augmented trees, since those questions had the unfortunate side-effect of being answerable with a quick Google search without much conceptual understanding. In their place, I added in a new coding assignment involving SA-IS and suffix array searching and some questions about different types of balanced trees. I'm thinking that in the next offering of CS166 I'll keep the SA-IS assignment (and, in fact, probably soup it up!), but will likely cut the balanced trees question in favor of more practice with tree augmentation (for example, giving students a pluggable augmented tree and asking them to implement various data structures or algorithms with it). On PS4, I added in a new question about a linear-time algorithm for building weight-balanced trees and slightly adjusted the coding question to address it, which I may dial back in future quarters to focus on other topics. One change I'm quite happy with is the introduction of a new question about cardinality estimation on PS5, which with a little bit of tuning I think could become a mainstay!

I have a ton of notes to myself of what to work on for future quarters. I think there's space for a survey lecture on approximate membership query data structures (Bloom filters, quotient filters, and cuckoo filters, possibly with Bloomier filters) given how popular these topics were in the final projects and how nicely they dovetail with the existing randomized structures we've covered. I'd like to introduce coding questions that have a performance optimization component so that students can get a feel for what techniques are out there for speeding data structures up in practice - as well as the gap between theory and practice. My lecture on Fibonacci heaps could use a few fixups with regards to the intuition behind the extract-min procedure, and I think I should move away from the traditional presentation of splay trees (which uses a mixture of the amortized and startup costs) toward a more streamlined model that folds those costs into the construction cost or exposes them separately. The problem sets could probably use a little bit of rebalancing, as well, and I'd like to offer a little more guidance on experimental design for the final projects.

As usual, it was a ton of fun seeing all the final projects from this quarter. One project in particular looks like it may be publishable (a clever analysis of k-d trees), and I was blown away by another project on rotation distances in binary search trees through an isometry between polygon triangulations and BSTs. I also learned several new structures, like majority-inverter graphs, cell lists, hollow heaps, learned index structures, Bloomier filters, area minimum queries, Hopfield networks, relaxed radix-based trees, persistent B-trees, and distributed hash tables. Some of these seem like prime candidates for integration into future offerings of CS166, and I'm looking forward to playing around with them!

A huge thanks to this quarter's CS166 students and staff for making this class so much fun to teach!

2015-2016 Academic Year
Spring 2016

This second iteration of CS166 was a lot of fun! I'd learned a lot about data structures since the last time I taught this course, so there were a few new topics (namely, the analysis of 5-independence in linear probing) and the problem sets were more interesting and more polished than in the first iteration. The final projects this time around continued to shine. Many are publishable, and some of the explanations we saw were so good that I'm encouraging students to post their findings online somewhere.

I moved a few of the topics around this time, putting string data structures closer to the beginning where they tie in nicely with RMQ structures and deferring balanced trees until later. I rebuilt my lecture on Aho-Corasick from scratch and it turned out a lot better than before and combined the lectures on dynamic graphs and Euler tour trees into a single survey lecture on dynamic graph algorithms. I also cleaned up the presentation of disjoint-set forests, the split and join operations on trees, and splay trees (and I'm particularly proud of that last one). Next time around, I'll see if I can work in some other topics from the final presentations.

2013-2014 Academic Year
Spring 2014

This was my first time teaching CS166 and it was a wonderful experience. The students, the staff, and I all learned a lot and the final project presentations were just wonderful. I'm looking forward to teaching this class again in the future!