Ilya's Papers to Carmack: Table of Contents
In 2019, Ilya Sutskever (co-founder and former Chief Scientist of OpenAI, and one of the leading lights of the ‘Google Brain’ era of AI) sent John Carmack (computer scientist extraordinaire and co-creator of Doom) a list of ~40 papers to learn about AI.
That list is supposedly lost to the sands of time (Facebook’s email servers apparently wiped it, oops). But someone who had access to the list saved a partial version of it!
I have a bit more free time on my hands these days, and I’m already reading a bunch of ML papers, so what’s 30 more to add to the list?
This document will serve as a central hub for my notes on each paper. Feel free to follow along — the full paper set is here. The Table of Contents will update as the relevant posts go live!
(Note that paper reviews may be published out of order)
Paper 1: Complextropy
Paper 2: RNNs
Paper 3: LSTMs
Paper 4: RNN Regularization
Paper 5: Transformers
Paper 6: MDL and NN Regularization
Papers 7-8: Pointer Networks, Conv Nets
Paper 9: Set seq2seq
Paper 10: Model Parallelism
Paper 11: ResNets
Paper 12: Dilated Convolutions
Paper 13: MPNNs
[Paper 14 is the same as Paper 5]
Paper 15: Attention
Paper 16: Identity Mapping
Paper 17: Relational Networks
Paper 18: VLAEs
Paper 19: Relational Recurrent Networks
Paper 20: Complextropy Redux
Paper 21: Neural Turing Machines
Paper 22: DeepSpeech
Paper 23: Scaling Laws
[The rest of the ‘papers’ are textbooks or course curricula. I did not review them.]
Conclusion and Final Notes
I’ve been working on this series on and off for over a year, and it’s weird to say that it’s finally done. Of course, I’ll keep reviewing papers in general. I have a long backlog of more modern papers that I would love to write about. But I’ll never write another review in the Ilya’s Papers series, and that alone is strange. I started this series when I had 38 subscribers, and now there are ~35x that in this little corner of the web.
It’s funny how, even though this series is ending, the research isn’t anywhere close to finished. It’s all one huge conversation, with each new set of authors picking up where the previous ones left off. In other words: more to say, more to be written. Thanks for joining me on this ride, and stay tuned for more paper reviews. I have notes for papers on diffusion modeling, speculative decoding, mechanistic interpretability, test-time training, and more. Now that I’m done dragging my feet here, we can get into some of the things that may be more immediately applicable for a modern ML researcher.