The Random Transformer

Understand how transformers work by demystifying all the math behind them

https://osanseviero.github.io/hackerllama/blog/posts/random_transformer/

The Annotated Transformer

http://nlp.seas.harvard.edu/annotated-transformer/