Tired of out-of-memory errors derailing your data analysis? There's a better way to handle huge arrays in Python.
A learning implementation of GPT-2 training from scratch (inspired by NanoGPT) on WikiText-103. This project demonstrates the ability to implement transformer architectures and set up modern ML ...