Large-scale distributed AI model training, model parallelisation, low-level GPU acceleration, make GPUs go brrrrr