Triton Language Documentation, OpenAI, 2024 - Official documentation for Triton, a Python-based DSL for writing high-performance GPU kernels.
CUDA C++ Programming Guide, NVIDIA Corporation, 2024 - Comprehensive guide to programming and optimizing for NVIDIA GPUs, covering memory hierarchy and parallelism.
Accelerated Linear Algebra (XLA), Google, 2024 - Official documentation describing XLA's role in optimizing TensorFlow computations through operator fusion and compilation.