NVIDIA Collective Communications Library (NCCL) User Guide, NVIDIA Corporation, 2024 (NVIDIA Corporation) - Official documentation detailing the design and usage of NCCL, which provides highly optimized collective communication primitives, including Ring All-Reduce, for NVIDIA GPUs.
Dive into Deep Learning, Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola, and all D2L contributors, 2023 (Cambridge University Press) - An open-source textbook providing accessible explanations of distributed deep learning concepts, including All-Reduce, with practical examples.