Programming Massively Parallel Processors: A Hands-on Approach, David B. Kirk, Wen-mei W. Hwu, 2016 (Morgan Kaufmann) - Provides a foundational understanding of GPU architecture and parallel programming principles, essential for comprehending how GPUs accelerate deep learning workloads.
NVIDIA TensorRT Documentation, NVIDIA Corporation, 2024 (NVIDIA Corporation) - Official developer guide for NVIDIA TensorRT, an SDK for optimizing and deploying high-performance deep learning inference on NVIDIA GPUs.
CUDA Semantics, PyTorch Authors, 2024 - Official documentation explaining how to use NVIDIA GPUs effectively within PyTorch to accelerate deep learning computations.