Better performance with the tf.data API, TensorFlow Authors, 2024 - This guide outlines techniques for building efficient input pipelines, including prefetching, caching, and parallel transformations, which are key for keeping the GPU utilized.
Mixed precision training, TensorFlow Authors, 2024 - This official guide details how to implement mixed precision training in TensorFlow, enabling the use of float16 to reduce memory usage and accelerate computations on compatible GPUs, thereby allowing larger batch sizes and increasing throughput.
NVIDIA System Management Interface (nvidia-smi), NVIDIA Corporation, 2024 - The official documentation for the command-line utility used for real-time monitoring of GPU utilization, memory usage, and other important statistics, which is essential for diagnosing performance issues.
NVIDIA Deep Learning Performance Guide, NVIDIA Corporation, 2023 - This guide provides best practices and optimization techniques for deep learning training and inference on NVIDIA GPUs, covering aspects like data transfer, kernel launch, and hardware-specific features to maximize computational throughput.