DeepSpeed: System Optimizations Enable Training Deep Learning Models with Over 100 Billion Parameters, Jeff Rasley, Samyam Rajbhandari, Olatunji Ruwase, and Yuxiong He, 2020 (Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, KDD '20) - Introduces the DeepSpeed library and its ZeRO memory optimization techniques for training large models.
Fully Sharded Data Parallel (FSDP), PyTorch Documentation, 2022 - Official documentation for PyTorch's native FSDP implementation, explaining its use for large-scale distributed training.
NVIDIA H100 GPU Architecture In-Depth, NVIDIA, 2022 (NVIDIA Technical Whitepaper) - Technical whitepaper providing architectural details of NVIDIA's Hopper H100 GPU, including Tensor Cores and NVLink interconnects.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A comprehensive textbook providing foundational knowledge of deep learning principles and algorithms.