Decoupled Weight Decay Regularization, Ilya Loshchilov and Frank Hutter, 2017. International Conference on Learning Representations (ICLR). DOI: 10.48550/arXiv.1711.05101 - Introduces AdamW, a variant of Adam that applies weight decay separately from the gradient update for better generalization.