Understanding the difficulty of training deep feedforward neural networks, Xavier Glorot, Yoshua Bengio, 2010, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS), Vol. 9 (JMLR.org), DOI: 10.5555/2078696.2078720 - The foundational paper introducing Xavier initialization, explaining the problem of vanishing/exploding gradients and the variance preservation principle.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A comprehensive textbook providing a broad explanation of deep learning concepts, including a detailed section on weight initialization techniques like Xavier.
torch.nn.init, PyTorch Development Team, 2022 (PyTorch) - Official documentation for PyTorch's initialization module, providing practical implementation details for Xavier (Glorot) initialization in PyTorch.
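The variance-preservation principle behind these references can be illustrated in plain Python. This is a minimal sketch of Xavier (Glorot) uniform initialization, not PyTorch's actual implementation; the function name `xavier_uniform` and the layer sizes are illustrative choices:

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng=random):
    """Sample a fan_in x fan_out weight matrix from U[-a, a] with
    a = sqrt(6 / (fan_in + fan_out)), the Glorot/Xavier bound.

    The bound is chosen so that Var(W) = a^2 / 3 = 2 / (fan_in + fan_out),
    which keeps the variance of forward activations and backward gradients
    roughly constant from layer to layer (Glorot & Bengio, 2010)."""
    a = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-a, a) for _ in range(fan_out)]
            for _ in range(fan_in)]

# Every sampled weight stays inside the Glorot bound.
W = xavier_uniform(256, 128)
bound = math.sqrt(6.0 / (256 + 128))
assert all(abs(w) <= bound for row in W for w in row)
```

In PyTorch the equivalent one-liner is `torch.nn.init.xavier_uniform_(tensor)` from the `torch.nn.init` module documented above.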