SentencePiece GitHub Repository and Documentation, Google, 2024 - Provides official source code, installation instructions, command-line tool usage, Python API examples, and detailed explanations of training parameters and model types.
Tokenizers - Hugging Face Documentation, Hugging Face, 2024 - Explains the principles of various subword tokenization algorithms, including SentencePiece, in the context of modern NLP models, offering practical implementation insights.
Natural Language Processing with Transformers, Lewis Tunstall, Leandro von Werra, Thomas Wolf, 2022 (O'Reilly Media) - Offers an overview of tokenization techniques, including SentencePiece, within the framework of building and applying Transformer models, covering both theoretical and practical aspects.