An Introduction to Statistical Learning: With Applications in R, Gareth James, Daniela Witten, Trevor Hastie, Rob Tibshirani, 2021 (Springer) - Provides a clear explanation of K-Means, its algorithm, and limitations, including initialization issues and assumptions about cluster shapes.
Pattern Recognition and Machine Learning, Christopher Bishop, 2006 (Springer) DOI: 10.1007/b139369 - Offers a detailed treatment of K-Means, including its objective function and discussions on its sensitivity to initialization and geometric assumptions.
k-means++: The Advantages of Careful Seeding, David Arthur, Sergei Vassilvitskii, 2007, Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms (Society for Industrial and Applied Mathematics) DOI: 10.1145/1283383.1283494 - Introduces a widely adopted initialization algorithm, k-means++, which significantly improves the quality and consistency of K-Means results by addressing its sensitivity to initial centroid placement.