Efficient Estimation of Word Representations in Vector Space, Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean, 2013 (International Conference on Learning Representations, ICLR 2013), DOI: 10.48550/arXiv.1301.3781 - Introduces the Word2Vec model, a significant advancement in learning dense, low-dimensional word embeddings that capture semantic and syntactic relationships.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A comprehensive textbook covering the theoretical foundations and practical applications of deep learning, including the role of embeddings in processing sequential data.
Layers: Flux.Embedding, The Flux.jl Contributors, 2024 - Official documentation detailing the usage, initialization, and behavior of the Flux.Embedding layer within the Flux.jl deep learning framework.
GloVe: Global Vectors for Word Representation, Jeffrey Pennington, Richard Socher, and Christopher Manning, 2014 (Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP, Association for Computational Linguistics), DOI: 10.3115/v1/D14-1162 - Presents GloVe, an unsupervised model for learning global word vectors that combines the advantages of global matrix factorization and local context window methods.
CS224n: Natural Language Processing with Deep Learning, Christopher Manning, Abigail See, John Hewitt, and others, 2023 (Stanford University) - Comprehensive course materials, including lectures and assignments, that cover word embeddings, recurrent neural networks, and their applications in NLP in depth.