TensorFlow Custom Operations Guide, TensorFlow Team, 2024 - This official guide provides comprehensive instructions and examples for creating and integrating custom operations in TensorFlow, covering kernel implementation, C++ operation registration, and Python API binding.
PyTorch Custom C++ and CUDA Extensions, PyTorch Team, 2024 - Official documentation detailing how to extend PyTorch with custom C++ and CUDA operations, including setup, compilation, and integration with the Python frontend.
MLIR: A Compiler Infrastructure for the End of Moore's Law, Chris Lattner, Mehdi Amini, Uday Bondhugula, Albert Cohen, Andy Davis, Jacques Pienaar, River Riddle, Tatiana Shpeisman, Nicolas Vasilache, Oleksandr Zinenko, 2021, ACM Transactions on Programming Languages and Systems, Vol. 43 (ACM), DOI: 10.1145/3477174 - This foundational paper introduces MLIR, a key intermediate representation used in modern ML compilers, explaining its extensible dialect system which allows for representing and integrating custom operations effectively.