[Image: Illustration of a brain inside of a light bulb. Getty Images]
Researchers claim to have developed a new way to run AI language models more efficiently by eliminating matrix multiplication from the process, fundamentally redesigning the neural network operations that GPU chips currently accelerate. The findings, detailed in a recent preprint paper from researchers at the University of California, Santa Cruz; UC Davis; LuxiTech; and Soochow University, could have significant implications for the environmental impact and operational costs of AI systems.
Matrix multiplication (often abbreviated to “MatMul”) is at the center of most neural network computational tasks today…
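To see why MatMul is so central, here is a minimal illustrative sketch (not from the paper, and using made-up layer sizes): the forward pass of a single dense neural-network layer is essentially one matrix multiplication plus a bias.

```python
import numpy as np

# A toy dense layer: the `x @ W` matrix multiplication below is the kind of
# operation GPUs accelerate, and the kind the researchers aim to eliminate.
rng = np.random.default_rng(0)
batch, d_in, d_out = 32, 512, 256          # illustrative sizes only
x = rng.standard_normal((batch, d_in))     # input activations
W = rng.standard_normal((d_in, d_out))     # learned weight matrix
b = np.zeros(d_out)                        # learned bias vector

y = x @ W + b                              # one MatMul per layer, per pass
print(y.shape)                             # (32, 256)
```

A large language model stacks thousands of such multiplications per token, which is why removing MatMul entirely could change the hardware picture.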
→ Continue reading at Ars Technica