Tachyum Masters FP8 to Reach FP32 Precision in Updated Version of AI White Paper

Tachyum released the second edition of its “Tachyum Prodigy on the Leading Edge of AI Industry Trends” white paper, featuring updates such as improved 8-bit floating point (FP8) quantization-aware techniques with adaptive scaling that achieve 32-bit floating point (FP32) accuracy.
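
Neither the release nor the white paper excerpt includes code, but a short sketch can make the technique concrete. The example below simulates quantization to an E4M3-style FP8 format (4 exponent bits, 3 mantissa bits, maximum finite value 448) with adaptive per-tensor scaling; the format choice, function names and rounding scheme are illustrative assumptions, not Tachyum's implementation.

```python
import numpy as np

# Illustrative sketch (assumed E4M3-style FP8, not Tachyum's code):
# adaptively rescale a tensor so its largest magnitude lands at the
# FP8 maximum, then round to the nearest representable value.
FP8_E4M3_MAX = 448.0      # largest finite E4M3 magnitude
FP8_MIN_NORMAL = 2.0**-6  # smallest normal E4M3 magnitude

def quantize_fp8_e4m3(x):
    """Return (simulated FP8 values, scale) for array x."""
    amax = np.max(np.abs(x))
    scale = FP8_E4M3_MAX / amax if amax > 0 else 1.0
    scaled = np.clip(x * scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    # With a 3-bit mantissa, adjacent values at exponent e are 2**(e-3)
    # apart; below the normal range the spacing is fixed at 2**-9.
    exp = np.floor(np.log2(np.maximum(np.abs(scaled), FP8_MIN_NORMAL)))
    step = 2.0 ** (exp - 3)
    return np.round(scaled / step) * step, scale

def dequantize(q, scale):
    return q / scale
```

Recomputing the scale from each tensor's observed maximum is one common way to adapt FP8's narrow range to the tensor at hand; per-channel or learned scales are frequent refinements.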

In the white paper, Tachyum demonstrates the optimality of the FP8 format for quantizing deep neural networks, covering weights, activations and gradients, exploiting the fact that floating point numbers provide better coverage of a wide dynamic range than the 8-bit integer data type. The results show that FP8-quantized networks can maintain accuracy on par with, or even exceeding, that of baseline FP32 models.
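
To make the coverage argument concrete, the following sketch (reusing the quantize_fp8_e4m3 helper above) compares uniform INT8 quantization against the simulated FP8 on synthetic values spanning several orders of magnitude, as gradients often do; the distribution and setup are assumptions for illustration, not the models or datasets analyzed in the white paper.

```python
import numpy as np

# Assumed setup: signed, heavy-tailed values spanning several decades.
rng = np.random.default_rng(0)
signs = rng.choice([-1.0, 1.0], size=100_000)
g = rng.lognormal(mean=0.0, sigma=1.5, size=100_000) * signs

# INT8 uses one uniform step across the whole range, so values far
# below the maximum collapse to zero.
amax = np.abs(g).max()
int8 = np.round(g / amax * 127) * (amax / 127)

# FP8 keeps near-constant relative precision across magnitudes.
q, scale = quantize_fp8_e4m3(g)  # helper from the sketch above
fp8 = dequantize(q, scale)

for name, approx in (("INT8", int8), ("FP8", fp8)):
    rel = np.abs(approx - g) / np.abs(g)
    print(f"{name}: median relative error = {np.median(rel):.3f}")
```

On this distribution INT8 rounds most values to zero while FP8 stays within a few percent, which is the sense in which floating point "covers" the data better.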

FP8 is essential for doing more with less. It achieves much higher performance at much lower power consumption and chip area than legacy formats such as BFLOAT16. FP8 reduces not only the cost of computation but also the memory requirements of large and rapidly growing AI models. The white paper presents analyses of quantization error for different models and datasets, as well as how Tachyum amplifies the benefits of FP8 with twice the performance and power efficiency and a halving of memory bandwidth.
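
The savings over BFLOAT16 follow from simple storage arithmetic, since FP8 occupies one byte per value against BFLOAT16's two. A back-of-the-envelope sketch, with a hypothetical parameter count not taken from the white paper:

```python
# Bytes needed to hold the weights of a hypothetical 70B-parameter
# model in each format (illustrative numbers only).
params = 70e9
for fmt, bytes_per_value in (("FP32", 4), ("BF16", 2), ("FP8", 1)):
    print(f"{fmt}: {params * bytes_per_value / 2**30:,.0f} GiB")
```

Halving the bytes per value halves both the memory footprint and the traffic needed to stream weights, which is where the bandwidth reduction comes from; smaller FP8 arithmetic units account for much of the performance and power advantage.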

Visual models, large language models and generative AI are increasingly embedded in software applications, making AI an essential part of data processing and requiring tighter, lower-latency integration into software. With FP8 capable of handling mainstream AI functions, leaders like Tachyum are poised to help accelerate the rapid evolution of AI hardware. This will lead to the unification of specialized HPC and AI hardware into a single processing engine, rather than the integration of disparate chips into one package, which is a costlier and less satisfactory solution.

“Our experimental results show that FP8 enables faster training and reduced power consumption without any degradation in accuracy for a range of deep learning models,” said Dr. Radoslav Danilak, founder and CEO of Tachyum. “This is one of the most significant AI milestones Tachyum wanted to achieve before tape-out to ensure we have what it takes to make FP8 with sparsity and super-sparsity mainstream AI technology.”

SOURCE: Businesswire