Understanding BFloat16: A Data Representation for High-Efficiency Machine Learning

In the early stages of prototyping a machine learning model, the underlying data type used for the weights often doesn't command much attention. However, looking closer at recently trained models, I noticed a common thread: many use the BFloat16 data type for their weights. This observation sparked my curiosity about BFloat16 and the reasons behind its adoption.

When training a substantially large model, several critical considerations come into play: the cost and speed of training and inference, as well as the memory footprint of the model. These factors strongly favor BFloat16 as the underlying data type, a rationale that becomes evident once you understand its design.

BFloat16 (Brain Floating Point Format 16) represents a floating-point number with 16 bits. However, it deviates from the IEEE-754 half-precision (Float16) layout by dedicating fewer bits to the mantissa (the fractional component) and more to the exponent: 1 sign bit, 8 exponent bits, and 7 mantissa bits, versus Float16's 1, 5, and 10. This allocation expands the range of representable values for a 16-bit float to roughly that of a 32-bit float. Since 16-bit operations are considerably faster than their 32-bit counterparts, you can expedite your model's training at a lower cost without sacrificing the range offered by a 32-bit floating-point data type. [0]

(Figure source: Cerebras [1])
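To see the range difference concretely, PyTorch's torch.finfo reports the numeric limits of each dtype. A minimal sketch, assuming a recent PyTorch install:

```python
import torch

# Numeric limits of each format. BFloat16 keeps Float32's 8 exponent bits,
# so its maximum representable value is close to Float32's (~3.4e38),
# while Float16's 5 exponent bits cap it at 65504.
for dtype in (torch.float32, torch.float16, torch.bfloat16):
    info = torch.finfo(dtype)
    print(f"{str(dtype):>15}  bits={info.bits:2d}  max={info.max:.3e}  smallest normal={info.tiny:.3e}")
```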

However, there's a trade-off. Shrinking the mantissa relative to Float32 or Float16 leaves BFloat16 with fewer significant bits, which in principle makes each layer's outputs less precise. Yet research indicates that this loss of precision is generally inconsequential: the speed and memory advantages of BFloat16, particularly in Large Language Models (LLMs), far outweigh the extra precision offered by the alternative formats. [2]
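The flip side shows up in the machine epsilon of each format. A small sketch (again assuming PyTorch) of how BFloat16's 7-bit mantissa coarsens the grid of representable values:

```python
import torch

# Machine epsilon: the gap between 1.0 and the next representable value.
# BFloat16's 7-bit mantissa makes this gap much wider than Float16's or Float32's.
for dtype in (torch.float32, torch.float16, torch.bfloat16):
    print(dtype, torch.finfo(dtype).eps)
# float32 ~1.19e-07, float16 ~9.77e-04, bfloat16 ~7.81e-03

# Updates smaller than that gap are simply rounded away:
x = torch.tensor(1.0, dtype=torch.bfloat16)
print(x + 0.001)   # still tensor(1., dtype=torch.bfloat16)
```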

This conclusion is not just theoretical; it is borne out in practice. Several of the most capable open-source LLMs, such as Meta's Llama 2, MosaicML's MPT, and TII's Falcon, all rely on BFloat16 as their foundational data type. Their success underscores the efficacy of BFloat16 in modern machine learning and marks it as a preferred choice for new models.

Integrating BFloat16 into models requires compatible hardware. Although Google developed BFloat16 for its TPUs, the data type's adoption extends well beyond those chips: a range of hardware, including certain CPUs, NPUs, and GPUs, now supports it. Notably, Nvidia added BFloat16 support to its A100 Tensor Core GPUs. A key feature enhancing the versatility of this hardware is its support for mixed-precision operations, which lets BFloat16 computation interoperate cleanly with Float32 where needed. If you want to use BFloat16 in practice, PyTorch makes it straightforward. [3]
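As a minimal sketch (the tiny nn.Linear below is just a placeholder for a real model, and the behavior assumes a recent PyTorch build), you can either cast weights outright to BFloat16 [3] or keep Float32 weights and rely on autocast for mixed precision:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a real model; any nn.Module works the same way.
model = nn.Linear(128, 64)

# Option 1: store and compute the weights directly in BFloat16.
model = model.to(torch.bfloat16)          # equivalently: model.bfloat16()
x = torch.randn(8, 128, dtype=torch.bfloat16)
print(model(x).dtype)                     # torch.bfloat16

# Option 2: keep Float32 weights and let autocast run eligible ops in BFloat16
# (mixed precision). Use device_type="cuda" on a BFloat16-capable GPU such as an A100.
fp32_model = nn.Linear(128, 64)
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = fp32_model(torch.randn(8, 128))
print(out.dtype)                          # torch.bfloat16 inside autocast
```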

Further Reading

[0] https://cloud.google.com/tpu/docs/bfloat16

[1] https://www.cerebras.net/machine-learning/to-bfloat-or-not-to-bfloat-that-is-the-question/

[2] https://lightning.ai/pages/community/tutorial/accelerating-large-language-models-with-mixed-precision-techniques/

[3] https://pytorch.org/docs/stable/generated/torch.Tensor.bfloat16.html