How NVMe SSD thermal throttling works

Disclaimer

This article is a substantially revised and expanded version of a post originally published on Qiita in January 2020.

Introduction

A common concern among users who are starting to use NVMe SSDs is “temperature”, that is, the heat the SSD generates.

Our NVMe SSDs have a feature called Thermal Throttling, which controls heat by reducing performance to prevent excessive temperature rise.

In this article, we will introduce the settings, behavior, and effects of this thermal throttling, as well as situations where it is activated.

Understanding the thermal throttling of NVMe SSDs enables adjustments based on real usage scenarios, proper cooling setup, stable operation in challenging cooling environments, and more.

Summary

  • Thermal throttling is a function that reduces performance to control heat and prevent significant temperature rise in SSDs.
  • The behavior of thermal throttling varies among products, making it important to understand the SSD’s thermal throttling settings and behavior before use.

NVMe SSD Thermal Throttling

Our NVMe SSDs implement thermal throttling as a means to achieve Host Controlled Thermal Management (HCTM), a temperature management function specified in the NVMe 1.3 specification [1].

HCTM lets the host set two temperature thresholds, Thermal Management Temperature 1 (TMT1) and Thermal Management Temperature 2 (TMT2), to control temperature rise and prevent overheating.

However, the values of TMT1 and TMT2, as well as the behavior of thermal throttling (such as the extent of performance reduction), depend on the specific product.
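As a rough illustration of how the two thresholds are passed to the drive: as we read the NVMe Base Specification, the HCTM feature (Feature Identifier 10h) carries TMT1 in the upper 16 bits and TMT2 in the lower 16 bits of Command Dword 11, both in kelvins. The Python sketch below packs and unpacks such a value. The function names are hypothetical, the 80 and 85 degree inputs are example values only, and the range check is a simplification. Check the specification and the product documentation before sending anything to a real device.

```python
KELVIN_OFFSET = 273  # same Celsius/kelvin offset used elsewhere in this article


def encode_hctm_dword11(tmt1_celsius: int, tmt2_celsius: int) -> int:
    """Pack TMT1/TMT2 (given in Celsius) into a 32-bit value, in kelvins.

    Assumed layout: bits 31:16 = TMT1, bits 15:0 = TMT2.
    """
    tmt1_k = tmt1_celsius + KELVIN_OFFSET
    tmt2_k = tmt2_celsius + KELVIN_OFFSET
    # Simplified sanity check: both thresholds set, and TMT1 below TMT2.
    if not (0 < tmt1_k < tmt2_k <= 0xFFFF):
        raise ValueError("expected 0 K < TMT1 < TMT2 <= 0xFFFF K")
    return (tmt1_k << 16) | tmt2_k


def decode_hctm_dword11(dword11: int) -> tuple:
    """Return (TMT1, TMT2) in Celsius from a packed 32-bit value."""
    tmt1_k = (dword11 >> 16) & 0xFFFF
    tmt2_k = dword11 & 0xFFFF
    return tmt1_k - KELVIN_OFFSET, tmt2_k - KELVIN_OFFSET


if __name__ == "__main__":
    value = encode_hctm_dword11(80, 85)
    print(f"0x{value:08X}")            # 0x01610166 (353 K and 358 K)
    print(decode_hctm_dword11(value))  # (80, 85)
```

Whether a given drive accepts these values also depends on the range it advertises, which is product-specific.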

Figure: Example of HCTM behavior (based on NVMe specification)

Among the examples shown in the figure (SSD A and SSD B), SSD A triggers strong throttling as soon as the temperature exceeds TMT1, and it continues until the temperature decreases sufficiently.

In contrast, SSD B engages weak throttling as long as the temperature remains below TMT2, even if it surpasses TMT1. This aims to minimize performance impact while managing temperature.

Thus, the behavior of thermal throttling is specific to each product, emphasizing the need to understand the SSD’s thermal throttling settings and behavior before use.
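The difference between the two example drives can be sketched as a small policy table. The Python below is an illustrative model only: the class name, the level labels, the 80/85 degree thresholds, and the assumption that SSD B switches to strong throttling at TMT2 are ours, not taken from any product or from the specification. It also ignores the "continues until the temperature decreases sufficiently" part of SSD A's behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThrottlePolicy:
    """Hypothetical per-product mapping from temperature range to throttle level."""
    name: str
    between_tmt1_and_tmt2: str
    at_or_above_tmt2: str

    def level(self, temp_c: float, tmt1_c: float, tmt2_c: float) -> str:
        if temp_c >= tmt2_c:
            return self.at_or_above_tmt2
        if temp_c >= tmt1_c:
            return self.between_tmt1_and_tmt2
        return "none"


# SSD A: strong throttling as soon as TMT1 is reached.
SSD_A = ThrottlePolicy("SSD A", between_tmt1_and_tmt2="strong", at_or_above_tmt2="strong")
# SSD B: weak throttling between TMT1 and TMT2 (strong above TMT2 is assumed).
SSD_B = ThrottlePolicy("SSD B", between_tmt1_and_tmt2="weak", at_or_above_tmt2="strong")

if __name__ == "__main__":
    tmt1, tmt2 = 80, 85  # example thresholds in degrees Celsius
    for temp in (78, 82, 86):
        for ssd in (SSD_A, SSD_B):
            print(f"{temp} C  {ssd.name}: {ssd.level(temp, tmt1, tmt2)}")
```

At 82 degrees, between the two thresholds, the model reports "strong" for SSD A and "weak" for SSD B, which is the contrast the figure describes.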

“Temperature” influences thermal throttling

The settings that govern the behavior of Host Controlled Thermal Management (HCTM), namely TMT1 and TMT2, are defined using a temperature parameter retrievable through the Self-Monitoring Analysis and Reporting Technology (S.M.A.R.T.) attributes.

For instance, when extracting S.M.A.R.T. attributes using our software, LiveMonitor, the outcome appears as illustrated below.

Figure: S.M.A.R.T. attributes and temperature of NVMe SSDs acquired by LiveMonitor (in green box)

In the NVMe specification, temperature is reported in kelvins. Converting the value 0x136 (310 in decimal) to Celsius by subtracting 273 therefore gives 37 degrees Celsius, as shown in the figure.
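The same conversion, written as a minimal Python sketch (the function name is hypothetical, and the offset of 273 follows the arithmetic used in this article):

```python
def composite_temp_to_celsius(raw_kelvin: int) -> int:
    """Convert a temperature reported in kelvins to degrees Celsius."""
    return raw_kelvin - 273


if __name__ == "__main__":
    raw = 0x136  # value read from the S.M.A.R.T. attributes (310 decimal)
    print(raw, "K ->", composite_temp_to_celsius(raw), "C")  # 310 K -> 37 C
```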

In other words, an NVMe SSD with HCTM enabled monitors this temperature and applies thermal throttling according to the configured TMT1 and TMT2 values.

Note that this temperature does not necessarily reflect the SSD’s surface temperature, because the NVMe specification does not define where on the device the temperature reported through S.M.A.R.T. attributes is measured.

Consequently, even if surface temperatures are measured with a thermal camera, as in the example below, the “temperature” reported through S.M.A.R.T. attributes may not match those measurements.

Figure: Example of M.2 SSD surface temperature measurement results using a thermal camera (idle)

Temperature and performance variations due to thermal throttling

Let’s look at the results of experiments in which thermal throttling was applied to our NVMe SSDs.

The NVMe SSD used in these experiments is the 1,920 GB model (temperature-extended variant) from our NVMe SSD SN8E Series. Its HCTM thresholds are set to 80 degrees Celsius for TMT1 and 85 degrees Celsius for TMT2.

In this study, we compared the temperature rise with and without a heat sink.

Please be advised that the experimental results mentioned in this article stem from our specific testing environment. Consequently, these outcomes might not be universally replicated across all scenarios.

Evaluation results (without heat sink)

First, the actual measurement results without heat sinks are shown.

Figure: SSD temperature and performance change (without heat sink, 4-second moving average)

The horizontal axis represents the elapsed time since the start of measurement (moving to the right signifies the passage of time). The vertical axis has two scales: the left vertical axis represents Write performance (higher up indicates higher performance), while the right vertical axis denotes SSD temperature.
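The "4-second moving average" in the figure caption is a smoothing step applied to the raw samples. One way to compute such a curve is sketched below in Python. It assumes one sample per second (so a window of 4 samples), which the article does not state, and it uses a trailing window that is shorter at the start of the series. The sample values are made up for illustration.

```python
from collections import deque


def moving_average(samples, window=4):
    """Trailing moving average; early outputs average the samples seen so far."""
    recent = deque(maxlen=window)  # keeps only the last `window` samples
    smoothed = []
    for value in samples:
        recent.append(value)
        smoothed.append(sum(recent) / len(recent))
    return smoothed


if __name__ == "__main__":
    write_mb_s = [400, 400, 300, 300, 400]  # made-up throughput samples
    print(moving_average(write_mb_s))
```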

The graph shows that when the SSD temperature reaches the TMT1 threshold of 80 degrees Celsius, thermal throttling is triggered and write performance drops by up to 100 MB/s. When the SSD temperature falls to around 75 degrees Celsius, thermal throttling stops and performance returns to its original level. This cycle then repeats.

In previous experiments, there were cases where very aggressive thermal throttling cut performance to nearly one tenth. Our products, by contrast, show a more restrained performance decrease even when high temperatures call for substantial throttling, while still achieving the primary goal of thermal throttling, which is to limit temperature rise.
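The on/off cycle seen in the graph behaves like a simple hysteresis loop: throttling starts at one temperature and stops at a lower one. The Python sketch below models that under our own assumptions. The 80 degree start point is the TMT1 setting, while the 75 degree release point is only what was observed in this measurement, not a documented setting. The function name and the temperature trace are hypothetical.

```python
def throttle_states(temps_c, start_c=80, release_c=75):
    """Return, for each temperature sample, whether throttling is active.

    Throttling turns on at or above start_c and stays on until the
    temperature has fallen to release_c or below (hysteresis).
    """
    throttled = False
    states = []
    for temp in temps_c:
        if not throttled and temp >= start_c:
            throttled = True
        elif throttled and temp <= release_c:
            throttled = False
        states.append(throttled)
    return states


if __name__ == "__main__":
    trace = [70, 78, 80, 79, 77, 75, 76, 79, 80]  # made-up temperature samples
    for temp, active in zip(trace, throttle_states(trace)):
        print(temp, "throttled" if active else "normal")
```

Because the release point is below the start point, the drive does not flip between states on every small temperature change around 80 degrees.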

Evaluation results (with heat sink)

Next, we show the actual measurement results with a heat sink. Note that the values of TMT1 and TMT2 are the same.

Figure: SSD temperature and performance change (with heat sink, 4-second moving average)

Unlike the case without a heat sink, applying the same load for the same duration did not bring the SSD temperature up to the TMT1 threshold (80 degrees Celsius). As a result, thermal throttling did not engage.

This result shows how much difference a heat sink makes.

Thermal camera view of heat generation

Lastly, we present the results obtained from using a thermal camera to visually depict the heat generation characteristics of the M.2 NVMe SSD.

The following image was taken under load without a heat sink.

Figure: Temperature near M.2 SSD confirmed by thermal camera (without heat sink)

This image illustrates that the NAND flash memory (encased in a BGA package) exhibits the highest temperature. Given the surface temperature of 79 degrees Celsius, it is reasonable to assume that the interior of the BGA package experiences even higher temperatures.

Next is an image taken under the same load with the heat sink attached.

Figure: Temperature near M.2 SSD confirmed by thermal camera (with heat sink)

As the image shows, even under the same load, the temperature is more than 10 degrees lower than without the heat sink. The heat is also spread evenly across the whole heat sink, which suggests that it is being dissipated efficiently.

This observation again shows how important heat sinks are. They limit the rise in SSD temperature and so reduce the likelihood that thermal throttling is activated, which in turn lets the SSD sustain its high performance.

Heat sinks come in various materials and shapes. Choosing one requires considering factors such as the dimensions of the equipment in which the SSD is installed and the overall cooling mechanism.

Conclusion

In this article, we’ve delved into the thermal throttling aspects of NVMe SSDs, using our products as examples to illustrate their configuration settings and real-world behavior.

As stated at the outset, understanding the settings and behavior of NVMe SSD thermal throttling, and tuning them to the SSD’s usage environment and applications, makes it possible to build an effective cooling setup and achieve more stable operation.

Furthermore, in cases where SSD performance doesn’t meet expectations, it’s equally important to inspect the thermal throttling settings, monitor the SSD temperature during operation, and ascertain the activation status of thermal throttling.
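One way to check whether thermal throttling has been active is to look at the thermal fields of the SMART / Health Information log page. The Python sketch below parses those fields from a raw log buffer. The byte offsets are our reading of the NVMe Base Specification (Composite Temperature at bytes 2:1, the TMT1/TMT2 transition counts and total times at bytes 231:216) and should be verified against the revision your drive implements. The function name is hypothetical, and obtaining the raw log from a device is outside the scope of this sketch, so the example feeds in a synthetic buffer.

```python
import struct


def parse_thermal_fields(smart_log: bytes) -> dict:
    """Extract thermal fields from a raw SMART / Health log (assumed offsets)."""
    if len(smart_log) < 232:
        raise ValueError("SMART log buffer too short")
    # Bytes 2:1: Composite Temperature, little-endian, in kelvins.
    (composite_k,) = struct.unpack_from("<H", smart_log, 1)
    # Bytes 231:216: four little-endian 32-bit thermal management counters.
    tmt1_count, tmt2_count, tmt1_secs, tmt2_secs = struct.unpack_from(
        "<4I", smart_log, 216
    )
    return {
        "composite_c": composite_k - 273,
        "tmt1_transition_count": tmt1_count,
        "tmt2_transition_count": tmt2_count,
        "tmt1_total_seconds": tmt1_secs,
        "tmt2_total_seconds": tmt2_secs,
    }


if __name__ == "__main__":
    # Synthetic 512-byte log: 310 K, three TMT1 transitions, 120 s in TMT1 state.
    log = bytearray(512)
    struct.pack_into("<H", log, 1, 310)
    struct.pack_into("<4I", log, 216, 3, 0, 120, 0)
    print(parse_thermal_fields(bytes(log)))
```

If the transition counts increase while a workload runs, the drive has crossed its thresholds during that run, which is a hint to look at cooling before looking elsewhere for the performance drop.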

Reference / Note

  1. NVM Express, “NVM Express™ Base Specification,” Revision 1.3d, March 20, 2019.
  2. The NVMe Base Specification refers to this temperature as the “Composite Temperature.”

Trademarks of Other Companies
Although registered trademark marks are not indicated in the articles, company names and product names appearing in the articles are generally trademarks or registered trademarks of the respective companies.

About the article
The content of this article is information at the time of publication. Please note that the information is subject to change without notice.
