While running LLM workloads via furiosa-llm, I can observe power draw above 180 W via furiosa-smi info. However, the TDP of RNGD is known to be 180 W (some even say 150 W). Could you elaborate on what furiosa-smi is measuring?
Hi,
Does the power draw stay above 180 W constantly, or do you only observe occasional power spikes?
furiosa-smi continuously measures the actual, real-time power being drawn by the chip at each specific moment.
If you are observing transient peak power, that is common and expected behavior. The key point is that TDP (Thermal Design Power) is not the same as maximum power consumption.
Thermal design power (TDP), also known as thermal design point, is the maximum amount of heat that a computer component (like a CPU, GPU or system on a chip) can generate and that its cooling system is designed to dissipate during normal operation at a non-turbo clock rate (base frequency).
Some sources state that the peak power rating for a microprocessor is usually 1.5 times the TDP rating.[1] Graphics processing units are known to have even larger discrepancies between peak and TDP.[2]
However, if you are consistently observing more than 180 W, that is not a usual situation. Please let me know if that is the case.
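As a rough way to tell the two cases apart, you could log the power readings at a fixed interval and compare the average against the peak. The following is only a minimal sketch: it assumes you have already collected the readings (in watts) into a list by polling furiosa-smi yourself, and it does not parse furiosa-smi output. The function names summarize_power and looks_sustained and the sample values are hypothetical, chosen for illustration.

```python
from statistics import mean

TDP_WATTS = 180.0  # RNGD TDP as stated in this thread


def summarize_power(samples_w, tdp_w=TDP_WATTS):
    """Summarize power readings (watts) taken at a fixed sampling interval."""
    if not samples_w:
        raise ValueError("need at least one power sample")
    over = [w for w in samples_w if w > tdp_w]
    return {
        "average_w": mean(samples_w),
        "peak_w": max(samples_w),
        # Share of samples above TDP; small values suggest brief spikes.
        "fraction_over_tdp": len(over) / len(samples_w),
    }


def looks_sustained(summary, tdp_w=TDP_WATTS):
    """True when the average itself exceeds TDP, not just individual spikes."""
    return summary["average_w"] > tdp_w


# Hypothetical readings for illustration only (not measured data).
spiky = [150.0, 160.0, 240.0, 155.0, 150.0]
steady = [185.0, 190.0, 188.0, 192.0, 190.0]

print(summarize_power(spiky), looks_sustained(summarize_power(spiky)))
print(summarize_power(steady), looks_sustained(summarize_power(steady)))
```

In the first list a single sample is far above 180 W but the average stays below it, which matches the transient-peak case. In the second list the average itself is above 180 W, which is the case worth reporting.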
I am benchmarking furiosa-llm with the benchmark code in furiosa-ai/vllm. Power monitoring via furiosa-smi is included in the script, and in some of the results the average power captured over the whole benchmark is above 180 W.
I think I might have misunderstood the concept of TDP, since NVIDIA GPUs usually list max power consumption in their specifications. Could you please share the maximum power consumption for the RNGD if possible? I couldn’t find this information in the available documentation.
Thank you for your inquiry.
The TDP for the RNGD is specified at 180 W. Please allow me to clarify the terminology: TDP (Thermal Design Power) refers broadly to the power level for which the cooling system is designed under sustained heavy load. Definitions may vary slightly across companies, but this interpretation aligns with industry convention. However, it does not equal the maximum instantaneous power draw.
In our internal benchmarking, most of the targeted workloads show a long-term average power consumption of around 180 W. However, we have observed certain workloads where the average has reached approximately 190 W. We are aware of these cases and are treating them as part of our ongoing optimization efforts.
As the hardware is currently at the ES (Engineering Sample) stage, power optimization is still in progress, and we also plan to include power-capping functionality in upcoming firmware and driver updates. By the time of GA (General Availability), our objective is to ensure that heavy workloads operate with a long-term average power draw within the 180 W specification envelope.
Please feel free to let us know if you have any further questions or would like additional detail.
I understand that TDP can be configured differently depending on the hardware setup, and that power optimization is ongoing for the current target workloads. Thank you for clarifying the terminology.
Would it then be reasonable to assume that the maximum power draw of a single RNGD card is around 300 W to 450 W, since it appears that a single RNGD card requires a 600 W power cable (12VHPWR)?
Many other hardware products list their maximum power draw (sometimes called maximum TDP) in their specifications, so I'd like to ask in order to have a comparable reference point across different devices.
Thank you for helping clarify this point.
At the current stage, the RNGD can draw up to 300 W instantaneously under certain peak load conditions.
As mentioned earlier, our plan is to further limit this maximum power draw. By the GA (General Availability) stage, we intend to implement tighter control so that both average and peak power consumption remain within the defined TDP envelope.
Thanks for letting me know.
Achieving such a significant reduction in power consumption without compromising performance is remarkable. It appears to be the result of RNGD’s unique TCP architecture and extensive optimization efforts. I’m looking forward to seeing further developments!
Thank you for your kind words and encouragement!
The peaks mentioned occur only in a small subset of operators, and thus for most heavy workloads, the average power consumption remains around 180 W. For these peak cases, the compiler is being enhanced to perform energy-efficient scheduling within power constraints, with minimal performance impact. This feature is currently under internal validation and will be released once verification is complete.