Environment
| Node | GPU SKU | Kernel | compute-runtime (`libze-intel-gpu1`) | L0 loader (`libze1`) | TW spikes |
| --- | --- | --- | --- | --- | --- |
| node-A | B70 (0x8086:E223) | 6.17.0-35-generic | 26.09.37435.12 | 1.28.0 | 49 |
| node-C | B70 (0x8086:E223) | 6.17.0-1009-intel | 26.09.37435.12 | 1.28.0 | 2 |
| node-D | B60 | 6.14.0-1011-intel | 25.40.35563.10 | 1.26.2 | 17 |
| node-B | B70 (0x8086:E223) | 6.17.0-1009-intel | 26.18.38308.1 | 1.28.2 | 0 |
OS: Ubuntu 24.04 | xpumanager: 1.3.5-20251216 | Monitoring: 54 hrs, all nodes idle (0% GPU util)
Observed behavior
`zesPowerGetEnergyCounter()` periodically returns energy values of ~100–300 J on GPU cards that have been continuously idle. The sysfs hwmon `energy*_input` counter for the same card at the same moment continues accumulating normally, which indicates the hardware energy counter itself is intact.
xpumd reads the L0 counter every ~5 s and computes power as `ediff / tdiff`. When the counter drops to ~200 J from ~22 MJ, the unsigned 64-bit subtraction wraps around to a value just below 2^64. The L0 counter is expressed in microjoules and its timestamp in microseconds, so dividing that wrapped difference by a ~5 s interval yields the ~3.69 TW result reported via `hw_power_watts`.
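
The wraparound arithmetic can be reproduced with a minimal sketch. This is not xpumd's actual code: the function names are hypothetical, the uint64 behaviour is emulated with a mask, and the previous reading of 22,543,000,000,000 µJ is an assumed value chosen to match the "~22 MJ" described above. The guarded variant shows one possible way a consumer could skip a sample when the counter goes backwards, offered only as an illustration.

```python
U64_MASK = (1 << 64) - 1  # emulate unsigned 64-bit wraparound


def naive_power_watts(prev_uj, curr_uj, prev_us, curr_us):
    """Power as ediff / tdiff, with both differences wrapped like uint64 math.

    Energy is in microjoules and time in microseconds, so uJ / us = W.
    """
    ediff = (curr_uj - prev_uj) & U64_MASK
    tdiff = (curr_us - prev_us) & U64_MASK
    return ediff / tdiff


def guarded_power_watts(prev_uj, curr_uj, prev_us, curr_us):
    """Return None instead of a bogus value when the counter or clock goes backwards."""
    if curr_uj < prev_uj or curr_us <= prev_us:
        return None  # counter reset or bad timestamp: skip this sample
    return (curr_uj - prev_uj) / (curr_us - prev_us)


if __name__ == "__main__":
    prev_uj = 22_543_000_000_000  # assumed previous L0 reading (~22.5 MJ)
    curr_uj = 209_800_000         # 209.8 J, as captured in the report
    interval_us = 5_000_000       # ~5 s polling interval

    # The naive result is on the order of 3.69e12 W.
    print(naive_power_watts(prev_uj, curr_uj, 0, interval_us))
    print(guarded_power_watts(prev_uj, curr_uj, 0, interval_us))
```

The naive path lands in the terawatt range because a wrapped difference near 2^64 µJ is divided by only ~5 × 10^6 µs. The guard hides the symptom in the consumer but does not explain why the L0 counter drops in the first place.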
Captured values at spike moment
| Time (UTC) | GPU (BDF) | `zesPowerGetEnergyCounter` (L0, power-1) | sysfs `energy1_input` | `hw_power_watts` |
| --- | --- | --- | --- | --- |
| 2026-06-09 19:44:10 | node-A 0001:6b:00.0 | 209.8 J | 22,543,283.9 J | 3,688,785,853,643 W |
| 2026-06-09 22:02:11 | node-A 0001:91:00.0 | 173.4 J | 23,329,654.5 J | 3,690,483,216,906 W |
Statistics (54-hour window)
- L0 counter resets: 345
- TW spike events: 68
- sysfs counter drops (node-C only): 67
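
The report does not say how these events were counted. As a rough sketch, assuming a "reset" or "drop" means any sample that is lower than the one before it, a logged series of counter readings could be tallied like this (the function name and the sample values are hypothetical):

```python
def count_counter_drops(samples_uj):
    """Count samples that are smaller than their immediate predecessor."""
    return sum(
        1 for prev, curr in zip(samples_uj, samples_uj[1:]) if curr < prev
    )


if __name__ == "__main__":
    # Illustrative readings only; the counter falls twice in this series.
    readings = [100, 200, 50, 60, 10]
    print(count_counter_drops(readings))
```

The same check applies to either source, the L0 counter or the sysfs `energy*_input` value, as long as each is sampled in order.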
Related: intel/xpumanager#130