expose PMBus device info from build-i2c codegen; set IS_PMBUS#2519
expose PMBus device info from build-i2c codegen; set IS_PMBUS#2519hawkw wants to merge 6 commits into
build-i2c codegen; set IS_PMBUS#2519Conversation
This commit updates our Git dep on the `management-gateway-service` repo to oxidecomputer/management-gateway-service@745a508. This is primarily in order to pick up oxidecomputer/management-gateway-service#488, which adds new `DeviceCapabilities` bits for PMBus devices and for devices which have VPD.
This commit adds a new `PmbusDeviceDescription` to the `build-i2c` crate. `I2cDeviceDescription` now includes an optional `PmbusDeviceDescription` field which is `Some` for PMBus devices. The presence of this field can be used to determine when a device is a PMBus device, such as when determining if the `IS_PMBUS` capability should be advertised. I did this by adding a whole new optional struct rather than just sticking an `is_pmbus` bool on `I2cDeviceDescription`, because I felt like future code, such as what @jamesmunns is working on for #2463, will almost certainly want a way to get a PMBus device's rail and phase names in codegen that lives outside of `build-i2c` (i.e. in `task-validate-api`'s build script). Currently, there isn't a nice way to get this stuff in codegen that lives outside of `build-i2c`, so this should help make that nicer.
Now that we have the appropriate information available at codegen-time to know if an I2C device is a PMbus device, we can actually set this flag. Yay! Now the control plane can know which devices are PMBus!
732ab05 to
9526186
Compare
build-i2c codegen; set IS_PMBUS
| .flatten() | ||
| .enumerate() | ||
| .map(|(i, rail)| { | ||
| let phases = if let Some(ref phases_list) = power.phases |
There was a problem hiding this comment.
nit: Might be worth hoisting this up to the top level, and doing one arm for "has phases" and one for "doesn't have phases"? Then we could do a single assert for "lengths match", and then zip together rails and phases_list, instead of doing .get/indexing by i.
There was a problem hiding this comment.
hmm, so that ends up looking like this:
diff --git a/build/i2c/src/lib.rs b/build/i2c/src/lib.rs
index 3ca2e1487..a4201fe57 100644
--- a/build/i2c/src/lib.rs
+++ b/build/i2c/src/lib.rs
@@ -1827,35 +1827,29 @@ pub fn device_descriptions() -> impl Iterator<Item = I2cDeviceDescription> {
return None;
}
- let rails = power
- .rails
- .iter()
- .flatten()
- .enumerate()
- .map(|(i, rail)| {
- let phases = if let Some(ref phases_list) = power.phases
- {
- match phases_list.get(i) {
- Some(phases) => phases.clone(),
- None => {
- // The rails and phases arrays are expected to
- // be the same length; if this is not the case,
- // the config file is malformed!
- panic!(
- "PMBus device {device_id:?} is missing
- phases for its {i}th rail ({rail})",
- );
- }
- }
- } else {
- Vec::new()
- };
- PmbusRailDescription {
- name: rail.clone(),
- phases,
- }
- })
- .collect::<Vec<_>>();
+ let rails = power.rails.as_ref().map(|rails| {
+ if let Some(ref phases) = power.phases {
+ assert_eq!(rails.len(), phases.len(),
+ "PMBus device {device_id:?}'s `power.rails` and \
+ `power.phases` lists are not the same length"
+ );
+ rails.iter()
+ .cloned()
+ .zip(phases.iter().cloned())
+ .map(|(name, phases)| PmbusRailDescription {
+ name,
+ phases,
+ })
+ .collect()
+ } else {
+ rails.iter().cloned()
+ .map(|name| PmbusRailDescription {
+ name,
+ phases: Vec::new(),
+ })
+ .collect()
+ }
+ }).unwrap_or_default();
Some(PmbusDeviceDescription { rails })
});
which, incidentally, rustfmt decided to kinda give up on lol. Do you think that's nicer? because I'm happy to change it if you do...
There was a problem hiding this comment.
Here's my take on it, and I actually think this makes one thing a bit more clear: what we do when EITHER rails or power vecs are missing (right now unless we have both we return Some(vec![]):
diff --git a/build/i2c/src/lib.rs b/build/i2c/src/lib.rs
index 3ca2e1487..f6dcd0b43 100644
--- a/build/i2c/src/lib.rs
+++ b/build/i2c/src/lib.rs
@@ -1827,35 +1827,31 @@ pub fn device_descriptions() -> impl Iterator<Item = I2cDeviceDescription> {
return None;
}
- let rails = power
- .rails
+ // If we're missing rails OR phases, we have no rails.
+ //
+ // TODO: Do we want to match on this? Are we actually fine with 3/4 match arms just
+ // giving up and returning `Some(PmbusDeviceDescription { rails: vec![] })`?
+ let (Some(rails), Some(phases)) = (power.rails.as_ref(), power.phases.as_ref()) else {
+ return Some(PmbusDeviceDescription { rails: vec![] })
+ };
+
+ // If we have rails and phases, they MUST match each other's lengths
+ assert_eq!(
+ rails.len(),
+ phases.len(),
+ "PMBus device {device_id:?} has mismatched rails and phases",
+ );
+
+ // A phase for every rail, and every rail in its phase
+ let rails = rails
.iter()
- .flatten()
- .enumerate()
- .map(|(i, rail)| {
- let phases = if let Some(ref phases_list) = power.phases
- {
- match phases_list.get(i) {
- Some(phases) => phases.clone(),
- None => {
- // The rails and phases arrays are expected to
- // be the same length; if this is not the case,
- // the config file is malformed!
- panic!(
- "PMBus device {device_id:?} is missing
- phases for its {i}th rail ({rail})",
- );
- }
- }
- } else {
- Vec::new()
- };
- PmbusRailDescription {
- name: rail.clone(),
- phases,
- }
+ .zip(phases.iter())
+ .map(|(r, p)| PmbusRailDescription {
+ name: r.clone(),
+ phases: p.clone(),
})
- .collect::<Vec<_>>();
+ .collect();
+
Some(PmbusDeviceDescription { rails })
});Less nested:
// If we're missing rails OR phases, we have no rails.
//
// TODO: Do we want to match on this? Are we actually fine with 3/4 match arms just
// giving up and returning `Some(PmbusDeviceDescription { rails: vec![] })`?
let (Some(rails), Some(phases)) = (power.rails.as_ref(), power.phases.as_ref()) else {
return Some(PmbusDeviceDescription { rails: vec![] })
};
// If we have rails and phases, they MUST match each other's lengths
assert_eq!(
rails.len(),
phases.len(),
"PMBus device {device_id:?} has mismatched rails and phases",
);
// A phase for every rail, and every rail in its phase
let rails = rails
.iter()
.zip(phases.iter())
.map(|(r, p)| PmbusRailDescription {
name: r.clone(),
phases: p.clone(),
})
.collect();
Some(PmbusDeviceDescription { rails })There was a problem hiding this comment.
Hm, your code snippet implements different (and in this case, incorrect) behavior than what mine does. Not all devices have multiple phases, some devices have a list of rails and no corresponding list of phases. The requirement is that if there is a list of phases, it must be as long as the list of rails.
|
@hawkw This generally looks good; can you do a smoke test to make sure this didn't break |
|
what the heck happened here: https://github.com/oxidecomputer/hubris/actions/runs/26239754200/job/77222643916?pr=2519#step:4:288 |
I've been seeing these pretty often in other PRs. I'm not sure if it's some kind of cache failure, a build getting cancelled when it shouldn't, or just GitHub jank. It's starting to happen often enough to worry about imo. |
|
sigh... |
|
@mkeeter well, eliza@alfred ~ $ pfexec humility flash
humility: WARNING: archive in environment variable overriding archive in environment file
humility: attached to archive
humility: attaching with chip set to "STM32H753ZITx"
humility: attached to 1fc9:0143:XJACFXSEKTQJS via CMSIS-DAP
humility: flash/archive mismatch; reflashing
humility: skipping measurement token handoff
humility: erasing slot 12
humility: done
humility: skipping measurement token handoff
humility: resetting the SP, please wait...
humility: done with auxiliary flash
humility: flashing done
eliza@alfred ~ $ pfexec humility validate
humility: WARNING: archive in environment variable overriding archive in environment file
humility: attached to 1fc9:0143:XJACFXSEKTQJS via CMSIS-DAP
ID VALIDATION C P MUX ADDR DEVICE DESCRIPTION
0 present 1 B - 0x70 oximux16 Front FPGA virtual mux
1 validated 1 B 1:1 0x50 at24csw080 U.2 Sharkfin A VPD
2 validated 1 B 1:1 0x38 max5970 U.2 Sharkfin A hot swap controller
3 removed 1 B 1:1 0x6a nvme_bmc U.2 A NVMe Basic Management Command
4 validated 1 B 1:2 0x50 at24csw080 U.2 Sharkfin B VPD
5 validated 1 B 1:2 0x38 max5970 U.2 Sharkfin B hot swap controller
6 removed 1 B 1:2 0x6a nvme_bmc U.2 B NVMe Basic Management Control
7 validated 1 B 1:3 0x50 at24csw080 U.2 Sharkfin C VPD
8 validated 1 B 1:3 0x38 max5970 U.2 Sharkfin C hot swap controller
9 removed 1 B 1:3 0x6a nvme_bmc U.2 C NVMe Basic Management Control
10 validated 1 B 1:4 0x50 at24csw080 U.2 Sharkfin D VPD
11 validated 1 B 1:4 0x38 max5970 U.2 Sharkfin D hot swap controller
12 removed 1 B 1:4 0x6a nvme_bmc U.2 D NVMe Basic Management Control
13 validated 1 B 1:5 0x50 at24csw080 U.2 Sharkfin E VPD
14 validated 1 B 1:5 0x38 max5970 U.2 Sharkfin E hot swap controller
15 removed 1 B 1:5 0x6a nvme_bmc U.2 E NVMe Basic Management Control
16 validated 1 B 1:6 0x50 at24csw080 U.2 Sharkfin F VPD
17 validated 1 B 1:6 0x38 max5970 U.2 Sharkfin F hot swap controller
18 removed 1 B 1:6 0x6a nvme_bmc U.2 F NVMe Basic Management Control
19 validated 1 B 1:9 0x50 at24csw080 U.2 Sharkfin G VPD
20 validated 1 B 1:9 0x38 max5970 U.2 Sharkfin G hot swap controller
21 removed 1 B 1:9 0x6a nvme_bmc U.2 G NVMe Basic Management Control
22 validated 1 B 1:10 0x50 at24csw080 U.2 Sharkfin H VPD
23 validated 1 B 1:10 0x38 max5970 U.2 Sharkfin H hot swap controller
24 removed 1 B 1:10 0x6a nvme_bmc U.2 H NVMe Basic Management Control
25 validated 1 B 1:11 0x50 at24csw080 U.2 Sharkfin I VPD
26 validated 1 B 1:11 0x38 max5970 U.2 Sharkfin I hot swap controller
27 removed 1 B 1:11 0x6a nvme_bmc U.2 I NVMe Basic Management Control
28 validated 1 B 1:12 0x50 at24csw080 U.2 Sharkfin J VPD
29 validated 1 B 1:12 0x38 max5970 U.2 Sharkfin J hot swap controller
30 removed 1 B 1:12 0x6a nvme_bmc U.2 J NVMe Basic Management Control
31 validated 1 B 1:13 0x48 tmp117 Southwest temperature sensor
32 validated 1 B 1:13 0x49 tmp117 South temperature sensor
33 validated 1 B 1:13 0x4a tmp117 Southeast temperature sensor
34 present 2 F - 0x70 oximux16 Main FPGA virtual mux
35 removed 2 F 1:1 0x6a nvme_bmc M.2 A NVMe Basic Management Command
36 removed 2 F 1:2 0x6a nvme_bmc M.2 B NVMe Basic Management Command
37 validated 2 F 1:4 0x3c sbrmi CPU via SB-RMI
38 validated 2 F 1:4 0x4c sbtsi CPU temperature sensor
39 validated 2 F 1:7 0x50 at24csw080 Fan VPD
40 validated 2 F 1:8 0x4c tmp451 T6 temperature sensor
41 validated 3 H - 0x24 tps546b24a A2 3.3V rail
42 validated 3 H - 0x27 tps546b24a A2 5V rail
43 validated 3 H - 0x29 tps546b24a A2 1.8V rail
44 validated 3 H - 0x3a max5970 M.2 hot plug controller
45 validated 3 H - 0x54 ltc4282 12V MCIO hot plug controller
46 validated 3 H - 0x56 ltc4282 DIMM GHIJKL hot plug controller
47 validated 3 H - 0x55 ltc4282 DIMM ABCDEF hot plug controller
48 validated 3 H - 0x75 raa229620a South power controller (Core 0, SOC)
49 validated 3 H - 0x76 raa229620a North power controller (Core 1, VDDIO)
50 validated 3 H - 0x5c isl68224 SP5 power controller (V1P1, V1P8, V3P3)
51 validated 4 F - 0x39 max5970 NIC hot swap
52 validated 4 F - 0x25 tps546b24a T6 power controller
53 validated 4 F - 0x48 tmp117 Northwest temperature sensor
54 validated 4 F - 0x49 tmp117 North temperature sensor
55 validated 4 F - 0x4a tmp117 Northeast temperature sensor
56 validated 4 F - 0x20 max31790 Fan controller
57 validated 4 F - 0x67 bmr491 Intermediate bus converter
58 validated 4 F - 0x50 at24csw080 Cosmo VPD
59 validated 4 F - 0x11 lm5066i Fan hot swap controller (east)
60 validated 4 F - 0x12 lm5066i Fan hot swap controller (central)
61 validated 4 F - 0x13 lm5066i Fan hot swap controller (west)
62 validated 4 F - 0x14 adm127x Sled hot swap controller
eliza@alfred ~ $I haven't yet gotten as far as testing that MGS doesn't get sad upon discovering the new bits (which I sure hope it wouldn't), since the only lab systems that have management network are currently busy doing other things again (cf oxidecomputer/meta#984). I'll see if I can get ahold of something to test against... |
As suggested by @jamesmunns in [this comment][1]. I think this is a bit easier to follow now, thanks James! [1]: #2519 (comment)
This branch consists of three separate commits which build upon each other, and which I would like to rebase-merge as three separate commits to
master. Overall, the primary goal here is to start populating theIS_PMBUSbit in the control-plane-agent inventory'sDeviceCapabilities, which was added in oxidecomputer/management-gateway-service#488. In addition, I also tried to do some prep for future changes that want to generate more code for all PMBus devices, such as what @jamesmunns is working on in #2463.In particular, the three commits do the following:
Update the
management-gateway-servicedep to pick up the newDeviceCapabilitiesflagsAdd code in
build-i2c's public API that is exposed to other build scripts (thedevice_descriptions()API) to indicate which devices are PMBus devices. In particular,I2cDeviceDescriptionnow includes an optionalPmbusDeviceDescriptionfield, which isSomefor PMBus devices. The presence of this field can be used to determine when a device is a PMBus device, such as when determining if theIS_PMBUScapability should be advertised.I did this by adding a whole new optional struct rather than just sticking an
is_pmbusbool onI2cDeviceDescription, because I felt like future code, such as what @jamesmunns is working on for want to read PMBus status registers via MGS #2463, will almost certainly want a way to get a PMBus device's rail and phase names in codegen that lives outside ofbuild-i2c(i.e. intask-validate-api's build script). Currently, there isn't a nice wayto get this stuff in codegen that lives outside of
build-i2c, so thisshould help make that nicer.
Then, I've modified the codegen in
task-validate-api, which is whatcontrol-plane-agentultimately consumes for its codegen inventory devices, to include anis_pmbusfield so that we can use this to populate theDeviceCapabilitiesflag.Finally, actually set the flag in
control-plane-agent. :)