Skip to content
Draft
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions build/i2c/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1381,6 +1381,8 @@ impl ConfigGenerator {
all.sort();

for (rail, (device, index)) in &all {
// if we update this code to be more clever than just to-lowercase'ing the rail names,
// you might need to go update the mapping in `validate-api`!
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this is now in the control-plane-agent build script...

write!(
&mut self.output,
r##"
Expand Down Expand Up @@ -1776,6 +1778,7 @@ pub fn codegen(settings: impl Into<CodegenSettings>) -> Result<()> {
Ok(())
}

#[derive(Debug, Clone)]
pub struct I2cDeviceDescription {
pub device: String,
pub description: String,
Expand Down
82 changes: 76 additions & 6 deletions drv/i2c-devices/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -66,24 +66,31 @@ macro_rules! pmbus_read {
}

macro_rules! pmbus_rail_read {
($device:expr, $rail:expr, $cmd:ident) => {
(@raw => $device:expr, $rail:expr, $cmd_code:expr, $len:expr $(,)?) => {
$device
.write_read_reg::<u8, [u8; $cmd::CommandData::len()]>(
$cmd::CommandData::code(),
.write_read_reg::<u8, [u8; $len]>(
$cmd_code,
&[PAGE::CommandData::code(), $rail],
)
.map_err(|code| Error::BadRead {
cmd: $cmd::CommandData::code(),
cmd: $cmd_code,
code,
})
};

($device:expr, $rail:expr, $cmd:ident $(,)?) => {{
let cmd_code = $cmd::CommandData::code();
const CMD_LEN: usize = $cmd::CommandData::len();

pmbus_rail_read!(@raw => $device, $rail, cmd_code, CMD_LEN)
.and_then(|rval| {
$cmd::CommandData::from_slice(&rval).ok_or(Error::BadData {
cmd: $cmd::CommandData::code(),
})
})
};
}};

($device:expr, $rail:expr, $dev:ident::$cmd:ident) => {{
($device:expr, $rail:expr, $dev:ident::$cmd:ident $(,)?) => {{
use $dev::{PAGE, $cmd};
pmbus_rail_read!($device, $rail, $cmd)
}};
Expand Down Expand Up @@ -271,6 +278,69 @@ pub trait Validate<T: core::convert::Into<drv_i2c_api::ResponseCode>> {
}
}

// grumble grumble, copied from `gateway_messages::sp_to_mgs::PmbusStatus`
// grumble grumble, also basically the same as `ereports/src/pwr`
pub struct PmbusStatus {
pub status_word: u16,
pub status_vout: Result<u8, PmbusStatusError>,
pub status_iout: Result<u8, PmbusStatusError>,
pub status_temperature: Result<u8, PmbusStatusError>,
pub status_cml: Result<u8, PmbusStatusError>,
pub status_other: Result<u8, PmbusStatusError>,
pub status_input: Result<u8, PmbusStatusError>,
pub status_mfr_specific: Result<u8, PmbusStatusError>,
pub status_fans_1_2: Result<u8, PmbusStatusError>,
pub status_fans_3_4: Result<u8, PmbusStatusError>,
}

/// An error for querying PMBus `STATUS_*` registers.
#[derive(Copy, Clone, Debug, PartialEq)]
pub enum PmbusStatusError {
BadRead { cmd: u8, code: ResponseCode },
BadData { cmd: u8, },
}

impl PmbusStatus {
/// Attempt to read a [`PmbusStatus`] from the given device and rail.
///
/// This function returns an error if the initial attempt to obtain `STATUS_WORD` from the
/// device fails, otherwise returnining successfully even if "leaf" status bytes were unable
/// to be read, either due to ephemeral hiccups, or that status byte being unsupported by
/// the device queried.
Comment on lines +306 to +309
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely convinced this behavior for STATUS_WORD is correct. If you look at the bits in STATUS_WORD, you'll see that it's basically a summary of which other registers have bits set --- there's a bit in STATUS_WORD that is set if any fault bit in STATUS_IOUT, one for faults in STATUS_VOUT, and so on. So, we actually can reasonably interpret all the other registers even if we're missing STATUS_WORD.

That said, I think it's also reasonable-ish to fail fast if we get an error that indicates there is no device to talk to at all; if we get a NAK that indicates its not present, we don't need to spend more time doing all the other I2C reads. I dunno.

pub fn try_read_from(dev: &I2cDevice, rail_idx: u8) -> Result<Self, PmbusStatusError> {
use pmbus::commands::*;
use PmbusStatusError as Error;

Ok(PmbusStatus {
// Status word *must* succeed, otherwise we don't have reasonable data to return.
// We may want to consider making some/all of these retryable, but for now you either
// get them or you don't.
status_word: pmbus_rail_read!(dev, rail_idx, STATUS_WORD)?.0,
status_vout: pmbus_rail_read!(dev, rail_idx, STATUS_VOUT).map(|v| v.0),
status_iout: pmbus_rail_read!(dev, rail_idx, STATUS_IOUT).map(|v| v.0),
status_temperature: pmbus_rail_read!(dev, rail_idx, STATUS_TEMPERATURE).map(|v| v.0),
status_cml: pmbus_rail_read!(dev, rail_idx, STATUS_CML).map(|v| v.0),
status_other: pmbus_rail_read!(dev, rail_idx, STATUS_OTHER).map(|v| v.0),
status_input: pmbus_rail_read!(dev, rail_idx, STATUS_INPUT).map(|v| v.0),
status_fans_1_2: pmbus_rail_read!(dev, rail_idx, STATUS_FANS_1_2).map(|v| v.0),
status_fans_3_4: pmbus_rail_read!(dev, rail_idx, STATUS_FANS_3_4).map(|v| v.0),

// Unfortunately, STATUS_MFR_SPECIFIC *is* defined in the pmbus crate, but doesn't have a
// "structured" representation, so instead use a raw accessor. It *could* be argued
// that since we're not actually peeking at any of the introspection stuff, we could do
// the same for all the items above, and save ourselves a little indirection, but for now
// just hole-punch the minimum amount necessary. Indexing is fine here because we know
// (statically) the returned value is `[u8; 1]`.
status_mfr_specific: pmbus_rail_read!(
@raw => dev,
rail_idx,
CommandCode::STATUS_MFR_SPECIFIC as u8,
1,
).map(|v| v[0]),
})
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love that this will bail out if any read returns an error. I think it would be much better to change the PmbusStatus type so that each register's field was a Result<u8, i2c::ResponseCode> or something. That way, we can always communicate whatever registers we were able to successfully read to upstack software even if we encounter some weird I2C bus weather or something. Also, if (as I mentioned in my other comment) a device without fans does choose to NAK one of the STATUS_FANS registers, that doesn't result in us being permanently unable to read its status.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do! I wondered whether to make these Options or Results, and I should definitely do one of those.

I wonder if we could capture "this device supports these statuses" in the pmbus crate and maybe codegen based on this so we don't spend time doing I2C we know is going to be NAK'd, but this could also be a good follow-up.

Since we're not using device-specific methods here, I would imagine we would want to pass in some kind of bit-flag of which ones to check or not, and just have a None or Err(Unsupported).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could capture "this device supports these statuses" in the pmbus crate and maybe codegen based on this so we don't spend time doing I2C we know is going to be NAK'd, but this could also be a good follow-up.

Yeah, we could either do this in the pmbus crate or in the codegen itself. Since we have a string saying which device it is, the codegen could have its own understanding of which registers to skip for certain devices. It probably should fall back to querying all registers for device strings it doesn't recognize, maybe along with a build warning.

I do worry slightly about a device firmware revision changing its behavior to something that doesn't match the codegen's understanding, but that feels pretty unlikely. For instance, an RAA229620A VRM is not going to suddenly grow fans if its firmware is updated!


Incidentally, there is another option, which is the PMBus QUERY command (§11.13 PMBus Specification, Part II, Version 1.3.1). This allows you to ask a device whether it supports a particular PMBus command, such as STATUS_FANS_1_2 and get back a response that tells you whether the device supports it. I'm not sure how useful this is to us here, though --- if the goal is mostly to avoid I2C traffic for unsupported registers, using QUERY to ask "hey, do you have STATUS_FANS_1_2? okay, do you have STATUS_FANS_3_4?" results in just as much I2C traffic if the device doesn't have those registers, and more if it does. We could probably avoid that by QUERYing once and caching the result somewhere, but the current code is not structured in a way that gives us a place to stash that, so it would be a bit more complex.

At the end of the day, I just don't really think that's particularly worthwhile: since we know all the devices on the board at build time, we can just decide which registers to read when we do the code generation, and we don't have to dynamically detect capabilities. I think that QUERY is more intended for the use case of commodity motherboards which might end up talking to any arbitrary commodity PSUs, where the firmware doesn't know what PMBus commands are supported until the system boots up, and therefore must dynamically detect them. But, I felt like it was worth mentioning, if only to write down why we probably don't need to use it.

}
}

pub mod adm127x;
pub mod adt7420;
pub mod at24csw080;
Expand Down
2 changes: 2 additions & 0 deletions task/control-plane-agent/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ drv-caboose = { path = "../../drv/caboose" }
drv-caboose-pos = { path = "../../drv/caboose-pos" }
drv-hf-api = { path = "../../drv/hf-api", optional = true }
drv-cpu-seq-api = { path = "../../drv/cpu-seq-api", optional = true }
drv-i2c-api = { path = "../../drv/i2c-api" }
drv-ignition-api = { path = "../../drv/ignition-api", optional = true }
drv-lpc55-update-api = { path = "../../drv/lpc55-update-api" }
drv-monorail-api = { path = "../../drv/monorail-api", optional = true }
Expand Down Expand Up @@ -53,6 +54,7 @@ userlib = { path = "../../sys/userlib", features = ["panic-messages"] }
static-cell = { path = "../../lib/static-cell" }

[build-dependencies]
build-i2c = { path = "../../build/i2c" }
build-util = { path = "../../build/util" }
anyhow = { workspace = true }
idol = { workspace = true }
Expand Down
58 changes: 58 additions & 0 deletions task/control-plane-agent/build.rs
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,64 @@ fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
if let Some(cfg) = cfg {
write_keys(cfg)?;
}

do_pmbus()?;

Ok(())
}

fn do_pmbus() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
let out_dir = std::env::var("OUT_DIR")?;
let dest_path =
std::path::Path::new(&out_dir).join("pmbus_mapping.rs");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unimportant turbo-nit: std::path::PathBuf is imported at the top of the file, so perhaps we should just change that import to use std::path::{PathBuf, Path} and change this to

Suggested change
std::path::Path::new(&out_dir).join("pmbus_mapping.rs");
Path::new(&out_dir).join("pmbus_mapping.rs");

let file = std::fs::File::create(&dest_path)?;
let mut file = std::io::BufWriter::new(file);

// Generate the necessary rail names
if let Err(e) = build_i2c::codegen(build_i2c::Disposition::Devices) {
println!("cargo::error=failed to generate I2C devices: {e}");
std::process::exit(1);
}
Comment on lines +51 to +55
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpicky: i might put this part in main rather than in the pmbus-specific codegen? we will want it if other stuff also starts wanting i2c config in control-plane-agent...


let devices = build_i2c::device_descriptions()
.collect::<Vec<_>>();

let mut pmbus_rail_names = std::collections::BTreeSet::new();
let mut pmbus_rail_dupes = 0;
for dev in devices {
Comment on lines +57 to +62
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since all we do with devices is iterate over it, and build_i2c::device_descriptions() returns an iterator, i think the collect is unnecessary:

Suggested change
let devices = build_i2c::device_descriptions()
.collect::<Vec<_>>();
let mut pmbus_rail_names = std::collections::BTreeSet::new();
let mut pmbus_rail_dupes = 0;
for dev in devices {
let devices = build_i2c::device_descriptions();
let mut pmbus_rail_names = std::collections::BTreeSet::new();
let mut pmbus_rail_dupes = 0;
for dev in devices {

or even just:

Suggested change
let devices = build_i2c::device_descriptions()
.collect::<Vec<_>>();
let mut pmbus_rail_names = std::collections::BTreeSet::new();
let mut pmbus_rail_dupes = 0;
for dev in devices {
let mut pmbus_rail_names = std::collections::BTreeSet::new();
let mut pmbus_rail_dupes = 0;
for dev in build_i2c::device_descriptions() {

i doubt the build time hit for iterating twice and allocating here is all that noticeable, but...every nanosecond counts, i guess?

// We only need to map PMBus devices
let Some(ref pmbus) = dev.pmbus else {
continue;
};

// pmbus devices must have a refdes
assert!(dev.device_id.is_some());
Comment on lines +68 to +69
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we actually care about this here?


// Aggregate a list of all PMBus-visible rails
for rail in pmbus.rails.iter() {
// `BTreeSet::insert` return value means "is unique", which is the inverse of
// `BTreeMap::insert().is_some()`!
if !pmbus_rail_names.insert(rail.name.clone()) {
pmbus_rail_dupes += 1;
println!("cargo::error=Duplicate Rail name: {:?}", rail.name);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this error message is what the programmer will see if they have made an error in the app.toml. i think we can probably improve the user experience here a bit, at least by making the following changes:

  • downcasing the first letter to match other cargo error messages
  • including the name of the PMBus device that has the duplicate rail.
  • it would probably be nice if we could also say the refdes of the other device that has that rail, so we could say "both U420 and U69 define a power rail called V69P7_TURBOENCABULATOR_A2, your config file is invalid", but that would require a bigger change to also track which device a rail name in the map came from...
Suggested change
println!("cargo::error=Duplicate Rail name: {:?}", rail.name);
println!("cargo::error=PMBus device {} defines a power rail {:?} which already exists in the manifest", dev.device_id, rail.name);

or something like that?

}
}
}

// Create a mapping between rail names and generated accessor functions for obtaining
// the device handle and rail index
writeln!(file)?;
writeln!(file, "pub const PMBUS_RAIL_TO_I2C_DEVICE_MAP: [(&[u8], fn(userlib::TaskId) -> (drv_i2c_api::I2cDevice, u8)); {}] = [", pmbus_rail_names.len())?;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, take it or leave it: part of me wants to suggest that this be a struct rather than a tuple, in the hopes that the non-generated code consuming the LUT is a bit more readable. but i'm not sure if this is worth the ffort or not. up to you.

for rail in pmbus_rail_names.iter() {
write!(file, " (b\"{rail}\", ")?;
// build_i2c *also* only to-lowercases the rail names to make functions
write!(file, "crate::i2c_config::pmbus::{}", rail.to_lowercase())?;
Comment on lines +88 to +89
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eventually, if we wanted to be fancy, we could probably change up build-i2c so that it actually provides the name of the i2c_config::pmbus function for looking up that device, instead of relying on that code not changing. but also, that code is not going to change without breaking a bunch of other stuff, so 🤷‍♀️

could be worth adding to the backlog of "non-important non-blockers"...

writeln!(file, "),")?;
}
writeln!(file, "];")?;

assert_eq!(pmbus_rail_dupes, 0);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: it would be nice if this had a panic message that said something to the effect of "your config is invalid"...also, it's maybe worth checking whether panicking or returning an anyhow::Error here results in a nicer UX for error reporting in build scripts?


Ok(())
}

Expand Down
4 changes: 4 additions & 0 deletions task/control-plane-agent/src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -627,3 +627,7 @@ mod idl {
}

include!(concat!(env!("OUT_DIR"), "/notifications.rs"));

include!(concat!(env!("OUT_DIR"), "/i2c_config.rs"));

include!(concat!(env!("OUT_DIR"), "/pmbus_mapping.rs"));
Loading