Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
56 changes: 56 additions & 0 deletions dev-tools/ls-apis/README.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -149,6 +149,62 @@ This tool basically works as follows:
. Then, it loads and validates the developer-maintained metadata (`api-manifest.toml`).
. Then, it applies whatever filter has been selected and prints out whatever information was asked for.

=== Deployment units

The upgrade DAG's nodes are _deployment units_: sets of Rust packages that are always shipped and updated together. For example, Sled Agent, Propolis, and the switch-zone components all update as one unit (the host OS image), so they form a single deployment unit.

Deployment units and their packages are declared in `api-manifest.toml`:

```toml
[[deployment_units]]
name = "Host OS"
packages = [
"omicron-sled-agent",
"propolis-server",
# ...switch zone components...
]
```

Each entry in `packages` is a _top-level package_: a Rust binary shipped in that deployment unit. ls-apis walks each top-level package's Cargo dependencies to discover the APIs it produces and consumes.

==== Subcomponents

Sometimes one binary contains a body of code whose API dependencies are best tracked separately from the rest of it. The motivating case is the Rack Setup Service (RSS): it lives in the `sled-agent-rack-setup` library crate, which is linked into the `omicron-sled-agent` binary, but it runs only once (at rack initialization) and has different semantics from the rest of the Sled Agent.

A _subcomponent_ is such a library, tracked as an API consumer in its own right:

```toml
[[deployment_units]]
name = "Host OS"
packages = [ "omicron-sled-agent", ... ]
subcomponents = [
{ name = "sled-agent-rack-setup", inside = "omicron-sled-agent", lifecycle = "rack-init" },
]
```

`inside` names the top-level package (in the same deployment unit) that links the subcomponent. When ls-apis walks that package's dependencies, the subcomponent's own crate is skipped, so any API dependency reachable only through the subcomponent is attributed to the subcomponent instead of to the containing binary. (A dependency the binary also reaches by another path is attributed to both.) The subcomponent is then walked separately, as its own consumer.

A subcomponent is assumed not to host any Dropshot APIs of its own, so subcomponents are not walked as API producers.

==== Lifecycle

A component's `lifecycle` describes when its code runs:

[cols="1,3",options="header"]
|===
|Lifecycle
|Meaning

|`steady-state`
|The component runs during normal operation. Its API dependencies are part of the upgrade DAG. This is the default.

|`rack-init`
|The component runs only at rack initialization, when the whole rack is at a single consistent version. It therefore cannot experience version skew during an online upgrade, so its API dependencies are excluded from the upgrade DAG and from cycle checks.

|===

`lifecycle` may currently be set only on a subcomponent; every top-level package is treated as `steady-state`.

The filtering is a little complicated but very important!

=== The purpose of filtering
Expand Down
91 changes: 50 additions & 41 deletions dev-tools/ls-apis/api-manifest.toml
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,21 @@ packages = [
"tfportd",
"wicketd",
]
# sled-agent-rack-setup is a library inside the omicron-sled-agent binary. It is
# listed here as a subcomponent so its API dependencies (which only matter
# during RSS) are reported separately from the steady-state dependencies of
# omicron-sled-agent. Setting `inside = "omicron-sled-agent"` means that when
# ls-apis walks omicron-sled-agent's Cargo dependencies, sled-agent-rack-setup
# itself is skipped, so API dependencies reachable only through it are
# attributed to the subcomponent rather than to omicron-sled-agent.
#
# The `rack-init` lifecycle marks this subcomponent as running only during rack
# initialization. Edges from `rack-init` components are excluded from the
# upgrade DAG, so that sled-agent-rack-setup doesn't impose any ordering
# constraints on the upgrade DAG.
subcomponents = [
{ name = "sled-agent-rack-setup", inside = "omicron-sled-agent", lifecycle = "rack-init" },
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking for feedback on the inside thing. I tried autodiscovering the inside by looking at the dependency graph but honestly that was a complete mess that was hard to understand both in the code and in this manifest. I feel like having an explicit inside makes both layers easier to understand.

]

# Installinator gets packaged into its own host OS image.
[[deployment_units]]
Expand Down Expand Up @@ -389,10 +404,12 @@ server_package_name = "nexus-lockstep-api"
versioned_how = "client"
versioned_how_reason = "Circular dependencies between Nexus and other services"
# Exactly these consumers (Rust packages specified in one or more of the
# `packages` arrays in the deployment_units table above) are allowed to use this
# API. ls-apis check will error out if the list of consumers is incorrect.
# `packages` arrays in the deployment_units table above, or via `subcomponents`
# entries in the same table) are allowed to use this API. ls-apis check will
# error out if the list of consumers is incorrect.
restricted_to_consumers = [
{ name = "omicron-sled-agent", reason = "Only RSS and sled-agent-sim use this API, neither of which is part of upgrade" },
{ name = "omicron-sled-agent", reason = "Only sled-agent-sim uses this API, which is not part of upgrade" },
{ name = "sled-agent-rack-setup", reason = "Only RSS uses this API, which is not part of upgrade" },
]

[[apis]]
Expand Down Expand Up @@ -548,24 +565,6 @@ Sled Agent uses a Sled Agent client for the `zone-bundle` binary, which is not
deployed.
"""

[[dependency_filter_rules]]
ancestor = "sled-agent-rack-setup"
client = "cockroach-admin-client"
evaluation = "not-deployed"
note = """
The Cockroach Admin client is only used during RSS, which we don't care about
for the purpose of upgrade.
"""

[[dependency_filter_rules]]
ancestor = "sled-agent-rack-setup"
client = "dns-service-client"
evaluation = "not-deployed"
note = """
The DNS service client is only used during RSS, which we don't care about for
the purpose of upgrade.
"""

[[dependency_filter_rules]]
ancestor = "transient-dns-server"
client = "dns-service-client"
Expand Down Expand Up @@ -799,20 +798,22 @@ note = "Sled Agent uses Client::localhost() to connect to ddm."
permalinks = [
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/ddm_reconciler.rs#L56",
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/bootstrap/bootstore_setup.rs#L92",
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/rack_setup/service.rs#L1086",
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/sled_agent.rs#L1378",
]

[[intra_deployment_unit_only_edges]]
server = "omicron-sled-agent"
client = "mg-admin-client"
note = """
Sled Agent only queries the switch zone on the same sled,
which is within the same deployment unit as the global
zone.
Sled Agent invokes EarlyNetworkSetup::init_switch_config
(services.rs:3431) to configure uplinks on the local switch zone. This in turn
constructs an MgdClient pointed at the local switch zone
(early-networking/lib.rs:442), which is within the same deployment unit as the
global zone.
"""
permalinks = [
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/bootstrap/early_networking.rs#L483",
"https://github.com/oxidecomputer/omicron/blob/32de20c/sled-agent/src/services.rs#L3431",
"https://github.com/oxidecomputer/omicron/blob/32de20c/sled-agent/early-networking/src/lib.rs#L442",
]

[[intra_deployment_unit_only_edges]]
Expand All @@ -835,39 +836,47 @@ server = "omicron-sled-agent"
client = "gateway-client"
note = """
BUG: Sled Agent creates two MGS clients. One of them
(early_networking.rs:376) queries the switch zone on
the same sled, which is within the same deployment unit
as the global zone, so this is okay.
(early-networking/lib.rs:331, called from services.rs:3431) queries the
switch zone on the same sled, which is within the same deployment unit as the
global zone, so this is okay.

The other one (early_networking.rs:277) queries both
switch zones, so it crosses deployment units. This needs
to be fixed.
The other one (early-networking/lib.rs:259, called from services.rs:1095)
queries both switch zones, so it crosses deployment units. This needs to be
fixed.

Reference: https://github.com/oxidecomputer/omicron/issues/9708
"""
permalinks = [
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/bootstrap/early_networking.rs#L376",
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/bootstrap/early_networking.rs#L277-L280",
"https://github.com/oxidecomputer/omicron/blob/32de20c/sled-agent/src/services.rs#L3431",
"https://github.com/oxidecomputer/omicron/blob/32de20c/sled-agent/early-networking/src/lib.rs#L331",
"https://github.com/oxidecomputer/omicron/blob/32de20c/sled-agent/src/services.rs#L1095-L1102",
"https://github.com/oxidecomputer/omicron/blob/32de20c/sled-agent/early-networking/src/lib.rs#L259-L262",
]

[[intra_deployment_unit_only_edges]]
server = "omicron-sled-agent"
client = "dpd-client"
note = """
BUG: Sled Agent creates two DPD clients. One of them (early_networking.rs:419)
is always to the switch zone on the same sled, which is in the same deployment
unit.
BUG: Sled Agent creates two DPD clients. One of them
(early-networking/lib.rs:378, called from services.rs:3431) is always to the
switch zone on the same sled, which is in the same deployment unit.

The other one (services.rs:1090) queries both switch zones, so it crosses
deployment units. This needs to be fixed.
The other one (services.rs:1104, iterating addresses returned by
EarlyNetworkSetup::lookup_uplinked_switch_zone_underlay_addrs) queries both
switch zones, so it crosses deployment units. This needs to be fixed.

Reference: https://github.com/oxidecomputer/omicron/issues/9708
"""
permalinks = [
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/bootstrap/early_networking.rs#L419",
"https://github.com/oxidecomputer/omicron/blob/6cef874/sled-agent/src/services.rs#L1090",
"https://github.com/oxidecomputer/omicron/blob/32de20c/sled-agent/src/services.rs#L3431",
"https://github.com/oxidecomputer/omicron/blob/32de20c/sled-agent/early-networking/src/lib.rs#L378",
"https://github.com/oxidecomputer/omicron/blob/32de20c/sled-agent/src/services.rs#L1104-L1115",
]

# Note: sled-agent-rack-setup (RSS) is marked with `lifecycle = "rack-init"`
# in its subcomponent declaration, so its API consumption is excluded from the
# upgrade DAG entirely. No IDU annotation is needed for edges from RSS.

[[intra_deployment_unit_only_edges]]
server = "tfportd"
client = "dpd-client"
Expand Down
Loading
Loading
You are viewing a condensed version of this merge commit. You can view the full changes here.