154 changes: 154 additions & 0 deletions docs/design/step_ca_ephemeral_admin_certs.md
@@ -0,0 +1,154 @@
# Ephemeral Admin Certificates with step-ca

## Goal

Allow an admin to authenticate with OIDC and receive a short-lived FLARE admin
certificate without adding OIDC handling to the FLARE server or clients.

```text
admin CLI -> step CLI -> step-ca -> OIDC provider
admin CLI <- short-lived admin certificate/key
admin CLI -> existing FLARE mTLS login and job signing
```

The built-in `step_ca` provider delegates OIDC discovery, browser login, token
validation, claim mapping, and certificate issuance to step-ca.

## Trust Model

step-ca signs admin certificates with an intermediate CA rooted in the FLARE
project root. Existing servers and clients validate the resulting chain with
`rootCA.pem`. The FLARE server cannot mint an admin certificate or job
signature unless it controls step-ca, its signing key, or an admin private key.

The issued leaf certificate must contain the fields FLARE already consumes:

- `commonName`: authenticated admin identity
- `organizationName`: FLARE organization
- `unstructuredName`: `project_admin`, `org_admin`, `lead`, or `member`

The admin private key is generated on the admin machine by `step ca
certificate` and is not sent to the OIDC provider or FLARE server.

## Runtime Behavior

An ephemeral admin startup kit contains `ephemeral_admin_cert` instead of
static `client.crt` and `client.key` files. The admin client:

1. Loads a valid cached credential or invokes its configured provider.
2. Validates the certificate chain, validity, identity fields, allowed role,
and certificate/private-key match.
3. Uses the existing certificate challenge, mTLS, authorization, and job-signing
paths.
4. Reacquires credentials when the certificate enters its renewal window.

The certificate-chain support is implemented separately. This feature adds no
OIDC token format, server login mode, or server-signed job manifest to FLARE.
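
The renewal decision in step 4 can be sketched as follows. This is a minimal illustration, not FLARE's implementation: the function name is hypothetical, and it assumes `renewal_window` is expressed in seconds, which this document does not state.

```python
from datetime import datetime, timedelta, timezone


def needs_reacquire(not_before, not_after, renewal_window_s, now=None):
    """Return True when a cached certificate should be replaced.

    Hypothetical helper: assumes the renewal window is given in seconds.
    """
    now = now or datetime.now(timezone.utc)
    if now < not_before or now >= not_after:
        return True  # not yet valid, or already expired
    # Inside the renewal window: reacquire before the certificate expires.
    return not_after - now <= timedelta(seconds=renewal_window_s)
```

With a 24-hour certificate and a 60-second window, the check stays false for almost the whole lifetime and turns true only in the final minute or after expiry.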

## Provisioning

Static and ephemeral admins can coexist in one `project.yml`:

```yaml
participants:
  - name: static-admin@example.com
    type: admin
    org: example_org
    role: project_admin

  - name: sso-admin-kit
    type: admin
    ephemeral_admin_cert:
      provider: step_ca
      renewal_window: 60
      provider_config:
        ca_url: https://step-ca.example.com
        provisioner: nvflare-admin-oidc
        cert_ttl: 24h
        command_timeout: 300
```

The ephemeral participant omits `org` and `role`; both come from the issued
certificate. Its name identifies a generic startup kit, not a user. The kit
contains `rootCA.pem` and provider configuration but no static admin
certificate or private key. Server and site startup kits are unchanged.

## Provider and Cache

`ephemeral_admin_cert.provider` is a built-in provider name or a
`module:function` path. A provider receives its configuration and the project
root certificate, and returns local certificate/key paths. FLARE applies the
same validation to built-in and custom provider results.
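
One way to resolve a provider value into a callable is sketched below. The function name `resolve_provider` and the `builtin` registry argument are assumptions made for illustration; the actual lookup code in FLARE may differ.

```python
import importlib


def resolve_provider(spec, builtin):
    """Resolve a provider spec to a callable.

    `builtin` is a hypothetical registry mapping built-in names
    (for example "step_ca") to callables.
    """
    if spec in builtin:
        return builtin[spec]
    module_name, sep, func_name = spec.partition(":")
    if not sep or not module_name or not func_name:
        raise ValueError(f"invalid provider spec: {spec!r}")
    # Import the module and look up the named attribute on it.
    func = getattr(importlib.import_module(module_name), func_name, None)
    if not callable(func):
        raise ValueError(f"provider is not callable: {spec!r}")
    return func
```

Whatever the resolved callable returns, the same certificate validation would still run afterward, so a custom provider cannot bypass the checks applied to `step_ca`.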

Valid credentials are cached per OS user under
`~/.nvflare/ephemeral_admin_certs`. The cache entry is bound to the provider
configuration and project root. Files are private to the OS user, concurrent
CLI processes serialize acquisition, and immutable credential directories
prevent cert/key replacement races.

The cache is required because each `nvflare` command starts a new process;
without it every command would repeat browser login. Users must not share an OS
account because that also shares its cached credential. Deleting the cache
forces a fresh OIDC login.
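
Binding a cache entry to the provider configuration and project root could look like the sketch below. The hashing scheme and the name `cache_entry_name` are assumptions for illustration; this document does not specify how FLARE derives cache entries.

```python
import hashlib
import json


def cache_entry_name(provider, provider_config, root_ca_pem):
    """Derive a directory name bound to provider settings and project root.

    Hypothetical scheme: SHA-256 over canonical JSON plus the root CA bytes.
    """
    canonical = json.dumps(
        {"provider": provider, "config": provider_config},
        sort_keys=True,  # key order must not change the result
        separators=(",", ":"),
    ).encode("utf-8")
    digest = hashlib.sha256()
    # Length prefix keeps the config/root boundary unambiguous.
    digest.update(len(canonical).to_bytes(8, "big"))
    digest.update(canonical)
    digest.update(root_ca_pem)
    return digest.hexdigest()
```

Under this scheme, changing the `ca_url`, the provisioner, or the project root yields a different entry, so a credential issued for one project is never reused for another.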

## step-ca Requirements

Operators configure step-ca, not FLARE, with:

- an intermediate CA signed by the FLARE project root
- an OIDC provisioner and matching IdP loopback redirect URI
- the OIDC client credentials and scopes needed by the X.509 template
- an X.509 template that writes the required FLARE identity fields
- a short maximum/default certificate duration, normally 24 hours
- renewal disabled so extending access requires another OIDC login

The root CA private key returns to offline storage after signing the
intermediate. The intermediate key remains with step-ca and may be protected by
an HSM/KMS. Neither private key is distributed to FLARE servers, sites, or
admins.

### Organization and Role Mapping

The step-ca template must map an exact, allowlisted IdP role to one
`(organization, FLARE role)` pair. Organization and role must not be accepted as
independent user-controlled claims.

Example mappings for one project and organization:

```text
nvflare-demo-example-project_admin -> (example, project_admin)
nvflare-demo-example-org_admin -> (example, org_admin)
nvflare-demo-example-lead -> (example, lead)
nvflare-demo-example-member -> (example, member)
```

When several mapped roles for the same organization are present, the template
selects the highest privilege in this order:

```text
project_admin > org_admin > lead > member
```

Mappings that produce more than one organization are ambiguous and must fail
closed. Separate provisioners/templates per organization are the simplest
deployment model. The FLARE server does not map or rewrite certificate
organization or role values.
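
The selection rules above are implemented in the step-ca X.509 template, not in FLARE. The Python sketch below only illustrates the intended logic; the table contents repeat the example mappings, and the function name is hypothetical.

```python
ROLE_ORDER = ["project_admin", "org_admin", "lead", "member"]  # highest first

ALLOWED = {
    "nvflare-demo-example-project_admin": ("example", "project_admin"),
    "nvflare-demo-example-org_admin": ("example", "org_admin"),
    "nvflare-demo-example-lead": ("example", "lead"),
    "nvflare-demo-example-member": ("example", "member"),
}


def resolve_identity(idp_roles, allowed=ALLOWED):
    """Map IdP roles to exactly one (organization, role) pair, or fail closed."""
    # Exact-match lookup only; unknown roles are ignored, never parsed.
    pairs = [allowed[r] for r in idp_roles if r in allowed]
    if not pairs:
        raise PermissionError("no allowlisted role present")
    orgs = {org for org, _ in pairs}
    if len(orgs) != 1:
        raise PermissionError("ambiguous organization")
    # Pick the highest privilege among the roles for the single organization.
    role = min((role for _, role in pairs), key=ROLE_ORDER.index)
    return orgs.pop(), role
```

Because the allowlist maps each IdP role to a complete pair, a user cannot combine the organization from one claim with the role from another.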

## Lifetime and Clone Behavior

The built-in provider requests a 24-hour certificate by default. FLARE does not
perform revocation checks, so disabling a user prevents new issuance but does
not invalidate an existing certificate. The lifetime must cover expected queue
and deployment delays because clients verify that the signing certificate is
still valid when the job is deployed.

`clone_job` copies the original submitter signature without contacting the
admin client. A clone therefore becomes unusable after the original certificate
expires. FLARE reports this before cloning when it can inspect the stored
certificate. A future client-assisted clone could re-sign and resubmit the job.

## References

- [step-ca](https://smallstep.com/docs/step-ca/)
- [`step ca certificate`](https://smallstep.com/docs/step-cli/reference/ca/certificate/)
- [step-ca provisioners](https://smallstep.com/docs/step-ca/provisioners/)
73 changes: 71 additions & 2 deletions docs/programming_guide/provisioning_system.rst
@@ -452,7 +452,8 @@ Edit the project.yml configuration file to meet your project requirements:
- "api_version" should be set to 3 or 4. Version 4 adds support for multi-study configuration (see :ref:`multi_study_guide`)
- "name" is used to identify this project.
- "participants" describes the different parties in the FL system, distinguished by type. For all participants, "name"
should be unique, and "org" should be defined in AuthPolicyBuilder. The "name" of the server should
should be unique. ``org`` is required except for ephemeral admin kit entries, whose organization comes from the
issued certificate. The "name" of the server should
be in the format of a fully qualified domain name. It is possible to use a unique hostname rather than FQDN, with
the IP mapped to the hostname by having it added to ``/etc/hosts``:

@@ -461,10 +462,78 @@ Edit the project.yml configuration file to meet your project requirements:
- "fed_learn_port" is the port number for communication between the FL server and FL clients
- "admin_port" is the port number for communication between the FL server and FL administration client
- Type "client" describes the FL clients, with one "org" and "name" for each client as well as "enable_byoc" settings.
- Type "admin" describes the admin clients with the name being a unique email. The role must be one of "project_admin", "org_admin", "lead" and "member".
- Type "admin" describes the admin clients. For traditional static
admin certificates, the name must be a unique email. For ephemeral
admin certificate startup kits, the name may be a unique kit name
such as ``sso-admin-kit`` because the real admin identity comes from
the short-lived certificate issued after login. Static admins must
define ``org`` and a role of "project_admin", "org_admin", "lead" or
"member". Ephemeral admin kit entries omit ``org`` and ``role``;
those values come from the issued certificate.
- "builders" contains all of the builders and the args to be passed into each. See the details in docstrings of the :ref:`bundled_builders`.
- "studies" (optional, requires ``api_version: 4``): defines named studies with per-study site enrollment and admin role mappings. See :ref:`multi_study_guide` for the full schema and examples.

Ephemeral admin certificate configuration
=========================================

Use ephemeral admin certificates when admin users should authenticate through an
external certificate provider instead of receiving long-lived private keys in
their startup kits. Server and client startup kits are unchanged.

The built-in ``step_ca`` provider delegates OIDC login and short-lived
certificate issuance to step-ca. FLARE only stores the provider configuration in
the generated admin startup kit and then validates the returned certificate
before using it. The admin machine must have the ``step`` CLI installed.

Example configuration:

.. code-block:: yaml

   participants:
     - name: static-admin@example.com
       type: admin
       org: nvidia
       role: project_admin

     - name: sso-admin-kit
       type: admin
       ephemeral_admin_cert:
         provider: step_ca
         renewal_window: 60
         provider_config:
           ca_url: https://step-ca.example.com
           provisioner: nvflare-admin-oidc
           cert_ttl: 24h
           command_timeout: 300

Only admin participants with ``ephemeral_admin_cert`` receive SSO-backed startup
kits. The generated ``sso-admin-kit`` startup kit contains
``ephemeral_admin_cert`` in ``fed_admin.json`` and omits static admin
``client.crt`` and ``client.key``. Traditional admin participants still receive
static admin certificate material. The admin client invokes the configured
provider when the cached certificate is missing, invalid, expired, or close to
expiry. Cached certificate material is stored under
``~/.nvflare/ephemeral_admin_certs`` and can be removed manually if a fresh SSO
login is required. The returned certificate must chain to ``rootCA.pem``, match
its private key, contain a valid FLARE organization and admin role, and be valid
for the current time. If ``cert_ttl`` is omitted, the built-in ``step_ca``
provider requests ``24h``.

The certificate provider must map authenticated IdP claims to one allowed
``(organization, role)`` pair. For example, an IdP role such as
``nvflare-demo-example-project_admin`` can map to organization ``example`` and
role ``project_admin``. Perform this mapping in the step-ca X.509 template (or
in the IdP), using exact allowlisted values. Do not accept organization and role
as independent user-controlled claims. If several allowed roles for the same
organization are present, select the highest privilege; fail closed if the
organization is ambiguous.

Custom certificate providers can be configured with
``provider: module:function``. The function receives ``provider_config`` and
``root_ca_file`` and returns paths for the admin certificate and key.
FLARE performs the same certificate validation for custom providers as it does
for ``step_ca``.

.. _project_yml:

Default project.yml file
8 changes: 8 additions & 0 deletions docs/user_guide/admin_guide/deployment/overview.rst
@@ -260,6 +260,14 @@ you will need to modify the corresponding script. The same applies to the other
The email used to participate in this FL project is embedded in the CN field of the client certificate, which uniquely identifies
the participant. As such, please safeguard its private key, client.key.

Some projects use ephemeral admin certificates. In that case, the admin startup
kit contains ``ephemeral_admin_cert`` in ``fed_admin.json`` instead of static
``client.crt`` and ``client.key`` files. The admin client obtains a short-lived
admin certificate and private key from the configured provider when connecting
to the server, then uses the same certificate login and job-signing flow as a
static admin kit. The startup kit name can be a generic name such as
``sso-admin-kit``; the issued certificate contains the real admin identity.

.. attention::

You will need write access in the directory containing the "startup" folder because the "transfer" directory for
32 changes: 32 additions & 0 deletions docs/user_guide/admin_guide/security/identity_security.rst
@@ -40,6 +40,38 @@ The security of the system comes from the PKI credentials in the Startup Kits. A

:ref:`NVFlare Dashboard <nvflare_dashboard_ui>` is a website that supports user and site registration. Users will be able to download their Startup Kits (and other artifacts) from the website.

Ephemeral Admin Certificates
----------------------------
For admin users, NVFLARE can also provision startup kits that do not contain a
static admin certificate or private key. In this mode, the admin startup kit
contains an ``ephemeral_admin_cert`` provider configuration. When the admin
client starts, it asks that provider for a short-lived admin certificate and
private key, validates the returned certificate against the project
``rootCA.pem``, and then uses the normal certificate login and job-signing path.
Valid ephemeral admin cert/key material is cached under
``~/.nvflare/ephemeral_admin_certs`` so repeated CLI commands do not require a
new browser login until the certificate is invalid, expired, or close to
expiry. The cache is private to the OS user, so administrators should not share
an OS account. The startup kit can use a generic name such as
``sso-admin-kit``; the actual admin identity comes from the certificate issued
after SSO login.

The built-in provider is ``step_ca``. With this provider, step-ca owns OIDC
login, role claim handling, and certificate issuance. The issued certificate
must contain the same FLARE identity fields that the existing PKI path consumes:
``commonName`` for the admin identity, ``organizationName`` for the FLARE org,
and ``unstructuredName`` for one FLARE role: ``project_admin``, ``org_admin``,
``lead``, or ``member``.

The step-ca template must map an exact, allowlisted IdP role to both the FLARE
organization and role. This binds the authorization tuple before the
certificate reaches the FLARE server; the server does not derive or rewrite
either value.

This mode reduces the distribution risk of long-lived admin private keys while
preserving the existing server and client trust model. Server and FL client
startup kits still use their normal PKI credentials.


.. _federated_authorization:

1 change: 1 addition & 0 deletions nvflare/apis/utils/format_check.py
@@ -23,6 +23,7 @@
"job_name": r"^[A-Za-z0-9][A-Za-z0-9._-]*$",
"relay": r"^[A-Za-z0-9-_]+$",
"admin": r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}$",
"admin_kit": r"^[A-Za-z0-9-_]+$",
"email": r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}$",
"org": r"^[A-Za-z0-9_]+$",
"simple_name": r"^[A-Za-z0-9_]+$",