Erase formatting after failed format --replace #18446

Open: tanabarr wants to merge 1 commit into master from tanabarr/control-fmtreplace-rank-erase

@tanabarr tanabarr commented Jun 5, 2026

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

Signed-off-by: Tom Nabarro <thomas.nabarro@hpe.com>
@tanabarr tanabarr requested review from a team as code owners June 5, 2026 14:11
github-actions Bot commented Jun 5, 2026

Errors are: component not formatted correctly, ticket number prefix incorrect, PR title is malformatted. See https://daosio.atlassian.net/wiki/spaces/DC/pages/11133911069/Commit+Comments
Unable to load ticket data
https://daosio.atlassian.net/browse/Erase

cmd.Debugf("Invoking SystemErase to clean up after failed format operation")

eraseReq := &control.SystemEraseReq{}
eraseResp, err := control.SystemErase(ctx, cmd.ctlInvoker, eraseReq)

I don't think this will work... SystemErase doesn't allow you to choose ranks or nodes.

I think you'll need to handle this from the daos_server that owns the engine. If the engine fails to join, and it's a replace operation, blow the storage away. The failure that triggered this request was happening at the join stage.

If the format itself fails, I don't think there's any risk of the engine coming up. If there's a partial failure, it's not a bad idea to clean up, but I think that would have to happen from the server side, too.
