diff --git a/.chloggen/3256.yaml b/.chloggen/3256.yaml
new file mode 100644
index 0000000000..04de72b336
--- /dev/null
+++ b/.chloggen/3256.yaml
@@ -0,0 +1,4 @@
+change_type: deprecation
+component: exceptions
+note: Update exception recording guidelines to not use span events.
+issues: [3256]
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 5d354e73b9..192a42cfca 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -96,7 +96,7 @@ these requirements:
 
 - **Spans**: Set status to Error, populate `error.type`, set description when helpful
 - **Metrics**: Include `error.type` attribute for filtering and analysis
-- **Exceptions**: Record as span events or log records using SDK APIs
+- **Exceptions**: Record as log records
 - **Consistency**: Same `error.type` across spans and metrics for same operation
 
 ## Common Issues to Flag
diff --git a/AREAS.md b/AREAS.md
index 0b78b5f191..e3d06ce2d9 100644
--- a/AREAS.md
+++ b/AREAS.md
@@ -39,7 +39,7 @@ their owners, related project (and project board) as well as its current status.
 | Semantic Conventions: Database | [semconv-db-approvers](https://github.com/orgs/open-telemetry/teams/semconv-db-approvers) | https://github.com/open-telemetry/community/blob/main/projects/completed-projects/database-client-semconv.md | https://github.com/orgs/open-telemetry/projects/73 | `area:db` | `inactive` | The SIG is inactive. Bugs and bugfixes are welcome. For substantial changes, follow the [new project process](https://github.com/open-telemetry/community/blob/main/project-management.md) |
 | Semantic Conventions: FaaS | [specs-semconv-maintainers](https://github.com/orgs/open-telemetry/teams/specs-semconv-maintainers) | https://github.com/open-telemetry/community/blob/main/projects/completed-projects/faas.md | N/A | `area:faas` | `inactive` | The SIG is inactive. Bugs and bugfixes are welcome. For substantial changes, follow the [new project process](https://github.com/open-telemetry/community/blob/main/project-management.md) |
 | Semantic Conventions: JVM | [semconv-jvm-approvers](https://github.com/orgs/open-telemetry/teams/semconv-jvm-approvers) | N/A | https://github.com/orgs/open-telemetry/projects/49 | `area:jvm` | `inactive` | The SIG is inactive. Bugs and bugfixes are welcome. For substantial changes, follow the [new project process](https://github.com/open-telemetry/community/blob/main/project-management.md) |
-| Semantic Conventions: Logs | [semconv-log-approvers](https://github.com/orgs/open-telemetry/teams/semconv-log-approvers) | N/A | N/A | `area:log`, `area:event`, `area:exception` | `accepting_contributions`, `active` | The SIG is looking for contributions! |
+| Semantic Conventions: Logs | [semconv-log-approvers](https://github.com/orgs/open-telemetry/teams/semconv-log-approvers) | N/A | https://github.com/orgs/open-telemetry/projects/65 | `area:log`, `area:event`, `area:exception` | `accepting_contributions`, `active` | The SIG is looking for contributions! |
 | Semantic Conventions: Mainframe | [sig-mainframe-approvers](https://github.com/orgs/open-telemetry/teams/sig-mainframe-approvers) | https://github.com/open-telemetry/community/blob/main/projects/mainframe.md | N/A | `area:mainframe`, `area:zos` | `accepting_contributions`, `active` | The SIG is looking for contributions! |
 | Semantic Conventions: Profiling | [profiling-approvers](https://github.com/orgs/open-telemetry/teams/profiling-approvers) | N/A | N/A | `area:profile`, `area:pprof` | `accepting_contributions`, `active` | The SIG is looking for contributions! |
 | Semantic Conventions: .NET | [semconv-dotnet-approver](https://github.com/orgs/open-telemetry/teams/semconv-dotnet-approver) | N/A | N/A | `area:dotnet`, `area:aspnetcore`, `area:signalr`, `area:kestrel` | `accepting_contributions`, `active` | SIG is driven by members of the .NET runtime team. Contributions are welcomed but must be aligned with the .NET runtime features/roadmap |
diff --git a/areas.yaml b/areas.yaml
index de54794d0c..d86c3eb9b3 100644
--- a/areas.yaml
+++ b/areas.yaml
@@ -190,7 +190,7 @@ areas:
       - name: "semconv-log-approvers"
         github: semconv-log-approvers
     project: "N/A"
-    board: "N/A"
+    board: "https://github.com/orgs/open-telemetry/projects/65"
     labels:
       - area:log
       - area:event
diff --git a/docs/exceptions/exceptions-spans.md b/docs/exceptions/exceptions-spans.md
index d83b212b90..e2eadbd87d 100644
--- a/docs/exceptions/exceptions-spans.md
+++ b/docs/exceptions/exceptions-spans.md
@@ -4,7 +4,8 @@ linkTitle: Spans
 
 # Semantic conventions for exceptions on spans
 
-**Status**: [Stable][DocumentStatus]
+**Status**: [Deprecated][DocumentStatus]
+Use [Semantic conventions for exceptions in logs](exceptions-logs.md) instead.
 
 This document defines semantic conventions for recording application exceptions
 associated with spans.
diff --git a/docs/general/recording-errors.md b/docs/general/recording-errors.md
index 6e12b17548..e42423182a 100644
--- a/docs/general/recording-errors.md
+++ b/docs/general/recording-errors.md
@@ -20,8 +20,8 @@ Individual semantic conventions are encouraged to provide additional guidance.
 
 An operation SHOULD be considered as failed if any of the following is true:
 
-- an exception is thrown by the instrumented method (API, block of code, or another instrumented unit)
-- the instrumented method returns an error in another way, for example, via an error code
+- an exception is thrown by the instrumented operation (API, block of code, or another instrumented unit)
+- the instrumented operation returns an error in another way, for example, via an error code
 
 Semantic conventions that define domain-specific status codes SHOULD specify
 which status codes should be reported as errors by a general-purpose instrumentation.
@@ -83,12 +83,12 @@ include it if the operation succeeded.
 
 ## Recording exceptions
 
-When an instrumented operation fails with an exception, instrumentation SHOULD record
-this exception as a [span event](/docs/exceptions/exceptions-spans.md) or a [log record](/docs/exceptions/exceptions-logs.md).
+When the instrumented operation failed due to an exception:
 
-It's RECOMMENDED to use the `Span.recordException` API or logging library API that takes exception instance
-instead of providing individual attributes. This enables the OpenTelemetry SDK to
-control what information is recorded based on application configuration.
+- instrumentation SHOULD record this exception as a [log record](/docs/exceptions/exceptions-logs.md),
+- instrumentation SHOULD follow [recording errors on spans](#recording-errors-on-spans)
+  and [recording errors on metrics](#recording-errors-on-metrics)
+  on capturing exception details on these signals.
 
 It's NOT RECOMMENDED to record the same exception more than once.
 It's NOT RECOMMENDED to record exceptions that are handled by the instrumented library.
@@ -100,22 +100,41 @@ to the caller should be recorded (or logged) once.
 ```java
 public boolean createIfNotExists(String resourceId) throws IOException {
   Span span = startSpan();
+  long startTime = System.nanoTime();
   try {
     create(resourceId);
+
+    recordMetric("acme.resource.create.duration", System.nanoTime() - startTime);
+
     return true;
   } catch (ResourceAlreadyExistsException e) {
-    // not recording exception and not setting span status to error - exception is handled
-    // but we can set attributes that capture additional details
+    // we do not set span status to error and the "error.type" attribute
+    // as the exception is not an error,
+    // but we still log and set attributes that capture additional details
+    logger.withEventName("acme.resource.create.error")
+      .withAttribute("acme.resource.create.status", "already_exists")
+      .withException(e)
+      .debug();
+
     span.setAttribute(AttributeKey.stringKey("acme.resource.create.status"), "already_exists");
+
+    recordMetric("acme.resource.create.duration", System.nanoTime() - startTime);
+
     return false;
   } catch (IOException e) {
-    // recording exception here (assuming it was not recorded inside `create` method)
-    span.recordException(e);
-    // or
-    // logger.warn(e);
+    // this exception is expected to be handled by the caller
+    // and could be a transient error
+    logger.withEventName("acme.resource.create.error")
+      .withException(e)
+      .warn();
+
+    String errorType = e.getClass().getCanonicalName();
 
-    span.setAttribute(AttributeKey.stringKey("error.type"), e.getClass().getCanonicalName())
+    span.setAttribute(AttributeKey.stringKey("error.type"), errorType);
     span.setStatus(StatusCode.ERROR, e.getMessage());
+
+    recordMetric("acme.resource.create.duration", System.nanoTime() - startTime,
+      AttributeKey.stringKey("error.type"), errorType);
     throw e;
   }
 }