temporalio · dustin-temporal · May 15, 2026
@@ -29,6 +29,12 @@ Each source provides different levels of granularity, filtering options, monitor
 
 When used together, Cloud and SDK metrics measure the health and performance of your full Temporal infrastructure, including the Temporal Cloud Service and user-supplied Temporal Workers.
 
+:::tip New to Cloud metrics?
+
+Start with the [OpenMetrics Quickstart](/cloud/metrics/openmetrics#quickstart) to create a Service Account, generate an API key, and stream metrics into Datadog, Grafana Cloud, New Relic, ClickStack, or self-hosted Prometheus in about 5 minutes.
+
+:::
+
 ## Cloud Metrics
 
 Cloud metrics for all Namespaces in your account are available from the [OpenMetrics endpoint](/cloud/metrics/openmetrics), a Prometheus-compatible scrapable endpoint at `metrics.temporal.io`.

@@ -31,11 +31,56 @@ Future pricing may apply to high-volume usage that exceeds standard [limits](/cl
 
 Temporal Cloud's [OpenMetrics](https://openmetrics.io/) endpoint provides operational metrics for your Temporal Cloud workloads in industry-standard Prometheus format, enabling comprehensive monitoring across Namespaces, Workflows, and Task Queues with your existing observability stack.
 
+## Quickstart
+
+Stream metrics from Temporal Cloud into your observability tool in about 5 minutes.
+
+**Prerequisites**
+
+- An **Account Owner** or **Global Admin** role on the Temporal Cloud account. The Metrics Read-Only role is an account-level role and can only be granted by these roles. A Namespace Admin cannot complete these steps.
+- An account in the observability tool you want to use (Datadog, Grafana Cloud, New Relic, ClickStack, self-hosted Prometheus, etc.).
+
+**Steps**
+
+1. **Create a Service Account with the Metrics Read-Only role.**
+
+   In the Temporal Cloud UI, go to **Settings → Service Accounts → Create Service Account** and assign the **Metrics Read-Only** account-level role.
+
+2. **Generate an API key for the Service Account.**
+
+   Open the Service Account and create an API key. Copy the key and store it somewhere secure. It is shown only once.
+
+3. **Verify the endpoint is reachable.**
+
+   ```shell
+   curl -H "Authorization: Bearer <API_KEY>" https://metrics.temporal.io/v1/metrics
+   ```
+
+   You should see OpenMetrics-formatted output beginning with `# TYPE temporal_cloud_v1_...`.
+
+   :::note `metrics.temporal.io` is for scrapers, not browsers
+
+   The endpoint requires an `Authorization: Bearer <API key>` header on every request. There is no browser UI. Opening `https://metrics.temporal.io` or `https://metrics.temporal.io/v1/metrics` directly in a browser returns `Jwt is missing`. Configure the endpoint inside your observability tool instead.
+
+   :::
+
+4. **Configure your observability tool.**
+
+   Paste the API key into the integration for your tool of choice. See [Metrics integrations](/cloud/metrics/openmetrics/metrics-integrations) for tool-specific setup:
+
+   - [Datadog](/cloud/metrics/openmetrics/metrics-integrations#datadog)
+   - [Grafana Cloud](/cloud/metrics/openmetrics/metrics-integrations#grafana-cloud)
+   - [New Relic](/cloud/metrics/openmetrics/metrics-integrations#new-relic)
+   - [ClickStack](/cloud/metrics/openmetrics/metrics-integrations#clickstack)
+   - [Self-hosted Prometheus or OpenTelemetry Collector](/cloud/metrics/openmetrics/metrics-integrations#prometheus-grafana)
+
+Metrics begin populating in your tool within a few minutes.
+
 ## Quick Links
 * [Integrations](/cloud/metrics/openmetrics/metrics-integrations) - Get started exporting metrics with common integrations
 * [API Documentation](/cloud/metrics/openmetrics/api-reference) - Endpoint specification and advanced configuration
 * [Metrics Reference](/cloud/metrics/openmetrics/metrics-reference) - Complete catalog of all metrics with descriptions and labels
-* [Migration Guide](/cloud/metrics/openmetrics/migration-guide) - Guide on how to transition from the Prometheus query endpoint
+* [Migration Guide](/cloud/metrics/openmetrics/migration-guide) - Transition from the deprecated Prometheus query endpoint to OpenMetrics
 
 ## Overview
 Temporal Cloud OpenMetrics exposes 50+ metrics covering workflow lifecycles, task queue operations, service performance, and system limits. All metrics are aggregated over one-minute windows and available for scraping within three minutes. Each scrape returns only the most recently completed one-minute window—configure your monitoring system to retain what it scrapes.
@@ -45,7 +90,7 @@ Temporal Cloud OpenMetrics exposes 50+ metrics covering workflow lifecycles, tas
 * Teams using the query endpoint should review the [migration guide](/cloud/metrics/openmetrics/migration-guide).
 
 ## API key authentication
-Create a [service account](/cloud/metrics/openmetrics/migration-guide#create-an-api-key) with the "Metrics Read-Only" role, generate an API key, and start scraping immediately - no certificate rotation or distribution required.
+Create a Service Account with the "Metrics Read-Only" role and generate an API key. See the [Quickstart](#quickstart) above for step-by-step instructions. API keys work with standard HTTPS, with no certificate rotation or distribution required.
 
 ## Global endpoint
 This is a single endpoint at `metrics.temporal.io` which serves all metrics across your entire account with API key authentication and standard HTTPS.

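The Quickstart's verification step tells readers to look for output beginning with `# TYPE temporal_cloud_v1_...`. As a rough sketch of how that check could be scripted, the helper below counts the metric families in a scrape read from stdin. The function name `count_metric_families` and the `TEMPORAL_API_KEY` environment variable are assumptions made for illustration and are not part of any Temporal tooling.

```shell
#!/usr/bin/env bash
# Sketch only: count_metric_families is a hypothetical helper name.
# It reads OpenMetrics text on stdin and prints how many metric families
# carry the temporal_cloud_v1_ prefix.
count_metric_families() {
  # Each metric family is announced by one "# TYPE <name> <type>" line.
  # grep -c exits non-zero when the count is 0, so "|| true" keeps the
  # function usable under "set -e" while still printing 0.
  grep -c '^# TYPE temporal_cloud_v1_' || true
}

# Example use (assumes the API key is exported as TEMPORAL_API_KEY):
#   curl -sS -H "Authorization: Bearer ${TEMPORAL_API_KEY}" \
#     https://metrics.temporal.io/v1/metrics | count_metric_families
```

A count of `0` suggests the response was an error body, such as `Jwt is missing`, rather than metrics output.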
@@ -30,26 +30,74 @@ This document is for basic configuration only. For advanced concepts such as lab
 
 ## Integrations
 
+Before configuring any integration, complete the [Quickstart](/cloud/metrics/openmetrics#quickstart) to create a Service Account with the **Metrics Read-Only** role and generate an API key. Creating the Service Account requires the **Account Owner** or **Global Admin** role; a Namespace Admin cannot grant the Metrics Read-Only role.
+
 ### Datadog
 
-Datadog provides a serverless integration with the OpenMetrics endpoint. This integration will scrape metrics, store them in Datadog, and provides a default dashboard with some built in monitors. See the [integration page](https://docs.datadoghq.com/integrations/temporal-cloud-openmetrics/) for more details.
+Datadog provides a serverless integration with the OpenMetrics endpoint. It scrapes metrics, stores them in Datadog, and ships a default dashboard with built-in monitors.
+
+1. In Datadog, open the [Integrations catalog](https://app.datadoghq.com/integrations) and search for **Temporal Cloud OpenMetrics**. Install the integration.
+2. Click **Add Account** in the integration tile and paste your Temporal Cloud API key into the **API Key** field.
+3. Save the configuration. The default Temporal Cloud dashboard appears in **Dashboards → Dashboards List** once data starts flowing (typically within a few minutes).
+
+For Datadog-side details, see the [Datadog integration page](https://docs.datadoghq.com/integrations/temporal-cloud-openmetrics/).
 
 ### Grafana Cloud
 
-Grafana provides a serverless integration with the OpenMetrics endpoint for Grafana Cloud. This integration will scrape metrics, store them in Grafana Cloud, and provides a default dashboard
-for visualizing the metrics in Grafana Cloud. See the [integration page](https://grafana.com/docs/grafana-cloud/monitor-infrastructure/integrations/integration-reference/integration-temporal/)
- for more details.
+Grafana Cloud provides a serverless integration with the OpenMetrics endpoint. It scrapes metrics, stores them in Grafana Cloud, and ships a default dashboard for visualizing them.
+
+1. In Grafana Cloud, go to **Connections → Add new connection** and search for **Temporal Cloud**.
+2. On the integration page, paste your Temporal Cloud API key into the **API Key** field.
+3. Add `metrics.temporal.io` to **Allowed hosts** so Grafana Cloud can reach the endpoint.
+4. Click **Install** to enable the integration and import the pre-built dashboard.
+
+If the dashboard shows no data after a few minutes, confirm the API key's Service Account has the **Metrics Read-Only** role and that the endpoint is reachable using the `curl` check from the [Quickstart](/cloud/metrics/openmetrics#quickstart).
+
+For Grafana-side details, see the [Grafana Cloud integration page](https://grafana.com/docs/grafana-cloud/monitor-infrastructure/integrations/integration-reference/integration-temporal/).
 
 ### ClickStack
 
 ClickHouse provides an integration with the OpenMetrics endpoint for ClickStack. This integration uses an OpenTelemetry collector to read from the OpenMetrics endpoint, ingest data into ClickHouse, and
 includes a default dashboard to visualize the data with HyperDX. See the [integration page](https://clickhouse.com/docs/use-cases/observability/clickstack/integrations/temporal-metrics) for more details.
 
+1. Save your Temporal Cloud API key to a local file named `temporal.key` (no trailing newline or spaces).
+2. Create an OpenTelemetry collector config named `temporal-metrics.yaml` that uses a Prometheus receiver against `metrics.temporal.io` with Bearer token auth, a 60-second scrape interval, the `service.name: "temporal"` resource attribute, and the ClickHouse exporter. Copy the full template from the [ClickStack integration page](https://clickhouse.com/docs/use-cases/observability/clickstack/integrations/temporal-metrics).
+3. Mount both files into your ClickStack collector and set the custom config env var. With Docker Compose:
+
+   ```yaml
+   volumes:
+     - ./temporal-metrics.yaml:/etc/otelcol-contrib/custom.config.yaml
+     - ./temporal.key:/etc/otelcol-contrib/temporal.key
+   environment:
+     CUSTOM_OTELCOL_CONFIG_FILE: /etc/otelcol-contrib/custom.config.yaml
+   ```
+
+4. In HyperDX, open the **Metrics explorer** and confirm metrics with the `temporal` prefix are arriving.
+5. Import the pre-built dashboard: in HyperDX click **Import Dashboard**, upload `temporal-metrics-dashboard.json` from the ClickStack integration page, then click **Finish Import**.
+
 ### New Relic
 
-New Relic integrates with Temporal Cloud via the infrastructure agent using a flex integration that pulls data from the OpenMetrics endpoint. See the [integration page](https://docs.newrelic.com/docs/infrastructure/host-integrations/host-integrations-list/temporal-cloud-integration/) for more details.
+The New Relic integration pulls metrics from the OpenMetrics endpoint via the `nri-flex` integration that runs alongside the New Relic infrastructure agent.
+
+:::note Requires a host
+
+The integration runs on a host (Linux, Windows, or Kubernetes) with the New Relic infrastructure agent installed. The agent scrapes the endpoint and forwards metrics to New Relic.
+
+:::
+
+1. Install the **New Relic infrastructure agent** on a host. See the [agent install docs](https://docs.newrelic.com/docs/infrastructure/install-infrastructure-agent/get-started/install-infrastructure-agent/) for platform-specific instructions.
+2. On a Linux host, create `/etc/newrelic-infra/integrations.d/nri-flex-temporal-cloud-config.yml` using the template from the [New Relic integration page](https://docs.newrelic.com/docs/infrastructure/host-integrations/host-integrations-list/temporal-cloud-integration/), and replace the `${TEMPORAL_API_KEY}` placeholder with your Temporal Cloud API key.
+3. Restart the agent so the new config is picked up:
+
+   ```shell
+   sudo systemctl restart newrelic-infra.service
+   ```
+
+4. In **one.newrelic.com**, go to **Integrations & Agents → Dashboards**, search for **Temporal Cloud**, and install the pre-built dashboard. Data appears within a few minutes.
+
+For New Relic-side details, see the [New Relic integration page](https://docs.newrelic.com/docs/infrastructure/host-integrations/host-integrations-list/temporal-cloud-integration/).
 
-### Prometheus \+ Grafana
+### Prometheus \+ Grafana {#prometheus-grafana}
 
 Self hosted Prometheus can be used to scrape the OpenMetrics endpoint.
 

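ClickStack step 1 asks for a `temporal.key` file with no trailing newline or spaces, which is easy to get wrong with `echo` or a text editor. One possible way to write the file is sketched below. The helper name `write_key_file` is hypothetical, and restricting file permissions is an added precaution that the ClickStack instructions do not require.

```shell
#!/usr/bin/env bash
# Sketch only: write_key_file is a hypothetical helper name.
# Usage: write_key_file <api-key> [path]   (path defaults to temporal.key)
write_key_file() {
  # printf '%s' writes the key exactly as given, with no trailing newline.
  # The subshell scopes umask 077 so a newly created file is owner-only
  # without changing the caller's umask.
  ( umask 077; printf '%s' "$1" > "${2:-temporal.key}" )
}
```

Running `wc -c < temporal.key` afterwards should report exactly the length of the key, which confirms that no newline was appended.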
@@ -94,7 +94,7 @@ gRPC requests received per second.
 
 #### temporal\_cloud\_v1\_service\_request\_throttled\_count
 
-gRPC requests throttled per second.
+gRPC requests throttled per second. See [Monitoring Trends Against Limits](/cloud/service-health#rps-aps-rate-limits) for guidance on setting alert thresholds against the corresponding limit metric.
 
 | Label | Description |
 | ----- | ----- |
@@ -124,7 +124,9 @@ The number of pollers that are actively long polling for a task. Use this to tra
 
 #### temporal\_cloud\_v1\_resource\_exhausted\_error\_count
 
-Resource exhaustion errors per second. This metric does not include throttling due to Namespace limits.
+Resource exhaustion errors per second, incremented when a single resource receives a burst larger than it can absorb. SDKs retry these errors gracefully. This metric does not include throttling due to Namespace limits. For rate limiting against account limits, see [`temporal_cloud_v1_total_action_throttled_count`](#temporal_cloud_v1_total_action_throttled_count) and the related throttle metrics.
+
+See [Detecting Resource Exhaustion](/cloud/service-health#detecting-resource-exhaustion) for guidance on investigating non-zero values.
 
 | Label | Description |
 | ----- | ----- |
@@ -633,7 +635,7 @@ This metric could have high cardinality depending on number of action types and
 
 #### temporal\_cloud\_v1\_total\_action\_throttled\_count
 
-The total number of actions throttled per second.
+The total number of actions throttled per second. See [Monitoring Trends Against Limits](/cloud/service-health#rps-aps-rate-limits) for guidance on setting alert thresholds against the corresponding limit metric.
 
 **Type**: Rate
 
@@ -651,7 +653,7 @@ Operations performed per second.
 
 #### temporal\_cloud\_v1\_operations\_throttled\_count
 
-Operations throttled due to rate limits per second.
+Operations throttled due to rate limits per second. See [Monitoring Trends Against Limits](/cloud/service-health#rps-aps-rate-limits) for guidance on setting alert thresholds against the corresponding limit metric.
 
 | Label | Description |
 | ----- | ----- |

@@ -146,13 +146,16 @@ See [operations and metrics](/cloud/high-availability) for Namespaces with High
 
 ## Detecting Resource Exhaustion
 
-The Cloud metric `temporal_cloud_v1_resource_exhausted_error_count` is the primary indicator for Cloud-side throttling, signaling system limits
-are exceeded and `ResourceExhausted` gRPC errors are occurring. This generally does not break workflow processing due to how resources are prioritized.
+Resource exhaustion happens when a single resource (a Namespace, Task Queue, or Workflow ID) receives a burst of operations larger than that resource can absorb in the moment. The Cloud metric `temporal_cloud_v1_resource_exhausted_error_count` increments and `ResourceExhausted` gRPC errors are returned to the client. SDKs retry these errors gracefully, so workflow progress is rarely impacted.
 
-Persistent non-zero values of this metric are unexpected.
+Persistent non-zero values are unexpected and indicate a hot resource. Use the `operation` label to identify which RPC is hitting the burst limit. For example, the metric increments for the `StartWorkflowExecution` operation when the same Workflow ID is started more than once per second.
+
+Resource exhaustion is distinct from rate limiting against your account limits. For workloads that are throttled because they exceed their provisioned capacity, see [Monitoring Trends Against Limits](#rps-aps-rate-limits). Limits-driven throttling slows or stalls a workload, so it is generally the more important signal to monitor.
 
 ## Monitoring Trends Against Limits {#rps-aps-rate-limits}
 
+Tracking trends against your account limits is the most important throttling signal to monitor. Unlike [Resource Exhaustion](#detecting-resource-exhaustion), which usually self-heals through retries, hitting a limit slows or stalls progress until the workload backs off or your capacity is increased.
+
 The set of [limit metrics](/cloud/metrics/openmetrics/metrics-reference#limit-metrics) provide a time series of values for limits. Use these
 metrics with their corresponding count metrics to monitor general trends against limits and set alerts when limits are exceeded. Use the corresponding throttle metrics
 to determine the severity of any active rate limiting.