3 changes: 2 additions & 1 deletion docs/setup-robusta/index.rst
@@ -18,9 +18,10 @@
tuning-performance
configuration-secrets
openshift
read-only-service-account
node-selector
proxies
privacy-and-security
installation-faq



35 changes: 31 additions & 4 deletions docs/setup-robusta/openshift.rst
@@ -49,11 +49,38 @@ Some lesser used Robusta Classic features require more permissions than the base

In order to support the ``python_debugger``, ``java_debugger`` and ``node_disk_analyzer``
playbooks, permission to run a far more privileged container needs to be granted to
the ``runner`` service account. This container runs privileged with the ``SYS_ADMIN`` and
``SYS_PTRACE`` capabilities. The privileged SCC uses ``runAsUser: RunAsAny``, so it does not force
a specific user; the debug container typically runs as root in order to attach to and inspect other
processes on the node.

**Important**: These capabilities are **OPTIONAL** and only needed for the native debugging features mentioned above. Most Robusta deployments work fine with the baseline SCC.

Baseline SCC is Sufficient For:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- ✅ All investigations and diagnostics
- ✅ KRR scans (resource right-sizing)
- ✅ Popeye scans (cluster analysis)
- ✅ Log analysis and enrichment
- ✅ Metrics and event analysis
- ✅ Alert correlation
- ✅ Pod restart and scaling
- ✅ Deployment patching
- ✅ All standard playbooks

Privileged SCC Only Needed For:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- ❌ Python debugger (``python_debugger`` playbook)
- ❌ Java debugger (``java_debugger`` playbook)
- ❌ Node disk analyzer (``node_disk_analyzer`` playbook)

Enabling the Privileged SCC
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To support these features in a production environment, you may want to only temporarily
enable this permission so that a normal request cannot bypass the less permissive SCC found
in the baseline. To enable these privileged operations in your OpenShift environment,
update the ``generated_values.yaml`` as follows:

@@ -62,7 +89,7 @@ update the ``generated_values.yaml`` as follows:
    openshift:
      enabled: true
      createScc: true
      createPrivilegedScc: true  # Optional - only if you need debugging features

You may also reference an existing SCC using the ``openshift.privilegedSccName`` value.
In test environments, you can reference the ``privileged`` SCC to enable these features in your
257 changes: 257 additions & 0 deletions docs/setup-robusta/read-only-service-account.rst
@@ -0,0 +1,257 @@
.. _read-only-service-account:

Read-Only Service Account
========================================

By default, Robusta's runner service account has permissions to create, update, and delete Kubernetes resources. This guide explains how to restrict the runner to read-only permissions for environments where you want to prevent any modifications to cluster resources.

Why Read-Only Mode?
-------------------

Read-only mode is useful in scenarios where you want to:

- **Prevent accidental modifications**: Ensure that even if a playbook or investigation logic has a bug, no cluster resources will be modified
- **Comply with security policies**: Meet organizational requirements for read-only access in certain environments
- **Prevent node operations**: Prevent users from draining or restarting nodes through investigations
- **Audit-only mode**: Run Holmes for investigation and diagnostics without remediation capabilities

Limitations of Read-Only Mode
-----------------------------

When using read-only permissions, the following Robusta features will not be available:

- **Auto-remediation**: Playbooks that automatically fix issues (restart pods, scale deployments, drain nodes, etc.)
- **Silence management**: Creating or deleting alert silences
- **Pod debugging**: Live debugging tools that require container execution
- **Resource modification**: Any playbook or action that modifies Kubernetes resources

These features require write permissions and will fail gracefully if attempted with a read-only service account.

**Read-only mode is ideal for**: Investigation, diagnostics, log analysis, metric enrichment, and reporting.

Implementation: Using overrideClusterRoles
-------------------------------------------

Robusta's Helm chart supports the ``runner.overrideClusterRoles`` parameter. When set, the rules you
provide **fully replace** the built-in runner ClusterRole rules, so only the permissions you list are granted.

.. note::

    Do not confuse this with ``runner.customClusterRoleRules``. That parameter *adds* rules on top of the
    built-in rules (which include write verbs), so it **cannot** be used to make the runner read-only.
    Use ``runner.overrideClusterRoles`` for read-only mode.
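
As a minimal sketch of the difference, assume a values file that sets both parameters. The ``example.io`` API group and the ``widgets`` resource are hypothetical placeholders, not real Robusta or Kubernetes names:

.. code-block:: yaml

    runner:
      # Additive: merged with the built-in rules, so the built-in write verbs remain.
      # Skipped entirely once overrideClusterRoles is non-empty.
      customClusterRoleRules:
        - apiGroups: ["example.io"]   # hypothetical group
          resources: ["widgets"]      # hypothetical resource
          verbs: ["get", "list"]

      # Replacing: when non-empty, these become the ClusterRole's only rules.
      overrideClusterRoles:
        - apiGroups: [""]
          resources: ["pods", "pods/log"]
          verbs: ["get", "list", "watch"]

With a file like this, the rendered ClusterRole should contain only the ``pods`` and ``pods/log`` rule, because the override branch of the chart template omits both the built-in rules and ``customClusterRoleRules``. Leaving ``overrideClusterRoles`` empty keeps the built-in behavior.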

To use read-only mode, create a custom values file (for example, ``read-only-values.yaml``) with the following configuration:

.. code-block:: yaml

    runner:
      overrideClusterRoles:
        # Core API resources - read-only
        - apiGroups:
            - ""
          resources:
            - configmaps
            - daemonsets
            - deployments
            - events
            - namespaces
            - persistentvolumes
            - persistentvolumeclaims
            - pods
            - pods/status
            - pods/log
            - replicasets
            - replicationcontrollers
            - services
            - serviceaccounts
            - endpoints
          verbs:
            - get
            - list
            - watch

        # Nodes - read-only
        - apiGroups:
            - ""
          resources:
            - nodes
          verbs:
            - get
            - list
            - watch

        # Apps API - read-only
        - apiGroups:
            - apps
          resources:
            - daemonsets
            - deployments
            - deployments/scale
            - replicasets
            - replicasets/scale
            - statefulsets
          verbs:
            - get
            - list
            - watch

        # Batch API - read-only
        - apiGroups:
            - batch
          resources:
            - cronjobs
            - jobs
          verbs:
            - get
            - list
            - watch

        # Autoscaling - read-only
        - apiGroups:
            - autoscaling
          resources:
            - horizontalpodautoscalers
          verbs:
            - get
            - list
            - watch

        # RBAC - read-only
        - apiGroups:
            - rbac.authorization.k8s.io
          resources:
            - clusterroles
            - clusterrolebindings
            - roles
            - rolebindings
          verbs:
            - get
            - list
            - watch

        # Networking - read-only
        - apiGroups:
            - networking.k8s.io
          resources:
            - ingresses
            - networkpolicies
          verbs:
            - get
            - list
            - watch

        # Events - read-only
        - apiGroups:
            - events.k8s.io
          resources:
            - events
          verbs:
            - get
            - list

        # CRDs - read-only
        - apiGroups:
            - apiextensions.k8s.io
          resources:
            - customresourcedefinitions
          verbs:
            - list
            - get

        # API Registration - read-only
        - apiGroups:
            - apiregistration.k8s.io
          resources:
            - apiservices
          verbs:
            - get
            - list

        # Policy - read-only
        - apiGroups:
            - policy
          resources:
            - poddisruptionbudgets
            - podsecuritypolicies
          verbs:
            - get
            - list

        # Monitoring (optional) - read-only
        - apiGroups:
            - monitoring.coreos.com
          resources:
            - prometheusrules
            - servicemonitors
            - podmonitors
            - alertmanagers
            - silences
          verbs:
            - get
            - list
            - watch

        # Argo CD (optional) - read-only
        - apiGroups:
            - argoproj.io
          resources:
            - applications
            - applicationsets
            - appprojects
            - workflows
            - workflowtemplates
            - cronworkflows
            - rollouts
            - analysisruns
            - analysistemplates
            - experiments
          verbs:
            - get
            - list
            - watch

Then install or upgrade Robusta with this values file:

.. code-block:: bash

    helm upgrade --install robusta robusta/robusta \
      -f generated_values.yaml \
      -f read-only-values.yaml \
      -n robusta-system --create-namespace

Verifying Read-Only Permissions
--------------------------------

After installation, verify that the runner service account has only read permissions. The ClusterRole is
cluster-scoped, so no namespace flag is needed:

.. code-block:: bash

    # Inspect the ClusterRole - the rules should only contain "get", "list", "watch" verbs,
    # and NOT "create", "delete", "patch", or "update".
    kubectl describe clusterrole robusta-runner-cluster-role
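
For orientation, a read-only ClusterRole would look roughly like the following when fetched with ``kubectl get clusterrole robusta-runner-cluster-role -o yaml``. This is an abbreviated, illustrative sketch rather than captured output; it assumes the Helm release is named ``robusta`` and shows only two of the rules:

.. code-block:: yaml

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: robusta-runner-cluster-role
    rules:
      - apiGroups: [""]
        resources: ["pods", "pods/log"]   # abbreviated for illustration
        verbs: ["get", "list", "watch"]   # read verbs only
      - apiGroups: ["apps"]
        resources: ["deployments"]
        verbs: ["get", "list", "watch"]

Any rule whose ``verbs`` list includes ``create``, ``update``, ``patch``, ``delete``, or ``*`` suggests that the override was not applied.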

Testing Write Protection
------------------------

Use ``kubectl auth can-i`` to confirm what the runner service account can and cannot do
(replace ``robusta-system`` with your release namespace):

.. code-block:: bash

    SA=system:serviceaccount:robusta-system:robusta-runner-service-account

    kubectl auth can-i list pods --as="$SA" -n default          # -> yes
    kubectl auth can-i delete pods --as="$SA" -n default        # -> no
    kubectl auth can-i patch deployments --as="$SA" -n default  # -> no
    kubectl auth can-i create pods/exec --as="$SA" -n default   # -> no

The read verbs should return ``yes`` while all write/exec verbs return ``no``, confirming the runner is read-only.

Notes and Recommendations
--------------------------

- **CRD Permissions**: If you have custom operators (Argo, Flux, Kafka, KEDA, etc.), add their CRD groups to the read-only rules above with only ``get``, ``list``, ``watch`` verbs
- **Performance**: Read-only mode may improve performance slightly since no write operations are performed
- **Logging**: Monitor Robusta logs for any "permission denied" errors to identify features that require write access
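
As one illustration of the CRD note above, the following sketch appends a read-only rule for KEDA to ``runner.overrideClusterRoles``. The resource names are assumptions based on KEDA's ``keda.sh`` API group; confirm them for your installed version, for example with ``kubectl api-resources --api-group=keda.sh``:

.. code-block:: yaml

    runner:
      overrideClusterRoles:
        # ... keep the existing read-only rules from the example above ...

        # KEDA (optional) - read-only; resource names are assumptions, verify locally
        - apiGroups:
            - keda.sh
          resources:
            - scaledobjects
            - scaledjobs
            - triggerauthentications
          verbs:
            - get
            - list
            - watch

The same pattern applies to other operators: one rule per API group, limited to ``get``, ``list``, and ``watch``.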
4 changes: 4 additions & 0 deletions helm/robusta/templates/runner-service-account.yaml
@@ -5,6 +5,9 @@ metadata:
name: {{ include "robusta.fullname" . }}-runner-cluster-role
namespace : {{ .Release.Namespace }}
rules:
{{- if .Values.runner.overrideClusterRoles }}
{{ toYaml .Values.runner.overrideClusterRoles | indent 2 }}
{{- else }}
{{- if .Values.runner.customClusterRoleRules }}
{{ toYaml .Values.runner.customClusterRoleRules | indent 2 }}
{{- end }}
@@ -543,6 +546,7 @@ rules:
- list
- watch
{{- end }}
{{- end }} {{/* end of overrideClusterRoles if/else — when overrideClusterRoles is set, it replaces the built-in rules */}}

---
apiVersion: v1
4 changes: 4 additions & 0 deletions helm/robusta/values.yaml
@@ -744,7 +744,11 @@ runner:
  tolerations: []
  annotations: {}
  nodeSelector: ~
  # Legacy: rules here are ADDED to the built-in runner ClusterRole rules (backwards compatible).
  customClusterRoleRules: []
  # When set, these rules fully REPLACE the built-in runner ClusterRole rules (built-ins and
  # customClusterRoleRules are omitted). Use for a read-only runner. Empty = built-in behavior.
  overrideClusterRoles: []
  # set to override global.imagePullSecrets for the runner; leave empty to inherit the global
  imagePullSecrets: []
  extraVolumes: []