2 changes: 2 additions & 0 deletions docs/setup-robusta/index.rst
Original file line number Diff line number Diff line change
@@ -18,6 +18,8 @@
tuning-performance
configuration-secrets
openshift
read-only-service-account
rbac-namespace-scoping
node-selector
proxies
privacy-and-security
35 changes: 31 additions & 4 deletions docs/setup-robusta/openshift.rst
@@ -49,11 +49,38 @@ Some lesser used Robusta Classic features require more permissions than the base

In order to support the ``python_debugger``, ``java_debugger`` and ``node_disk_analyzer``
playbooks, permission to run a far more privileged container needs to be granted to
the ``runner`` service account. This container has ``SYS_ADMIN`` capabilities and must
run as root on the node.
the ``runner`` service account. This container runs privileged with the ``SYS_ADMIN`` and
``SYS_PTRACE`` capabilities. The privileged SCC uses ``runAsUser: RunAsAny``, so it does not force
a specific user; the debug container typically runs as root in order to attach to and inspect other
processes on the node.

**Important**: These capabilities are **OPTIONAL** and only needed for the native debugging features mentioned above. Most Robusta deployments work fine with the baseline SCC.

Baseline SCC is Sufficient For:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- ✅ All investigations and diagnostics
- ✅ KRR scans (resource right-sizing)
- ✅ Popeye scans (cluster analysis)
- ✅ Log analysis and enrichment
- ✅ Metrics and event analysis
- ✅ Alert correlation
- ✅ Pod restart and scaling
- ✅ Deployment patching
- ✅ All standard playbooks

Privileged SCC Only Needed For:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- ❌ Python debugger (``python_debugger`` playbook)
- ❌ Java debugger (``java_debugger`` playbook)
- ❌ Node disk analyzer (``node_disk_analyzer`` playbook)

Enabling the Privileged SCC
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To support these features in a production environment, you may want to only temporarily
enable this permission so that a normal request cannot bypass the the less permissive SCC found
enable this permission so that a normal request cannot bypass the less permissive SCC found
in the baseline. To enable these privileged operations in your OpenShift environment,
update the ``generated_values.yaml`` as follows:

@@ -62,7 +89,7 @@ update the ``generated_values.yaml`` as follows:
openshift:
enabled: true
createScc: true
createPrivilegedScc: true
createPrivilegedScc: true # Optional - only if you need debugging features

You may also reference an existing SCC using the ``openshift.privilegedSccName`` value.
In test environments, you can reference the ``privileged`` SCC to enable these features in your
53 changes: 53 additions & 0 deletions docs/setup-robusta/rbac-namespace-scoping.rst
@@ -0,0 +1,53 @@
.. _rbac-namespace-scoping:

RBAC: Namespace-Scoped Deployments
========================================

By default, Robusta and HolmesGPT use cluster-wide RBAC: they are granted a ``ClusterRole`` bound with a
``ClusterRoleBinding``, so they can read (and, for the runner, act on) resources in every namespace.

In restricted environments you may want to limit what HolmesGPT can see to a specific set of namespaces.
This guide explains how, and what the trade-offs are.

Scoping HolmesGPT to Specific Namespaces
----------------------------------------

Set ``holmes.overrideClusterRoles`` to the list of namespaces HolmesGPT is allowed to access. Instead of a
cluster-wide ``ClusterRoleBinding``, the chart then creates a namespaced ``RoleBinding`` in each listed
namespace (reusing the same ClusterRole for its rules):

.. code-block:: yaml

    holmes:
      overrideClusterRoles:
        - default
        - monitoring

When this list is empty (the default), HolmesGPT keeps its cluster-wide binding — existing installs are
unaffected.

.. important::

    - The listed namespaces **must already exist**; the chart does not create them.
    - Access is limited to **namespaced** resources in those namespaces. **Cluster-scoped** resources
      (for example ``nodes``, ``persistentvolumes``, cluster-level events) are no longer readable, so
      tools that rely on them (node health, cluster-wide resource views) will not work.

Verifying the Scope
-------------------

.. code-block:: bash

    # Replace <release-namespace> with the namespace Robusta is installed in
    # before running; the angle brackets are a placeholder, not shell syntax.
    SA=system:serviceaccount:<release-namespace>:robusta-holmes-service-account

    kubectl auth can-i list pods --as=$SA -n default      # -> yes
    kubectl auth can-i list pods --as=$SA -n monitoring   # -> yes
    kubectl auth can-i list pods --as=$SA -n kube-system  # -> no

Notes on the Runner
-------------------

The Robusta runner remains cluster-wide. To reduce the runner's permissions, use
:ref:`a read-only ClusterRole <read-only-service-account>` via ``runner.overrideClusterRoles``.
Fully scoping the runner to a subset of namespaces is not supported through Helm values, because the
runner watches cluster-wide resources and events to function.
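
As a rough sketch, the two settings can sit side by side in one values file. Note that they take
different shapes: ``holmes.overrideClusterRoles`` is a list of namespace names, while
``runner.overrideClusterRoles`` is a list of RBAC rules. The namespaces and the single rule below are
illustrative placeholders, not a recommended set.

.. code-block:: yaml

    holmes:
      # Namespaces HolmesGPT may read (placeholder names).
      overrideClusterRoles:
        - default
        - monitoring

    runner:
      # RBAC rules that replace the built-in runner ClusterRole rules.
      # Deliberately minimal here; a realistic read-only rule set is much longer.
      overrideClusterRoles:
        - apiGroups: [""]
          resources: ["pods", "pods/log", "events"]
          verbs: ["get", "list", "watch"]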
257 changes: 257 additions & 0 deletions docs/setup-robusta/read-only-service-account.rst
@@ -0,0 +1,257 @@
.. _read-only-service-account:

Read-Only Service Account
========================================

By default, Robusta's runner service account has permissions to create, update, and delete Kubernetes resources. This guide explains how to restrict the runner to read-only permissions for environments where you want to prevent any modifications to cluster resources.

Why Read-Only Mode?
-------------------

Read-only mode is useful in scenarios where you want to:

- **Prevent accidental modifications**: Ensure that even if a playbook or investigation logic has a bug, no cluster resources will be modified
- **Comply with security policies**: Meet organizational requirements for read-only access in certain environments
- **Prevent node operations**: Prevent users from draining or restarting nodes through investigations
- **Audit-only mode**: Run Holmes for investigation and diagnostics without remediation capabilities

Limitations of Read-Only Mode
-----------------------------

When using read-only permissions, the following Robusta features will not be available:

- **Auto-remediation**: Playbooks that automatically fix issues (restart pods, scale deployments, drain nodes, etc.)
- **Silence management**: Creating or deleting alert silences
- **Pod debugging**: Live debugging tools that require container execution
- **Resource modification**: Any playbook or action that modifies Kubernetes resources

These features require write permissions and fail gracefully if attempted with a read-only service account.

**Read-only mode is ideal for**: Investigation, diagnostics, log analysis, metric enrichment, and reporting.

Implementation: Using overrideClusterRoles
-------------------------------------------

Robusta's Helm chart supports the ``runner.overrideClusterRoles`` parameter. When set, the rules you
provide **fully replace** the built-in runner ClusterRole rules, so only the permissions you list are granted.

.. note::

    Do not confuse this with ``runner.customClusterRoleRules``. That parameter *adds* rules on top of the
    built-in rules (which include write verbs), so it **cannot** be used to make the runner read-only.
    Use ``runner.overrideClusterRoles`` for read-only mode.
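
The sketch below illustrates the precedence, going by the template change in
``runner-service-account.yaml``, where ``customClusterRoleRules`` is only rendered in the ``else``
branch. If both parameters are set, only ``overrideClusterRoles`` appears to take effect. The rules
shown are arbitrary examples chosen to make the difference visible.

.. code-block:: yaml

    runner:
      # Rendered: these rules become the entire runner ClusterRole.
      overrideClusterRoles:
        - apiGroups: [""]
          resources: ["pods"]
          verbs: ["get", "list", "watch"]
      # Not rendered while overrideClusterRoles is set, because the template's
      # else branch (custom rules plus built-in rules) is skipped.
      customClusterRoleRules:
        - apiGroups: [""]
          resources: ["secrets"]
          verbs: ["get"]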

To use read-only mode, create a custom values file (for example ``read-only-values.yaml``) with the following configuration:

.. code-block:: yaml

    runner:
      overrideClusterRoles:
        # Core API resources - read-only
        - apiGroups:
            - ""
          resources:
            - configmaps
            - daemonsets
            - deployments
            - events
            - namespaces
            - persistentvolumes
            - persistentvolumeclaims
            - pods
            - pods/status
            - pods/log
            - replicasets
            - replicationcontrollers
            - services
            - serviceaccounts
            - endpoints
          verbs:
            - get
            - list
            - watch

        # Nodes - read-only
        - apiGroups:
            - ""
          resources:
            - nodes
          verbs:
            - get
            - list
            - watch

        # Apps API - read-only
        - apiGroups:
            - apps
          resources:
            - daemonsets
            - deployments
            - deployments/scale
            - replicasets
            - replicasets/scale
            - statefulsets
          verbs:
            - get
            - list
            - watch

        # Batch API - read-only
        - apiGroups:
            - batch
          resources:
            - cronjobs
            - jobs
          verbs:
            - get
            - list
            - watch

        # Autoscaling - read-only
        - apiGroups:
            - autoscaling
          resources:
            - horizontalpodautoscalers
          verbs:
            - get
            - list
            - watch

        # RBAC - read-only
        - apiGroups:
            - rbac.authorization.k8s.io
          resources:
            - clusterroles
            - clusterrolebindings
            - roles
            - rolebindings
          verbs:
            - get
            - list
            - watch

        # Networking - read-only
        - apiGroups:
            - networking.k8s.io
          resources:
            - ingresses
            - networkpolicies
          verbs:
            - get
            - list
            - watch

        # Events - read-only
        - apiGroups:
            - events.k8s.io
          resources:
            - events
          verbs:
            - get
            - list

        # CRDs - read-only
        - apiGroups:
            - apiextensions.k8s.io
          resources:
            - customresourcedefinitions
          verbs:
            - list
            - get

        # API Registration - read-only
        - apiGroups:
            - apiregistration.k8s.io
          resources:
            - apiservices
          verbs:
            - get
            - list

        # Policy - read-only
        - apiGroups:
            - policy
          resources:
            - poddisruptionbudgets
            - podsecuritypolicies
          verbs:
            - get
            - list

        # Monitoring (optional) - read-only
        - apiGroups:
            - monitoring.coreos.com
          resources:
            - prometheusrules
            - servicemonitors
            - podmonitors
            - alertmanagers
            - silences
          verbs:
            - get
            - list
            - watch

        # Argo CD (optional) - read-only
        - apiGroups:
            - argoproj.io
          resources:
            - applications
            - applicationsets
            - appprojects
            - workflows
            - workflowtemplates
            - cronworkflows
            - rollouts
            - analysisruns
            - analysistemplates
            - experiments
          verbs:
            - get
            - list
            - watch

Then install or upgrade Robusta with this values file:

.. code-block:: bash

    helm upgrade --install robusta robusta/robusta \
        -f generated_values.yaml \
        -f read-only-values.yaml \
        -n robusta-system --create-namespace

Verifying Read-Only Permissions
--------------------------------

After installation, verify that the runner service account has only read permissions. The ClusterRole is
cluster-scoped, so no namespace flag is needed:

.. code-block:: bash

    # Inspect the ClusterRole - the rules should only contain "get", "list", "watch" verbs,
    # and NOT "create", "delete", "patch", or "update".
    kubectl describe clusterrole robusta-runner-cluster-role

Testing Write Protection
------------------------

Use ``kubectl auth can-i`` to confirm what the runner service account can and cannot do
(replace ``robusta-system`` with your release namespace):

.. code-block:: bash

    SA=system:serviceaccount:robusta-system:robusta-runner-service-account

    kubectl auth can-i list pods --as=$SA -n default           # -> yes
    kubectl auth can-i delete pods --as=$SA -n default         # -> no
    kubectl auth can-i patch deployments --as=$SA -n default   # -> no
    kubectl auth can-i create pods/exec --as=$SA -n default    # -> no

The read verbs should return ``yes`` while all write/exec verbs return ``no``, confirming the runner is read-only.

Notes and Recommendations
--------------------------

- **CRD Permissions**: If you have custom operators (Argo, Flux, Kafka, KEDA, etc.), add their CRD groups to the read-only rules above with only ``get``, ``list``, ``watch`` verbs
- **Performance**: Read-only mode may improve performance slightly since no write operations are performed
- **Logging**: Monitor Robusta logs for any "permission denied" errors to identify features that require write access
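
To illustrate the CRD note above, here is a minimal sketch of one extra rule for a cluster that runs
KEDA. It assumes KEDA's ``keda.sh`` API group and the resource names shown; confirm what your cluster
actually serves with ``kubectl api-resources --api-group=keda.sh`` before relying on it. Because
``runner.overrideClusterRoles`` replaces the built-in rules, this rule must be appended to the full
read-only list rather than used on its own.

.. code-block:: yaml

    runner:
      overrideClusterRoles:
        # ... the read-only rules shown earlier go here ...

        # KEDA (optional, assumed installed) - read-only
        - apiGroups:
            - keda.sh
          resources:
            - scaledobjects
            - scaledjobs
            - triggerauthentications
          verbs:
            - get
            - list
            - watch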
4 changes: 4 additions & 0 deletions helm/robusta/templates/runner-service-account.yaml
@@ -5,6 +5,9 @@ metadata:
name: {{ include "robusta.fullname" . }}-runner-cluster-role
namespace : {{ .Release.Namespace }}
rules:
{{- if .Values.runner.overrideClusterRoles }}
{{ toYaml .Values.runner.overrideClusterRoles | indent 2 }}
{{- else }}
{{- if .Values.runner.customClusterRoleRules }}
{{ toYaml .Values.runner.customClusterRoleRules | indent 2 }}
{{- end }}
@@ -543,6 +546,7 @@ rules:
- list
- watch
{{- end }}
{{- end }} {{/* end of overrideClusterRoles if/else — when overrideClusterRoles is set, it replaces the built-in rules */}}

---
apiVersion: v1