How to use Pod Security Standards in a Kubernetes Cluster


In this article we will see how to enable and use Pod Security Standards in a Kubernetes cluster. We will enable the Pod Security admission controller in the Kubernetes API server and create a namespace that enforces one of the Pod Security Standard isolation levels. We will also see how to record these security restriction events by enabling an audit log policy in the cluster.

Test Environment

Fedora 36 server
Kubernetes Cluster v1.25.2 (1 master + 1 worker node)

What are Pod Security Standards

Pod Security Standards play an important role in configuring Kubernetes cluster security at the Pod and container level. They provide a set of policies that control how much access Pods and their containers get to the host, ranging from highly permissive to highly restrictive.

Privileged: Unrestricted policy, providing the widest possible level of permissions. This policy allows for known privilege escalations.
Baseline: Minimally restrictive policy which prevents known privilege escalations. Allows the default (minimally specified) Pod configuration.
Restricted: Heavily restricted policy, following current Pod hardening best practices.

Kubernetes offers a built-in Pod Security admission controller to enforce the Pod Security Standards. Pod security restrictions are applied at the namespace level when pods are created.


Step1: Enable Pod Security Admission Controller for Pod Security Standards

As a first step we need to make sure the PodSecurity feature gate is enabled on the kube-apiserver. In v1.25.x the Pod Security admission controller is stable and the feature gate is enabled by default as per the documentation, so no change is needed on this cluster.

On a Kubernetes cluster running v1.22.x, where the feature was still alpha, we need to enable this feature gate explicitly in the kube-apiserver manifest as shown below. From v1.23 onwards it is enabled by default. Once the manifest is updated, the API server pod will be restarted.

[root@kubemaster manifests]# cat kube-apiserver.yaml 
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    ...
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    - --feature-gates=PodSecurity=true
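
Whether the flag above is needed depends on the cluster version. The helper below is a small sketch of that decision. The function name podsecurity_gate_status is made up for this article, and the version boundaries (alpha in v1.22, beta in v1.23, stable in v1.25) are an assumption based on the upstream feature gate history.

```shell
# Hypothetical helper: report how the PodSecurity feature gate behaves
# for a given Kubernetes minor version (e.g. 25 for v1.25.x).
podsecurity_gate_status() {
  minor="$1"
  if [ "$minor" -lt 22 ]; then
    echo "not available"
  elif [ "$minor" -eq 22 ]; then
    echo "alpha: add --feature-gates=PodSecurity=true"
  elif [ "$minor" -lt 25 ]; then
    echo "beta: enabled by default"
  else
    echo "stable: enabled, no flag needed"
  fi
}

podsecurity_gate_status 25
```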

Once the Pod Security Admission Controller is enabled it places requirements on a Pod’s Security Context and other related fields according to the three levels defined by the Pod Security Standards: privileged, baseline, and restricted.

Step2: Create a new namespace with control mode and isolation level

As mentioned above, Pod security restrictions are applied at the namespace level when pods are created.

First we need to decide on the isolation level that we want to use for the pods (i.e. privileged, baseline or restricted). Once that is decided, we need to choose the control mode to set at the namespace level (i.e. enforce, audit or warn). Whenever a pod violates any of the policy restrictions, depending on the control mode the request will either be rejected, recorded as an audit event in audit.log, or trigger a warning for the end user. Here are the control modes that the admission controller supports.

enforce: Policy violations will cause the pod to be rejected.
audit: Policy violations will trigger the addition of an audit annotation to the event recorded in the audit log, but are otherwise allowed.
warn: Policy violations will trigger a user-facing warning, but are otherwise allowed.

In this step we will create a new namespace, podsecurityns, for testing the Pod Security Standard features and restrictions. We will be using enforce mode with the restricted level for this namespace, and the audit and warn modes are set to the same level. For the modes and level to take effect we need to set a group of labels on the namespace.

Here is the yaml definition file that creates the namespace with the pod security level and control mode labels.

[admin@kubemaster podsecuritystandards]$ cat createpodsecurityns.yml 
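apiVersion: v1
kind: Namespace
metadata:
  creationTimestamp: null
  name: podsecurityns
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
spec: {}
status: {}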

Now let’s apply this yaml definition file to create the secure namespace.

[admin@kubemaster podsecuritystandards]$ kubectl apply -f createpodsecurityns.yml 
namespace/podsecurityns created
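
The same kind of labels can also be set on a namespace that already exists. The sketch below only builds the label arguments. psa_labels is a hypothetical helper name, and the label keys are assumed to follow the pod-security.kubernetes.io/<mode> and pod-security.kubernetes.io/<mode>-version pattern used by the Pod Security admission controller.

```shell
# Hypothetical helper: print the Pod Security label arguments for one
# control mode (enforce, audit or warn) and one level.
psa_labels() {
  mode="$1"; level="$2"; version="${3:-latest}"
  case "$mode" in
    enforce|audit|warn) ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  case "$level" in
    privileged|baseline|restricted) ;;
    *) echo "unknown level: $level" >&2; return 1 ;;
  esac
  echo "pod-security.kubernetes.io/${mode}=${level}"
  echo "pod-security.kubernetes.io/${mode}-version=${version}"
}

psa_labels warn baseline
```

The output can be handed to kubectl, for example kubectl label --overwrite namespace podsecurityns $(psa_labels warn baseline), which needs access to a running cluster.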

Step3: Create a Pod violating a security restriction

Now let us try to create a pod whose securityContext sets allowPrivilegeEscalation to true in our secure namespace, podsecurityns.

[admin@kubemaster podsecuritystandards]$ cat privilegedpod.yml 
apiVersion: v1
kind: Pod
metadata:
  name: privilegedpod-demo
  namespace: podsecurityns
spec:
  containers:
  - name: privilegedpod-demo
    image: busybox:1.28
    command: [ "sh", "-c", "sleep 1h" ]
    securityContext:
      allowPrivilegeEscalation: true

As the error below shows, the pod is rejected because it violates the PodSecurity restricted policy that we enforce on this namespace. The event should also be written to the audit.log file, since we set the audit control mode to the restricted level as well.

[admin@kubemaster podsecuritystandards]$ kubectl apply -f privilegedpod.yml 
Error from server (Forbidden): error when creating "privilegedpod.yml": pods "privilegedpod-demo" is forbidden: violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "privilegedpod-demo" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "privilegedpod-demo" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "privilegedpod-demo" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "privilegedpod-demo" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")

But for the audit logs to be generated, we need to configure our API server with an audit policy so that events are written to a log file on the host machine. The next step enables auditing in the Kubernetes API server.
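
Coming back to the rejection itself: each clause in the error message names a securityContext field that the restricted level expects. As an illustration, the snippet below writes a manifest that sets all four. The file name restrictedpod.yml, the pod name restrictedpod-demo and the runAsUser value of 1000 are assumptions made for this example. The busybox image runs as root by default, so a non-root UID has to be chosen for runAsNonRoot to hold.

```shell
# Write a pod manifest that sets the four fields named in the error message.
# File name, pod name and UID 1000 are illustrative assumptions.
cat > restrictedpod.yml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: restrictedpod-demo
  namespace: podsecurityns
spec:
  containers:
  - name: restrictedpod-demo
    image: busybox:1.28
    command: [ "sh", "-c", "sleep 1h" ]
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
      runAsNonRoot: true
      runAsUser: 1000
      seccompProfile:
        type: RuntimeDefault
EOF
```

Applying it with kubectl apply -f restrictedpod.yml should pass the restricted checks listed in the error, although other policies on a given cluster may still apply.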

Step4: Enable audit logging for Kubernetes API server

Kubernetes auditing provides a security-relevant, chronological set of records documenting the sequence of actions in a cluster. The policy determines what’s recorded and the backends persist the records.

Create an audit policy yaml definition file

[root@kubemaster kubernetes]# pwd
[root@kubemaster kubernetes]# cat audit-policy.yaml 
apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]

  # Don't log requests to a configmap called "controller-leader"
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]

  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]

  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"

  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]

  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]

  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
    - group: "" # core API group
    - group: "extensions" # Version of group should NOT be included.

  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"
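
The policy above is a general-purpose sample. For this walkthrough a much smaller policy may be enough. The sketch below is an alternative built on assumptions, not the policy used in this article: it records pod requests in podsecurityns at the Metadata level and drops everything else, relying on the behaviour that the first matching rule decides the audit level. The file name audit-policy-minimal.yaml is made up.

```shell
# Write a minimal audit policy (hypothetical file name).
# Rules are evaluated in order; the first match sets the audit level.
cat > audit-policy-minimal.yaml <<'EOF'
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - "RequestReceived"
rules:
  # Record metadata for pod requests in podsecurityns.
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods"]
    namespaces: ["podsecurityns"]
  # Drop everything else.
  - level: None
EOF
```

To try it, the file would have to be placed where --audit-policy-file points, which in this setup is /etc/kubernetes/audit-policy.yaml.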

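Configure kube-apiserver with the audit policy file and the backend log path. Here is the updated kube-apiserver.yaml definition file with audit logging enabled. The relevant changes are the --audit-policy-file and --audit-log-path flags, plus the audit and audit-log volume mounts and hostPath volumes that make the policy file and the log directory available to the API server pod.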

[root@kubemaster manifests]# cat kube-apiserver.yaml 
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --enable-bootstrap-token-auth=true
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --secure-port=6443
    - --service-account-issuer=https://kubernetes.default.svc.cluster.local
    - --service-account-key-file=/etc/kubernetes/pki/
    - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    - --audit-log-path=/var/log/kubernetes/audit/audit.log
    #- --feature-gates=PodSecurity=true
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        path: /livez
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-apiserver
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /readyz
        port: 6443
        scheme: HTTPS
      periodSeconds: 1
      timeoutSeconds: 15
    resources:
      requests:
        cpu: 250m
    startupProbe:
      failureThreshold: 24
      httpGet:
        path: /livez
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/kubernetes/audit-policy.yaml
      name: audit
      readOnly: true
    - mountPath: /var/log/kubernetes/audit/
      name: audit-log
      readOnly: false
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/pki
      name: etc-pki
      readOnly: true
    - mountPath: /etc/kubernetes/pki
      name: k8s-certs
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/kubernetes/audit-policy.yaml
      type: File
    name: audit
  - hostPath:
      path: /var/log/kubernetes/audit/
      type: DirectoryOrCreate
    name: audit-log
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/pki
      type: DirectoryOrCreate
    name: etc-pki
  - hostPath:
      path: /etc/kubernetes/pki
      type: DirectoryOrCreate
    name: k8s-certs
status: {}

Ensure that kubernetes cluster nodes are in ready state after the kube-apiserver manifest changes as shown below.

[admin@kubemaster podsecuritystandards]$ kubectl get nodes
NAME         STATUS   ROLES           AGE   VERSION
kubemaster   Ready    control-plane   20d   v1.25.2
kubenode     Ready    <none>          20d   v1.25.2

NOTE: The audit logging feature increases the memory consumption of the API server because some context required for auditing is stored for each request. Memory consumption depends on the audit logging configuration.

Now if you apply the same privilegedpod.yml from Step3 again, you should see audit log events being generated as shown below.

[root@kubemaster audit]# pwd

[root@kubemaster audit]# tail -f audit.log | grep podsecurityns
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"RequestResponse","auditID":"1a07f83e-2a55-4118-a49b-8d088e5f9ce0","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/podsecurityns/pods/privilegedpod-demo","verb":"get","user":{"username":"kubernetes-admin","groups":["system:masters","system:authenticated"]},"sourceIPs":[""],"userAgent":"kubectl/v1.25.2 (linux/amd64) kubernetes/5835544","objectRef":{"resource":"pods","namespace":"podsecurityns","name":"privilegedpod-demo","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","message":"pods \"privilegedpod-demo\" not found","reason":"NotFound","details":{"name":"privilegedpod-demo","kind":"pods"},"code":404},"responseObject":{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"privilegedpod-demo\" not found","reason":"NotFound","details":{"name":"privilegedpod-demo","kind":"pods"},"code":404},"requestReceivedTimestamp":"2022-10-17T08:21:01.703142Z","stageTimestamp":"2022-10-17T08:21:01.705514Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Request","auditID":"6783f4f8-b57b-41e8-a2e8-69e2e4477f8e","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/podsecurityns","verb":"get","user":{"username":"kubernetes-admin","groups":["system:masters","system:authenticated"]},"sourceIPs":[""],"userAgent":"kubectl/v1.25.2 (linux/amd64) kubernetes/5835544","objectRef":{"resource":"namespaces","namespace":"podsecurityns","name":"podsecurityns","apiVersion":"v1"},"responseStatus":{"metadata":{},"code":200},"requestReceivedTimestamp":"2022-10-17T08:21:01.706872Z","stageTimestamp":"2022-10-17T08:21:01.709100Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Request","auditID":"37f07ad1-4f58-4727-b74f-e04fefccf475","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/podsecurityns/limitranges","verb":"list","user":{"username":"system:apiserver","uid":"ef99bcc7-150a-41b2-9dfb-07eac5a110f4","groups":["system:masters"]},"sourceIPs":["::1"],"userAgent":"kube-apiserver/v1.25.2 (linux/amd64) kubernetes/5835544","objectRef":{"resource":"limitranges","namespace":"podsecurityns","apiVersion":"v1"},"responseStatus":{"metadata":{},"code":200},"requestReceivedTimestamp":"2022-10-17T08:21:01.711588Z","stageTimestamp":"2022-10-17T08:21:01.713037Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
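
The grep above matches every event that mentions the namespace. To narrow the output to events that the Pod Security admission controller annotated, one option is to search for its annotation keys. The sketch below assumes the documented keys pod-security.kubernetes.io/audit-violations and pod-security.kubernetes.io/enforce-policy. The function name psa_audit_events is hypothetical.

```shell
# Hypothetical helper: print audit events that carry a Pod Security
# annotation. $1 is the path to a JSON-lines audit log.
psa_audit_events() {
  grep -E '"pod-security\.kubernetes\.io/(audit-violations|enforce-policy)"' "$1"
}
```

On the control-plane node this could be called as psa_audit_events /var/log/kubernetes/audit/audit.log, using the log path configured in Step4.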

Step5: Create a new namespace without any Pod Security Standard restrictions

[admin@kubemaster podsecuritystandards]$ kubectl create ns standardns
namespace/standardns created

Step6: Create a Pod violating a security restriction in standardns namespace

Now let us try to create the same pod, with allowPrivilegeEscalation set to true in its securityContext, in the standardns namespace. Since this namespace carries no Pod Security labels, the pod is admitted and runs.

[admin@kubemaster podsecuritystandards]$ cat privilegedpodstandardns.yml 
apiVersion: v1
kind: Pod
metadata:
  name: privilegedpod-demo
  namespace: standardns
spec:
  containers:
  - name: privilegedpod-demo
    image: busybox:1.28
    command: [ "sh", "-c", "sleep 1h" ]
    securityContext:
      allowPrivilegeEscalation: true
[admin@kubemaster podsecuritystandards]$ kubectl apply -f privilegedpodstandardns.yml 
pod/privilegedpod-demo created
[admin@kubemaster podsecuritystandards]$ kubectl get pods -n standardns
NAME                 READY   STATUS    RESTARTS   AGE
privilegedpod-demo   1/1     Running   0          16s
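
Before adding Pod Security labels to a namespace that already runs workloads, such as standardns, it can help to preview the effect. The sketch below wraps the server-side dry run of kubectl label. psa_preview is a hypothetical helper name, and the KUBECTL variable is an assumption added here so that the command can be swapped out.

```shell
# Hypothetical helper: preview enforcing a Pod Security level on a namespace.
# --dry-run=server evaluates the change on the API server without saving it.
psa_preview() {
  ns="$1"; level="$2"
  "${KUBECTL:-kubectl}" label --dry-run=server --overwrite namespace "$ns" \
    "pod-security.kubernetes.io/enforce=${level}"
}
```

Running psa_preview standardns restricted against the cluster should report warnings for existing pods, such as privilegedpod-demo, that would violate the level, without changing the namespace.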

Hope you enjoyed reading this article. Thank you.