Thinking that you managed to escape security? Guess again.
In this post, I will focus on containers with Azure Kubernetes in mind but these examples apply to all the worlds out there -EKS, GKE, and on-premises.
Containers gained popularity because they made publishing and updating applications a straightforward process. You can go from build to release very fast with little overhead, whether it’s the cloud or on-premises, but there’s still more to do.
The adoption of containers in recent years has skyrocketed in such a way that we see enterprises in banking, public sector, and health looking into or are already deploying containers. The main difference here would be that these companies have a more extensive list of security requirements and they need to be applied to everything. Containers included.
Orchestrator security
This part will focus on orchestrator security. What you can do to harden your orchestrators from malicious intent.
Keep your Kubernetes cluster as updated as possible.
This might sound obvious for some of you but it’s better to say it. Kubernetes has a support cycle of three major versions. Meaning that they will only support only the latest three major versions, anything less and you’re out of support. They also keep release branches for three minor versions as well. Keep that in mind when you’re doing cluster maintenance.
Patch your master and data nodes
Say what? Yup, your read that correctly, if you’re using a managed offering like Azure Kubernetes Service or Elastic Kubernetes Service then you have to patch your worker nodes. If you’re running KOTS, AKSEngine or any other flavor of Kubernetes in VMs (be it cloud or on-premises) then you have to do patch management for the master nodes as well.
A solution for this problem is to install a daemonset called Kured (Kubernetes Reboot Daemon) which performs automatic node reboots when they require it. When the package manager sees that updates are available for the installed packages then it adds a file in /var/run/ called reboot-required and Kured looks for that. Once it sees that the nodes require a reboot it will cordon and drain the nodes and uncordon them after.
Use namespaces
Using namespaces doesn’t add a big layer of protection but they surely add a layer of segregation between the pods. By using namespaces and not throwing everything in the default namespace adds a bit towards security and reduced kubectl get pods clutter.
Enable RBAC and give permissions as strictly as possible
If you already have Kubernetes cluster deployed then tough luck; You have to dismantle and redeploy them. You wanted bleeding edge? There you go ? Jokes aside, having RBAC enabled is a major win towards security because you’re not giving full cluster-admin access on the Kubernetes cluster. This is a good time to polish your ARM or Terraform skills and start treating your clusters as cattle not pets.
# Create a role
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
namespace: default
name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
resources: ["pods"]
verbs: ["get", "watch", "list"]
# Apply RBAC to a user
apiVersion: rbac.authorization.k8s.io/v1
# This role binding allows "John" to read pods in the "default" namespace.
# You need to already have a Role named "pod-reader" in that namespace.
kind: RoleBinding
metadata:
name: read-pods
namespace: default
subjects:
# You can specify more than one "subject"
- kind: User
name: John # "name" is case sensitive
apiGroup: rbac.authorization.k8s.io
roleRef:
# "roleRef" specifies the binding to a Role / ClusterRole
kind: Role #this must be Role or ClusterRole
name: pod-reader # this must match the name of the Role or ClusterRole you wish to bind to
apiGroup: rbac.authorization.k8s.io
Leverage AAD integration with AKS
This recommendation goes hand in hand with the RBAC enabled clusters. This feature works in Azure Kubernetes clusters and again this is an on-create feature. By integrating your AKS cluster with Azure Active Directory, you will have very clear control of who is accessing the resources and you also the get added benefit of AAD protection, MFA support and everything else that Microsoft adds in it ?
Get rid of Helm 2 when Helm 3 releases
Yes, yes I know that Helm 3 is not here yet but listen up, once it goes GA, migrate as fast as possible towards it and don’t look behind. Helm 2 is not a bad package manager but that damned tiller pod should die in a fire. As you may know, tiller never gets initiated with TLS enabled or any type of security. You’re getting a prompt that you should that but nobody does that. Tiller is basically the Helm Server. With Helm 3, tiller goes in the corner to die off. Good riddance.
Use Network Policy for your pods
By default, all deployed pods can talk with each other, it doesn’t matter if they are segregated in different namespaces or not. This is the beauty of container orchestrators as they give you all the infrastructure you need and you just publish your code. The downside is that if a container gets a malicious infiltration then all other pods are left vulnerable. This is the same concept as in VM security, you just need only one machine to be vulnerable because after that you can do as much lateral movement as you want.
For this to work, you need to have CNI initiated. In Azure for AKS, you need to deploy the cluster with advanced networking enabled or in the CLI add the flag –network-plugin azure
Take a look at AppArmor and Secure Computing (SecComp)
Security features like the ones I referenced above and AppArmor, SecComp are additional levels of security that transform the Kubernetes environments in hostile environments for attackers. In AKS AppArmor is enabled by default but you can create profiles that restrict the pods even more. With an AppArmor profile, you can block actions as read, write, execute or system mount which limits, even more, the possibilities of a malicious actor.
If you want to leverage App Armor profiles then SSH to your AKS node and create a file named deny-write.profile and paste the following code
The pod will instantiate correctly but if you want to do something funky with file write, then it’s not going to work.
Secure Computing or SECComp is another security module that exists in the AKS Cluster nodes and enabled by default. With SECComp you can specify filters for what a POD cannot do and work from there. An example would be like the one from below
Create a file named prevent-chmod in /var/lib/kubelet/seccomp/prevent-chmod . It will be loaded immediately
That should be enough for orchestrator security, next on the list are containers
Container security
This part will focus on container security. What you should do to keep your containers as secure as possible without causing your application go die off a horrible death.
Use Distroless or use lightweight images e.g., alpine
If you’ve never heard of Distroless, it’s a container image built by Google that doesn’t contain an operating system. All the system-specific programs like package managers, shell, networking stuff and so on do not exist in the Distroless image. Using an image like this reduces the security impact tenfold. You have a lower attack surface, fewer image vulnerabilities, you gain true immutability and so on, I suggest you give them a try ?
Remove package managers and network utilities
If Distroless is too hardcore for you then start with alpine images and remove the package managers, network utilities and shell access. Just by doing this, you’re going to get a more secure container. Name of the game? Reduce the attack surface. Go.
Remove files system modification utilities (chmod, chown)
It’s not enough to gain access to the container. You need to actually run some shell scripts and execute other types of commands. If the attack doesn’t have chmod or chown, their life is hell. If you combine this with the previous recommendation then you’re golden.
Scan your containers images for vulnerabilities (Aqua Security, Twistlock, etc.)
This is a no-brainer in production systems. If you’re scanning VMs for malicious intent then you should be scanning your containers as well. Containers, as well as VMs, are meant to run applications. Containers are by fact smaller in size when compared to a VM but that doesn’t mean that you should just trust your gut feeling and go with it in production. Get a container scanning solution and implement it in your environment. The more information you have about your containers, the better.
Keep an eye out for ephemeral containers in Kubernetes – bind a container to an existing pod for debugging purposes
This is an awesome feature that’s come out in alpha with Kubernetes 1.16. Can’t wait for it to be available in Azure ?
Basically ephemeral containers allow the user to attach a container to an existing pod in order to debug it. If you use ephemeral containers in tandem with Distroless containers then you’re in a win-win scenario. You gain the benefit of security while retaining the debugging capabilities. albeit in other containers.
By the way. A pod is not equal to a container. A pod can have multiple containers under it but it’s not a frequent practice.
You don’t need root access inside the container. By default, all containers run as root and this introduces a few security concerns because file system permissions do not apply to the root user and cherry on top the root user and read and write file on the file system, change stuff in the container, creating reverse shells and things like that.
In Kubernetes, you can enforce containers to run in user-mode by using the pod & container security context.
By default, the filesystem of a given container is read/write enabled but should why should we be writing files in a container? Cache files or temporary files are ok but you lose everything when the container dies so, to be honest, you shouldn’t need to write anything or import anything in a container. Dev/Test doesn’t apply here.
Set the FileSystem in read-only mode and be safer. If you need to write temp files fine, mount a /tmp folder. Example below:
If you want a TL;DR, here it is. Containers require security controls around them and they need to be built with security in mind. There’s no easy way around it, we need to be security conscious otherwise bad things will happen.
If you’re going to implement any of the recommendations from above, please take in mind that everything needs to be tested before deploying in production.
That’s kinds it. I hope all the recommendations from above are helpful and as always, have a good one!
Node image retirement is not an emergency if you treat it as a predictable operating model. How to migrate images without chaos, what to test, and how to govern the decision.
APIM isn't just a gateway. It's a governance layer that enforces consistency across AKS, Container Apps, and other platforms. When to use it and when to keep things simple.
Network policy is not theoretical; Cilium and eBPF make it practical. Learn when segmentation actually matters, how to observe before you enforce, and why most teams get it wrong at first.