Kubernetes API: Allocatable Node Resources?

Gerrit Riessen
6 min read · Feb 3, 2019

Obtaining an overview of allocatable resources across a cluster using the Kubernetes API. Kubectl provides this when describing nodes, so it must be possible.


Resources are central to the workings of Kubernetes. At the most fundamental level, memory and CPU are the resources of interest. These partly determine how Kubernetes spreads workload across the cluster.

A cluster has a number of nodes, and each node has an allocatable amount of CPU and memory. The sum of these allocatable resources is the absolute upper limit on the computing power the cluster can provide.

Pods are the basic unit of workload. Each pod can provide Kubernetes with a resource range consisting of a baseline (i.e., requests) and a maximum (i.e., limits). Kubernetes distributes pods across nodes according to these resource requirements: a pod is scheduled onto a node that has enough unallocated resources to cover the pod's requests.
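As a minimal illustration (the container name, image and values are made up), this is how that baseline and maximum look when building a pod spec with the official Python client, which is also used later in this post:

from kubernetes import client

# requests are the baseline the scheduler reserves for the container,
# limits are the ceiling the container may use at runtime
container = client.V1Container(
    name="worker",
    image="busybox",
    resources=client.V1ResourceRequirements(
        requests={"cpu": "250m", "memory": "256Mi"},
        limits={"cpu": "500m", "memory": "512Mi"},
    ),
)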

This means that if I know the allocatable resources available across the cluster, then I know whether a new pod will be started or not, i.e., whether there are enough resources available for starting it.

The Why.

I am interested in dynamically starting pods based on application-specific requirements, so I would like to get an overview of what resources are available within the cluster. My aim is to expand and contract my application depending on application load.

At a minimum, what I am interested in is the available allocatable resources on each node. This is something that kubectl describe nodes provides right at the end of its output:

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests         Limits
  --------           --------         ------
  cpu                1731m (89%)      2166m (112%)
  memory             3954064Ki (68%)  6679952Ki (115%)
  ephemeral-storage  0 (0%)           0 (0%)

I am not interested in current usage values, i.e., what kubectl top provides. What I need to know is whether I can spawn a new pod or not, which is a question of allocatability, not usage. Also, I can't use kubectl, since I am going to need this information from within the cluster, because that is where I am going to spawn the new pods. So I have to find a way of doing this via the API.

I also can't just blindly start new pods, since a pod remains in the Pending state until resources become available. Having many pods in the Pending state makes the cluster unstable and unpredictable, since it's unclear which pending pod will be started once resources free up. Hence, before starting a new pod, I first want to know whether there are resources available for it.

The How.

Ideally there would be a direct API call that gave me what I wanted. So I started doing some ddg’ing but couldn’t find anything. It seemed fairly clear that there wasn’t a direct and simple API solution.

Eventually I stumbled onto a Stack Overflow answer to another kubectl-related question. Basically the answer was: “I looked at the source code and noticed it did this ….” Monkey see, monkey do: I decided to do the same.

Sometimes documentation might be good, but the truth lies in the code. In this case, the truth was at GitHub.

The starting point was the part of the code where the output I was looking for is generated. From there I worked backwards, and it became clear what was happening: kubectl retrieves all relevant pods on a node and sums up their resource definitions. The summation happens in the getPodsTotalRequestsAndLimits function.

Which pods on a node count as relevant is defined by this line: basically, every pod that has neither failed nor succeeded (succeeded pods are those whose containers have exited with a zero exit code).

One other bit of magic the Go code does is convert the CPU and memory units automatically. This is handled by the quantity package; I knew I would need something similar.

The Code.

Now that I knew how to do it, it was a matter of doing it. I am working in Python, so the first step was to find a Python package for doing the unit conversions.

This Stack Overflow question talked about Pint. Having a look, I was impressed by the documentation; the code is maintained and actively developed. But the best bit is that it also allows domain-specific units to be defined.

Brilliant! Since the Kubernetes resource units are unknown to Pint, I added the following unit definitions (CPU units scale by powers of 1000, memory units by powers of 1024):

# kubernetes memory units
kmemunits = 1 = [kmemunits]
Ki = 1024 * kmemunits
Mi = 1024 * Ki
Gi = 1024 * Mi
Ti = 1024 * Gi
Pi = 1024 * Ti
Ei = 1024 * Pi
# kubernetes cpu units
kcpuunits = 1 = [kcpuunits]
m = 1/1000 * kcpuunits
k = 1000 * kcpuunits
M = 1000 * k
G = 1000 * M
T = 1000 * G
P = 1000 * T
E = 1000 * P
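Saved to a file (the name kubernetes_units.txt is just my choice), these definitions can be loaded into a Pint unit registry, after which quantities given in different Kubernetes units can be added directly:

import pint

# load the custom Kubernetes units defined above
ureg = pint.UnitRegistry()
ureg.load_definitions("kubernetes_units.txt")
Q_ = ureg.Quantity

print(Q_("512Mi") + Q_("1Gi"))   # memory values in mixed units are converted automatically
print(Q_("250m") + Q_("750m"))   # CPU millicores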

So I had my unit handling; now for the Kubernetes part.

As the Kubernetes API wrapper, I am using the official Python package. The documentation is not that great, since it’s generated, but at least the entire API is covered. That is also why I was fairly certain there wasn’t a direct API call for what I was looking for.
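The snippets below assume a configured CoreV1Api client. A minimal setup sketch might look like this (inside the cluster the pod's service-account credentials are used, otherwise the local kubeconfig):

from kubernetes import client, config

try:
    # running inside the cluster
    config.load_incluster_config()
except config.ConfigException:
    # running outside the cluster, e.g. during development
    config.load_kube_config()

core_v1 = client.CoreV1Api()   # the client referred to as k8s_api / core_v1 below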

So the basic plan is: obtain the list of nodes, obtain each node's pods, loop through the containers of the individual pods and collect all resource requests and limits. This means there are three nested loops: over the nodes of the cluster, over the pods of a node, and over the containers of a pod.

In combination with the pod field selector, this then became:

for node in k8s_api.list_node().items:
    node_name = node.metadata.name
    # only pods that are (or will be) consuming resources on this node
    fs = ("status.phase!=Succeeded,status.phase!=Failed," +
          "spec.nodeName=" + node_name)
    pods = k8s_api.list_pod_for_all_namespaces(field_selector=fs)
    for pod in pods.items:
        for container in pod.spec.containers:
            pass  ## obtain resources (see below)

What is not shown is that all resources are appended to separate arrays for later summation.

A nice thing about the Pint library is that it does everything with quantities: you can sum, divide, multiply and subtract them. So it really became a matter of storing the values for each resource type in a separate array and then doing a sum(array) to get the total for that resource type.

For example, obtaining the total CPU requests across a node's pods then became:

cpureqs = []
Q_ = ureg.Quantity   # the registry with the Kubernetes units loaded (see above)
for pod in pods.items:
    for container in pod.spec.containers:
        cpureqs.append(Q_(container.resources.requests["cpu"]))
pod_cpu_requests = sum(cpureqs)

So now I had the basics; it was only a matter of extending this to all resources, both requests and limits, and grouping the results together.

Nothing is perfect, so it took a few iterations to get the code working, but eventually it gave me the same results as kubectl describe. Things I ran into: first, containers that don’t define resources; second, using the wrong resource information of a node; and third, pagination, which I wanted to avoid when calling the API.

The Bugs.

First bug: I forgot to handle containers that define no resources, or only some of them. This might be bad practice, but it is possible, and missing values should be treated as zero. Missing values caused a KeyError; to avoid that, Python has defaultdict, which provides a default value for keys that aren't defined:

from collections import defaultdict

reqs = defaultdict(lambda: 0, container.resources.requests or {})
reqs["cpu"]  # either the defined value or zero, but no KeyError

Second bug: Kubernetes nodes expose two resource sets: capacity and allocatable. It is important to use allocatable for all calculations; allocatable is capacity minus the resources reserved for Kubernetes and the system.
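Both sets are available on the node objects from the loop above as plain dictionaries of strings (the values here are illustrative):

# e.g. {"cpu": "4", "memory": "16423940Ki", "pods": "110"}
print(node.status.capacity)      # the raw size of the machine
print(node.status.allocatable)   # what is left after the kubelet/system reservations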

Third bug: the API only returns a limited number of pods per call. I didn’t want to paginate over the API, so I set the limit to a value above the number of pods allocatable on the node. This ensures I get all pods with one call:

allocatable = node.status.allocatable
max_pods = int(int(allocatable["pods"]) * 1.5)
pods = core_v1.list_pod_for_all_namespaces(limit=max_pods,
                                           field_selector=fs)
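Putting the pieces together, a condensed sketch of the per-node calculation (reusing the unit registry ureg/Q_ and the core_v1 client from above; error handling and some edge cases are omitted) looks roughly like this:

from collections import defaultdict

for node in core_v1.list_node().items:
    allocatable = node.status.allocatable
    max_pods = int(int(allocatable["pods"]) * 1.5)
    fs = ("status.phase!=Succeeded,status.phase!=Failed," +
          "spec.nodeName=" + node.metadata.name)
    pods = core_v1.list_pod_for_all_namespaces(limit=max_pods,
                                               field_selector=fs)

    # group the Pint quantities by (resource type, requests/limits)
    by_type = defaultdict(list)
    for pod in pods.items:
        for container in pod.spec.containers:
            for kind, values in (("requests", container.resources.requests or {}),
                                 ("limits", container.resources.limits or {})):
                for rtype in ("cpu", "memory"):
                    if rtype in values:   # a missing value simply contributes nothing
                        by_type[(rtype, kind)].append(Q_(values[rtype]))

    # compare the summed requests and limits against what the node can allocate
    print(node.metadata.name)
    for rtype in ("cpu", "memory"):
        print(f"  {rtype}: allocatable={Q_(allocatable[rtype])}"
              f" requests={sum(by_type[(rtype, 'requests')])}"
              f" limits={sum(by_type[(rtype, 'limits')])}")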

For brevity, the entire Python code is available on GitHub. So now I’m one step closer to my goal of automatically scaling my application. Whether I should be doing this with the Kubernetes autoscaler is a different question!

The Learning.

Sometimes it’s hard to know when to write your own code and when a little bit more ddg’ing provides an off-the-shelf solution. I was hesitant to write this code since I was sure that the API must provide something for this. Looking at the kubectl source code convinced me that wasn’t the case.

Unit conversion was something that I definitely wasn’t going to code myself, and I was very happy to have found the Pint package. It made this part of the task extremely concise.
