Lessandro

A website containing something I’ve been working on

20 Nov 2021

Debugging Kubernetes cluster pt.2

Debugging Kubernetes cluster — part1

Requirements

a. Install Krew

b. Install resource-capacity plugin

kubectl krew install resource-capacity

c. Install lineage plugin

kubectl krew install lineage

d. Install kail.

e. Install blame plugin

kubectl krew install blame
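
If Krew is already set up, the three plugin installs above can be run in one go. This is only a minimal sketch: `install_debug_plugins` is a made-up helper name, and it assumes `kubectl krew` is on your PATH. kail is not a Krew plugin here, so it still has to be installed separately.

```shell
# Sketch: install the three Krew plugins used in this post.
# install_debug_plugins is a hypothetical helper name, not part of Krew.
install_debug_plugins() {
  for plugin in resource-capacity lineage blame; do
    # Stop at the first plugin that fails to install.
    kubectl krew install "$plugin" || return 1
  done
}

# Usage: install_debug_plugins
```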

Troubleshooting commands

a. kubectl get events --field-selector type=Warning --all-namespaces

This command shows all Warning events across every namespace in the cluster.
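
On a busy cluster the list can get long, so it may help to sort it by time. The sketch below wraps the same command in a small function and adds kubectl's `--sort-by` flag; `warning_events` is a hypothetical name, and sorting by creation timestamp is just one reasonable choice.

```shell
# Sketch: list Warning events in all namespaces, oldest first.
# warning_events is a hypothetical helper name, not part of kubectl.
warning_events() {
  kubectl get events \
    --field-selector type=Warning \
    --all-namespaces \
    --sort-by=.metadata.creationTimestamp
}

# Usage: show only the most recently created warnings.
#   warning_events | tail -n 20
```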

b. kubectl get nodes -o wide --label-columns topology.kubernetes.io/zone

This command shows the status of every node, so we can spot nodes that are not in a “Ready” state, nodes running a different kubelet version, or nodes using a different container runtime. The extra zone column also shows whether the node problems are limited to a specific zone.
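
With many nodes it can be easier to filter the output than to read it. A minimal sketch, assuming the default column layout where STATUS is the second column; `not_ready_nodes` is a hypothetical helper name.

```shell
# Sketch: print only the nodes whose STATUS is not exactly "Ready".
# Cordoned nodes ("Ready,SchedulingDisabled") are printed too,
# which is usually worth knowing while troubleshooting.
not_ready_nodes() {
  kubectl get nodes --no-headers \
    --label-columns topology.kubernetes.io/zone |
    awk '$2 != "Ready"'
}

# Usage: not_ready_nodes
```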

c. kubectl resource-capacity --pods --util --sort cpu.util

Here we can see which pods are using more resources than they should, sorted by CPU utilization.

Note: the utilization data requires metrics-server to be installed in the cluster.

d. kubectl get all --show-labels

This command shows the pod-template-hash label, which lets us check that the hash on a pod matches the hash on the ReplicaSet created by our deployment.
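
The same comparison can be done without reading the label columns by eye. The sketch below reads the hash from one pod and then lists the ReplicaSets carrying the same hash; `rs_for_pod` is a hypothetical helper name, and it assumes the pod and its ReplicaSet are in the current namespace.

```shell
# Sketch: find the ReplicaSet that shares a pod's pod-template-hash.
# rs_for_pod is a hypothetical helper name.
rs_for_pod() {
  hash=$(kubectl get pod "$1" \
    -o jsonpath='{.metadata.labels.pod-template-hash}')
  if [ -z "$hash" ]; then
    # Pods that were not created through a deployment have no such label.
    echo "pod $1 has no pod-template-hash label" >&2
    return 1
  fi
  kubectl get replicasets -l "pod-template-hash=$hash" --show-labels
}

# Usage: rs_for_pod "$POD"
```

If this returns no ReplicaSet, the pod was probably created from an older template or not by the deployment at all.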

e. kubectl lineage pod ${POD}

kubectl lineage pod banana-app

The lineage plugin shows what created a resource in Kubernetes and what depends on it.

f. kail -n kube-system --since 1m

This command streams the logs from the last minute for every pod in a specific namespace, in this case kube-system.

g. kubectl get service -o wide

Here we can see the services in the current namespace, and for each one we can check its type, whether it has an external IP address or load balancer, and which label selectors it uses.

h. kubectl get endpointslices -o wide

This will show us each of our services in the namespace and which pod IP addresses are associated with that service.
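
When only one service is misbehaving, the list can be narrowed with the `kubernetes.io/service-name` label that Kubernetes sets on EndpointSlices. A minimal sketch; `slices_for_service` is a hypothetical helper name.

```shell
# Sketch: show only the EndpointSlices that back one service.
# slices_for_service is a hypothetical helper name.
slices_for_service() {
  kubectl get endpointslices -o wide \
    -l "kubernetes.io/service-name=$1"
}

# Usage: slices_for_service "$SERVICE"
```

An empty result usually means the service selector does not match any ready pod.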

i. kubectl port-forward deploy/$DEPLOYMENT $LOCAL_PORT:$POD_PORT

This command will let us bypass the load balancer or ingress controller to see if we can send traffic directly to one of the pods in the deployment.
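
To turn this into a quick check, the tunnel can be opened in the background, probed with curl, and closed again. This is a rough sketch under a few assumptions: the app answers plain HTTP on `/`, two seconds is enough for the tunnel to come up, and `probe_deployment` is a hypothetical helper name.

```shell
# Sketch: port-forward to a deployment, print the HTTP status code, clean up.
# probe_deployment is a hypothetical helper name.
probe_deployment() {
  deployment=$1
  local_port=$2
  pod_port=$3

  # Open the tunnel in the background and remember its PID.
  kubectl port-forward "deploy/$deployment" "$local_port:$pod_port" \
    >/dev/null 2>&1 &
  pf_pid=$!

  # Assumption: 2 seconds is enough for the tunnel to be ready.
  sleep 2

  # Print only the HTTP status code of the response.
  curl -sS -o /dev/null -w '%{http_code}\n' "http://localhost:$local_port/"
  status=$?

  # Close the tunnel whether or not the request worked.
  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$status"
}

# Usage: probe_deployment "$DEPLOYMENT" "$LOCAL_PORT" "$POD_PORT"
```

If the pod answers here but not through the load balancer or ingress, the problem is most likely in front of the pod rather than in it.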

j. kubectl blame pod $POD

kubectl blame pod banana-app

You can use it to see what parts of a pod manifest have changed and who or what changed them.

Hopefully, these commands can help you during the Kubernetes troubleshooting process.