Cannot create pods while GKE upgrades the node version

Dounpct
3 min read · Jan 14, 2023


Yesterday the development team told me that some applications on our GKE cluster were not working properly. After checking for a while, I found that many pods were stuck in the “ContainerCreating” status while the GKE cluster was upgrading its nodes. I tried to delete the pods, but they got stuck in the “Terminating” status.
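To spot pods in this state quickly, a filter like the one below can help. It is only a sketch: stuck_pods is a hypothetical helper name (not a kubectl command), and it assumes the default column layout of kubectl get pods --all-namespaces, where STATUS is the fourth column.

```shell
# Hypothetical helper: keep the header line plus any pod whose STATUS
# column (4th column with --all-namespaces) is ContainerCreating or Terminating.
stuck_pods() {
  awk 'NR == 1 || $4 == "ContainerCreating" || $4 == "Terminating"'
}

# Usage against a live cluster (needs kubectl and cluster access):
#   kubectl get pods --all-namespaces | stuck_pods
```

Because the helper only reads standard input, it can also be run on saved output from an earlier kubectl call.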

Our cluster has 3 nodes and runs many applications, such as ArgoCD, NATS, EcpRouter, Keycloak, Prometheus, many Prometheus exporters, and so on. Some applications worked well and some did not.

We waited about 1 hour for the node upgrade to complete, but the pods in the ContainerCreating and Terminating statuses were still stuck.

After investigating a little longer, I found that the 3 new nodes were ready, but all of the problem pods were on the 3rd node only.
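One way to confirm that the stuck pods share a node is to list the pods scheduled on each node in turn. This is a sketch rather than the exact commands used during the incident: pods_on_node is a hypothetical helper name, and the node name passed to it is a placeholder.

```shell
# Hypothetical helper: list every pod scheduled on the given node,
# using the spec.nodeName field selector.
pods_on_node() {
  kubectl get pods --all-namespaces -o wide --field-selector "spec.nodeName=$1"
}

# Usage (node names come from "kubectl get nodes"):
#   kubectl get nodes
#   pods_on_node <node-name>
```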

I tried to force delete a pod with:

kubectl delete pod/nats-2 --grace-period=0 --force -n nats-prod

The pod was deleted and a new pod was created, but it landed on the 3rd node again and was still stuck in the “ContainerCreating” status.
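When a pod stays in ContainerCreating, its events usually say what it is waiting for. The sketch below shows one way to look at them. It was not part of the original troubleshooting, why_stuck is a hypothetical helper name, and the pod and namespace in the usage line are the ones from the command above.

```shell
# Hypothetical helper: show the pod description (including its Events
# section) and then the namespace events that refer to that pod.
why_stuck() {
  pod="$1"
  namespace="$2"
  kubectl describe pod "$pod" --namespace "$namespace"
  kubectl get events --namespace "$namespace" --field-selector "involvedObject.name=$pod"
}

# Usage:
#   why_stuck nats-2 nats-prod
```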

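Finally, I decided to create a 4th node and cordon the 3rd node. After cordoning the 3rd node I waited a long time, but the pods in the “ContainerCreating” status did not move to another node. Cordoning only stops new pods from being scheduled onto a node; it does not move pods that are already assigned to it. So I force deleted all pods in the “ContainerCreating” status again, and this time it looked good: the new pods were created on the new node and every pod reached the Running status.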
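The cordon and bulk force delete can be sketched as follows. This is an illustration under assumptions, not the exact commands from the incident: force_delete_creating_pods is a hypothetical helper name, the node name is a placeholder, and the filter assumes STATUS is the fourth column of kubectl get pods --all-namespaces --no-headers.

```shell
# Hypothetical helper: read "kubectl get pods --all-namespaces --no-headers"
# output on stdin and force delete every pod whose STATUS is ContainerCreating.
force_delete_creating_pods() {
  awk '$4 == "ContainerCreating" { print $1, $2 }' |
    while read -r namespace pod; do
      # Redirect stdin so kubectl cannot consume the remaining pod list.
      kubectl delete pod "$pod" --namespace "$namespace" --grace-period=0 --force < /dev/null
    done
}

# Usage:
#   kubectl cordon <third-node-name>
#   kubectl get pods --all-namespaces --no-headers | force_delete_creating_pods
```

Force deletion skips graceful shutdown, so it is best limited to pods that are already known to be stuck.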

After everything was working well, I deleted the 3rd node, which I believe was the one with the problem.
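The post does not say how the node was deleted. One possible way with kubectl is sketched below, where retire_node is a hypothetical helper name and the node name is a placeholder. Note the assumption: this only evicts pods and removes the Node object from Kubernetes; on GKE the underlying VM belongs to the node pool, so it may have to be removed separately.

```shell
# Hypothetical helper: evict the remaining pods from a node, then remove
# its Node object from the cluster if the drain succeeded.
retire_node() {
  node="$1"
  # DaemonSet pods are left in place; emptyDir data on evicted pods is discarded.
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data &&
    kubectl delete node "$node"
}

# Usage:
#   retire_node <third-node-name>
```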

I hope this experience helps someone.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — —

Credit : TrueDigitalGroup

— — — — — — — — — — — — — — — — — — — — — — — — — — — — —

Written by Dounpct

I work for TrueDigitalGroup on the DevOps x Automation Team.
