
pod stuck with NodeAffinity status // using spot VMs under K8s 1.22.x and 1.23.x #112333

Closed
gillesdouaire opened this issue Sep 8, 2022 · 22 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.
needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.
sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@gillesdouaire

The same problem on 1.22.3-gke.700

Originally posted by @maxpain in #98534 (comment)

@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Sep 8, 2022
@k8s-ci-robot
Contributor

@gillesdouaire: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Sep 8, 2022
@gillesdouaire gillesdouaire changed the title The same problem on 1.22.3-gke.700 NodeAffinity pod on Spot VMs under 1.22.3-gke.700 Sep 8, 2022
@gillesdouaire
Author

gillesdouaire commented Sep 8, 2022

/sig node

We are aware this was supposed to be fixed as of K8s 1.21, but we are experiencing it in the same context under a newer K8s version. All pods on a given node are stuck with NodeAffinity status and remain so until deleted, after which they are re-scheduled. The node is ready and otherwise healthy.

k8s version: v1.22.12-gke.1200
spot VMs: enabled
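
For reference, a minimal sketch of how the affected pods can be listed (commands are illustrative, not from the original report; the grep relies on `NodeAffinity` showing up in the STATUS column as in the outputs later in this thread):

# List stuck pods together with the node they were assigned to (-o wide adds the NODE column).
kubectl get pods --all-namespaces -o wide | grep NodeAffinity

# If the stuck pods are terminal, they can also be narrowed down with a field selector:
kubectl get pods --all-namespaces --field-selector=status.phase=Failed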

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Sep 8, 2022
@gillesdouaire gillesdouaire changed the title NodeAffinity pod on Spot VMs under 1.22.3-gke.700 pod stuck with NodeAffinity status using VMs under 1.22.3-gke.700 Sep 8, 2022
@gillesdouaire gillesdouaire changed the title pod stuck with NodeAffinity status using VMs under 1.22.3-gke.700 pod stuck with NodeAffinity status // using spot VMs under K8s 1.22.x Sep 8, 2022
@pacoxu
Member

pacoxu commented Sep 13, 2022

BTW, End of Life for 1.22 is 2022-10-28.

How can we reproduce it? I cannot reproduce it by just restarting kubelet in my v1.24 cluster.

@gillesdouaire
Author

gillesdouaire commented Sep 15, 2022

@pacoxu In my case, I was able to reproduce the situation on our 1.22 GKE cluster by issuing a few `kubectl delete node ... --force` commands against our existing nodes hosted on spot VMs, each time waiting for a new node to respawn and stabilize before deleting again. It took 6 delete commands before pods stuck in the NodeAffinity state appeared.

As mentioned before, all the pods stuck in NodeAffinity are assigned to the same node.

Right now, I am leaving a few pods in that state, so if you need more details on the actual status of the workloads, let me know.
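
A rough sketch of that reproduction loop, for clarity (the spot-node label comes from the nodeSelector used later in this thread; node selection and waiting are illustrative):

# Pick one spot node, force-delete it, wait for GKE to respawn a replacement,
# then look for pods stuck with a NodeAffinity status. Repeat several times.
NODE=$(kubectl get nodes -l cloud.google.com/gke-spot=true -o name | head -n 1)
kubectl delete "$NODE" --force --grace-period=0

# ...once the replacement node has registered and become Ready:
kubectl get pods --all-namespaces | grep NodeAffinity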

@gillesdouaire
Author

@pacoxu Using the `kubectl delete node ... --force` approach, I was able to reproduce this on a Kubernetes cluster running 1.23.10.

Same behaviour: all the pods stuck in NodeAffinity are assigned to the same node and remain unready.

@gillesdouaire gillesdouaire changed the title pod stuck with NodeAffinity status // using spot VMs under K8s 1.22.x pod stuck with NodeAffinity status // using spot VMs under K8s 1.22.x and 1.23.x Sep 19, 2022
@pacoxu
Member

pacoxu commented Sep 20, 2022

"kube nodes delete... --force" approach

Does this mean that you delete a node and restart the kubelet several times?

  1. kubectl delete node --force
  2. restart kubelet on Node1

Can you share a sample pod yaml on that node with NodeAffinity state?

@gillesdouaire
Author

gillesdouaire commented Sep 21, 2022

Only the first step; once the node is force-deleted, Kubernetes respawns a new node, and the pods are then either reassigned correctly or fall into the NodeAffinity state.

The pods I had left in the NodeAffinity state have been flushed (the spot VMs were restarted), so I will need to regenerate a case; I will post a pod YAML here as soon as I have the data.

@gillesdouaire
Author

Good and/or bad news: I've seen the NodeAffinity status occur once under K8s 1.23, but now I have trouble reproducing.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 27, 2022
@jonpulsifer
Contributor

+1, after adding a new preemptible (not spot) node pool to a 1.23.13-gke.900 cluster and scheduling a deployment there I've also noticed this behaviour on the first couple preemptions

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 20, 2023
@pacoxu
Member

pacoxu commented Jan 28, 2023

+1, after adding a new preemptible (not spot) node pool to a 1.23.13-gke.900 cluster and scheduling a deployment there I've also noticed this behaviour on the first couple preemptions

@jonpulsifer Do you mind giving some more details on how to reproduce the issue?

Good and/or bad news: I've seen the NodeAffinity status occur once under K8s 1.23, but now I have trouble reproducing.

This is stuck for now, as I don't have stable steps to reproduce the issue in mind.

@SimSimY

SimSimY commented Jan 31, 2023

+1
This happens in our GKE cluster to about 5% of the pods that run on preemptible nodes.

1.23.14-gke.401/1.23.12-gke.100

@pacoxu
Member

pacoxu commented Mar 28, 2023

@SimSimY do you have some more details?

@vaibhavkhurana2018

vaibhavkhurana2018 commented Jun 13, 2023

This happens with 1.25.8-gke.500 as well.

Steps to reproduce:

  1. Create a cluster with a managed Spot node pool that has 1 node.
  2. Create an alpine deployment with 1 replica and a node selector on it (see the sketch manifest after the pod listing below):
      nodeSelector:
        cloud.google.com/gke-spot: "true"
  3. Add a pause or sleep to the container command.
  4. Simulate the node-maintenance event on the node the pod is scheduled on. Ref: https://cloud.google.com/compute/docs/instances/simulating-host-maintenance#gcloud
  5. Once the pod comes back up, one pod will be left dangling with NodeAffinity status and the other will be in Running state:
NAME                                  READY   STATUS         RESTARTS   AGE
spot-graceful-test-868b9dd54f-9sh7j   1/1     Running        0          23s
spot-graceful-test-ff77554fd-fv42r    0/1     NodeAffinity   0          3m36s
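
For reference, a minimal sketch of the manifest and maintenance simulation described in the steps above (the deployment name mirrors the pod listing; the image, labels, sleep duration, and INSTANCE_NAME/ZONE placeholders are illustrative):

# Steps 2-3: a 1-replica alpine deployment pinned to spot nodes, with a long sleep.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spot-graceful-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spot-graceful-test
  template:
    metadata:
      labels:
        app: spot-graceful-test
    spec:
      nodeSelector:
        cloud.google.com/gke-spot: "true"
      containers:
      - name: alpine
        image: alpine:3.18
        command: ["sleep", "3600000"]
EOF

# Step 4: simulate host maintenance on the spot VM backing the node the pod landed on
# (INSTANCE_NAME and ZONE are placeholders; see the Google Cloud doc linked above).
gcloud compute instances simulate-maintenance-event INSTANCE_NAME --zone=ZONE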

@shaneqld

I can confirm this also happens in our clusters running 1.25.8-gke.1000. It seems to happen when spot VMs are preempted, but only occasionally.

Here's sample events from kubectl describe on a pod stuck in NodeAffinity state:

Events:
  Type     Reason               Age                From                                   Message
  ----     ------               ----               ----                                   -------
  Warning  FailedScheduling     47m (x4 over 47m)  gke.io/optimize-utilization-scheduler  0/17 nodes are available: 1 Insufficient cpu, 1 Insufficient memory, 2 node(s) had untolerated taint {node.kubernetes.io/not-ready: }, 3 node(s) had untolerated taint {...}, 4 node(s) had untolerated taint {...}, 6 node(s) didn't match Pod's node affinity/selector. preemption: 0/17 nodes are available: 15 Preemption is not helpful for scheduling, 2 No preemption victims found for incoming pod.
  Normal   Scheduled            46m                gke.io/optimize-utilization-scheduler  Successfully assigned default/mypod-556d8c7486-xxxxx to mycluster--preemptible-a4fbde29-xxxx
  Warning  FailedMount          46m                kubelet                                MountVolume.SetUp failed for volume "istiod-ca-cert" : failed to sync configmap cache: timed out waiting for the condition
  Normal   Pulling              46m                kubelet                                Pulling image "docker.io/istio/proxyv2:1.xx.x"
  Normal   Pulled               45m                kubelet                                Successfully pulled image "docker.io/istio/proxyv2:1.xx.x" in 545.587643ms (41.513583798s including waiting)
  Normal   Created              45m                kubelet                                Created container istio-init
  Normal   Started              45m                kubelet                                Started container istio-init
  Normal   Pulling              45m                kubelet                                Pulling image "docker.io/istio/proxyv2:1.xx.x"
  Normal   Pulled               45m                kubelet                                Successfully pulled image "docker.io/istio/proxyv2:1.xx.x" in 479.028778ms (13.148100799s including waiting)
  Normal   Created              45m                kubelet                                Created container istio-proxy
  Normal   Started              45m                kubelet                                Started container istio-proxy
  Warning  ExceededGracePeriod  45m                kubelet                                Container runtime did not kill the pod within specified grace period.
  Warning  NodeAffinity         43m                kubelet                                Predicate NodeAffinity failed
  Warning  FailedMount          43m (x6 over 43m)  kubelet                                MountVolume.SetUp failed for volume "istiod-ca-cert" : object "default"/"istio-ca-root-cert" not registered
  Warning  FailedMount          43m (x5 over 43m)  kubelet                                MountVolume.SetUp failed for volume "kube-api-access-8bs7r" : object "default"/"kube-root-ca.crt" not registered  

@c4talyst

Same experience on v1.25.10-gke.1400; lots of NodeAffinity pods after spot nodes are preempted.

This was also happening on 1.24.13-gke.2500, and we upgraded to attempt to reduce the noise.

Google says this is 'fixed' in 1.25.7-gke.1000 or later (https://cloud.google.com/kubernetes-engine/docs/release-notes#April_14_2023), but it's not.

Sliced screenshot of the equivalent of `kubectl get po,no` (screenshot omitted).

@austinpray-mixpanel

Still able to reproduce on control plane 1.25.12-gke.500 / preemptible nodepool 1.25.12-gke.500


NodeAffinity status pods look like:

status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: null
    message: 'Pod has become Healthy in NEG "Key{\"(snip)\",
      zone: \"us-central1-b\"}" attached to BackendService "Key{\"(snip)\"}".
      Marking condition "cloud.google.com/load-balancer-neg-ready" to True.'
    reason: LoadBalancerNegReady
    status: "True"
    type: cloud.google.com/load-balancer-neg-ready
  message: Pod Predicate NodeAffinity failed
  phase: Failed
  reason: NodeAffinity

@bobbypage
Member

Are you seeing that all the pods stuck in the NodeAffinity status have phase: Failed? If so, those pods are terminal and will not be started on any new nodes. Since those pods are terminal, they shouldn't affect anything -- the pod garbage collector will eventually clean them up, or they can be cleaned up manually.

GKE has a fix for this issue that automatically cleans up terminal pods on VM preemption; it is available from control plane version 1.27.2-gke.1800+. Please try that out as the long-term fix. Since this is a GKE-specific issue, please reach out to GKE support if you continue to have issues. Thanks!
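
For anyone cleaning these up manually in the meantime, a rough sketch (NAMESPACE and POD_NAME are placeholders; note the delete targets all Failed pods in the namespace, not only those that failed with NodeAffinity):

# Inspect a stuck pod's terminal status first:
kubectl get pod POD_NAME -n NAMESPACE -o jsonpath='{.status.phase}{" "}{.status.reason}{"\n"}'

# Then delete terminal (Failed) pods left behind after preemption:
kubectl delete pods -n NAMESPACE --field-selector=status.phase=Failed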

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 20, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 21, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot k8s-ci-robot closed this as not planned Won't fix, can't repro, duplicate, stale Apr 20, 2024
@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
