kube-proxy in LocalModeNodeCIDR mode may cache stale Node.PodCIDR if the Node object is recreated #111321
Comments
/cc |
We have been watching the Node object since we implemented topology hints, so it should be easier to adapt the code to react to these changes. |
/triage accepted |
For iptables: kubernetes/pkg/proxy/iptables/proxier.go, lines 617 to 676 in 3c034ba
For ipvs: kubernetes/pkg/proxy/ipvs/proxier.go, lines 940 to 996 in 3c034ba |
/assign |
/reopen
The fix in #111344 partially solved this problem, but there is still a race in the code. When kube-proxy starts, the startup path runs through kubernetes/cmd/kube-proxy/app/server.go (line 569 in 7935006) and kubernetes/cmd/kube-proxy/app/server_others.go (lines 80 to 87 in 7935006), where the PodCIDR value is obtained.
However, the value obtained there is not the value used by kubernetes/cmd/kube-proxy/app/server.go (lines 325 to 331 and lines 754 to 759 in 7935006).
If anything changes between the point where the node informer is initialized and the point where the startup code reads the PodCIDR, the change goes unnoticed and kube-proxy keeps operating with an outdated value, which causes the routing issues described in this report. |
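One way to close a gap like this is to compare every Node event against the value captured at startup, rather than assuming the two reads agree. The following is only a minimal sketch of that idea: `podCIDRGuard`, `newPodCIDRGuard`, and `Stale` are hypothetical names and not kube-proxy's actual types, and the wiring to a real node informer is left out.

```go
package main

import "fmt"

// podCIDRGuard is a hypothetical helper (not a kube-proxy type). It remembers
// the PodCIDRs read once at startup so later Node events can be checked
// against them.
type podCIDRGuard struct {
	startup []string
}

// newPodCIDRGuard copies the startup value so later mutation of the caller's
// slice cannot change what the guard compares against.
func newPodCIDRGuard(startup []string) *podCIDRGuard {
	return &podCIDRGuard{startup: append([]string(nil), startup...)}
}

// Stale reports whether the PodCIDRs seen in a Node add/update event differ
// from the ones captured at startup. An informer delivers the current object
// as an add event when it first syncs, so a change that happened before the
// informer started would still be observed by this check.
func (g *podCIDRGuard) Stale(observed []string) bool {
	if len(observed) != len(g.startup) {
		return true
	}
	for i := range observed {
		if observed[i] != g.startup[i] {
			return true
		}
	}
	return false
}

func main() {
	// Assumed example values; real PodCIDRs come from the Node spec.
	guard := newPodCIDRGuard([]string{"10.0.1.0/24"})

	fmt.Println(guard.Stale([]string{"10.0.1.0/24"})) // same value as at startup
	fmt.Println(guard.Stale([]string{"10.0.2.0/24"})) // Node was recreated with a new CIDR
}
```

In a real handler, a `true` result would presumably lead to logging the mismatch and exiting, so that the process is restarted and picks up the new value. That matches the "reconfigure/restart" behavior requested in the issue description.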
@aojea: Reopened this issue. |
What happened?
When kube-proxy runs in LocalModeNodeCIDR mode, it reads the Node.PodCIDR value and uses it to determine whether or not traffic is local to the Node.
In certain cases (e.g. when kube-proxy runs before the Node information has been completely updated), it is possible for kube-proxy to read and store a stale PodCIDR value. This causes kube-proxy to route traffic incorrectly, because the local traffic determination is no longer correct.
Note: PodCIDR is immutable once set, but it is possible to delete and recreate the Node object to get into this state. (Yes, this can happen.)
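To make the failure mode concrete, here is a small sketch of the locality decision itself. It only models the comparison "is this source address inside the PodCIDR"; in the real proxiers this check is expressed through packet-matching rules, not a Go function. `isLocal` and the addresses are hypothetical and chosen for illustration.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isLocal is a hypothetical helper: it reports whether srcIP falls inside
// podCIDR, which is the question the NodeCIDR local-detection mode asks.
func isLocal(podCIDR, srcIP string) (bool, error) {
	prefix, err := netip.ParsePrefix(podCIDR)
	if err != nil {
		return false, fmt.Errorf("parse PodCIDR %q: %w", podCIDR, err)
	}
	addr, err := netip.ParseAddr(srcIP)
	if err != nil {
		return false, fmt.Errorf("parse source IP %q: %w", srcIP, err)
	}
	return prefix.Contains(addr), nil
}

func main() {
	const (
		staleCIDR   = "10.0.1.0/24" // assumed: cached before the Node was recreated
		currentCIDR = "10.0.2.0/24" // assumed: assigned to the recreated Node
		podIP       = "10.0.2.15"   // assumed: a pod running on this Node now
	)

	withStale, _ := isLocal(staleCIDR, podIP)
	withCurrent, _ := isLocal(currentCIDR, podIP)

	fmt.Println("local according to stale PodCIDR:  ", withStale)   // false
	fmt.Println("local according to current PodCIDR:", withCurrent) // true
}
```

With the stale value, traffic from a pod that really is on the Node is classified as non-local, which is the incorrect routing described above.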
/sig network
What did you expect to happen?
kube-proxy should detect this state and reconfigure/restart with the newly updated PodCIDR.
How can we reproduce it (as minimally and precisely as possible)?
Recreate the Node object with a different PodCIDR after kube-proxy has started on the Node.
Kubernetes version
Impacts all current versions that support LocalModeNodeCIDR.
Cloud provider