Kubelet: Pod level Resource Management #26751
Conversation
Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test". This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry. Otherwise, if this message is too spammy, please complain to ixdy.
ok to test
cc @kubernetes/sig-node
We have the following resource parameters for the `Bu` cgroup.

'/Bu/cpu.shares = summation of all cpu requests
nit: ' typo
@vishh @dchen1107 - when we discussed this in our F2F, we had also discussed that `Bu` would be a child of `G`? Any reason for the change, or am I recalling incorrectly?
NM, I remember we discussed both flat and hierarchical proposals, and this is presenting the flat.
We decided to go with a flat hierarchy in the end.
It would be good if this proposal could give an outline of the entire system cgroup hierarchy; for example, something along the lines of the sketch below. I think we also need a description of how we would, at minimum, enforce CPU shares for system-reserved and kube-reserved. Otherwise they will get the default shares value of 1024, which may or may not reflect the desired outcome.
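To make that concrete, here is a minimal sketch of what such an outline might look like, in the same notation as the proposal's snippets. The `system-reserved` and `kube-reserved` cgroup names and share values here are illustrative assumptions, not settings taken from the proposal:

```
/system-reserved/cpu.shares = value sized for OS daemons (assumed; the default would be 1024)
/kube-reserved/cpu.shares   = value sized for kubelet and runtime (assumed)
/Bu/cpu.shares              = summation of cpu requests of all Bu pods
/BE/cpu.shares              = 2
```

In the flat layout, Guaranteed (`G`) pod cgroups would sit alongside these at the top level.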
/BE/cpu.shares = 2
/BE/cpu.quota = not set
/BE/memory.limit_in_bytes = Allocatable - summation of memory requests/limits of guaranteed pods
Allocatable - summation of memory requests for G and Bu pods
So this will have an interesting side effect that we should clarify. The kubelet is more likely to OOM-kill existing BE pods as more G and Bu pods are scheduled, even if the G and Bu pods are using less than their requests, since we will have to dynamically reduce the size of the BE memory.limit_in_bytes. The existing eviction behavior will not cause those BE pods to be evicted, since they will not be able to hog memory.
Yeah. As `G` pods are added, `BE` pods are more susceptible to being OOM-killed. This policy attempts to provide the best performance for `G` pods. In the future, we can let users configure the level of over-commit.
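For illustration, here is a minimal Go sketch of the recomputation being discussed; the `Pod` type and `beMemoryLimit` function are hypothetical names, not the kubelet's actual API:

```go
// Hypothetical sketch: recompute the BE cgroup memory limit whenever the
// set of admitted pods changes. Not the actual kubelet implementation.
package main

import "fmt"

type Pod struct {
	QoSClass      string // "Guaranteed", "Burstable", or "BestEffort"
	MemoryRequest int64  // bytes
}

// beMemoryLimit returns Allocatable minus the memory requests of all
// Guaranteed and Burstable pods, per the suggestion above.
func beMemoryLimit(allocatable int64, pods []Pod) int64 {
	reserved := int64(0)
	for _, p := range pods {
		if p.QoSClass == "Guaranteed" || p.QoSClass == "Burstable" {
			reserved += p.MemoryRequest
		}
	}
	return allocatable - reserved
}

func main() {
	pods := []Pod{
		{QoSClass: "Guaranteed", MemoryRequest: 2 << 30}, // 2 GiB
		{QoSClass: "Burstable", MemoryRequest: 1 << 30},  // 1 GiB
	}
	// With 8 GiB allocatable, BE pods are capped at 5 GiB here.
	fmt.Println(beMemoryLimit(8<<30, pods))
}
```

Each newly scheduled `G` or `Bu` pod lowers the returned value, which is exactly why existing `BE` pods become more likely to exceed their (shrinking) limit and be OOM-killed.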
@derekwaynecarr Thanks a lot for your comments. Yeah, I too feel that a general cgroup hierarchy outline, similar to the one in the kubelet-systemd proposal, would be a really good idea.
Just realized I never followed up on this...
Yes, I will be adding a section highlighting the change. Something along the lines of this:
(force-pushed from e44bbd9 to ad8e2c2)
@davidopp PTAL at this patch - ad8e2c2. @derekwaynecarr This patch is ready to go, unless you have any more comments.
@davidopp @derekwaynecarr ping.
Don't wait on me -- based on #26751 (comment) it sounds like you've addressed my concern, and I won't have time to re-read the doc soon.
Signed-off-by: Buddha Prakash <buddhap@google.com>
(force-pushed from ad8e2c2 to e425c6b)
Reattaching LGTM after running hack/update-munge-docs.sh to modify the image warning URLs at the top.
GCE e2e build/test passed for commit e425c6b.
Automatic merge from submit-queue |
This proposal outlines our plan for improving resource management in Kubernetes by having a cgroup hierarchy with QoS- and pod-level cgroups.
This is the initial proposal, which broadly covers our goals and how we plan to achieve them. At this point we would really appreciate feedback from the community.
This is tied to the upstream issue #5671, so I would request @vishh @dchen1107 @bgrant0607 @jdef to PTAL.