QoS tiers #147
Would QoS tiers be specified via labeling, or are they orthogonal to labeling (I'm assuming orthogonal given the presentations, just want to make sure)? Do you see a need for feedback from a minion about its current resource load to the scheduler, and if so, is that a backchannel (a formal event bus) concept that is part of Kubernetes, or is it encapsulated behind the scheduler interface?
Caveat: I'm still coming up to speed on everything, but that won't stop me from weighing in: QoS and administrative or config information should never be labels.
@smarterclayton: I agree with @thockin. QoS specifications need to be first-class fields in our API, together with resource specifications (#168). For one possible example of what this could look like, see lmctfy: https://github.com/google/lmctfy
How do QoS tiers relate to resource requests?
At the node level, QoS would be specified together with resource requests to the execution layer, such as an exec driver over lmctfy (which might eventually be layered on libcontainer). At the scheduling level, a variety of policies are possible. We could start with the naive approach of not considering QoS in scheduling. Eventually we could support some kind of overcommitment for lower QoS tiers.
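As a rough illustration of that split, here is a minimal sketch (in Go) of an execution-layer boundary where the QoS tier travels alongside the resource request; the type and method names are hypothetical and not taken from the real Kubernetes or lmctfy APIs:

```go
// Hypothetical execution-layer types: the QoS tier is carried with the
// resource request down to the driver (e.g. an lmctfy- or libcontainer-backed
// implementation), which would translate it into cgroup settings.
package exec

type QoSTier int

const (
	Guaranteed QoSTier = iota // resources reserved; strongest isolation
	Burstable                 // may borrow spare capacity; weaker guarantees
	BestEffort                // no reservation; first to be throttled or evicted
)

type ResourceRequest struct {
	CPUMilli    int64   // requested CPU in millicores
	MemoryBytes int64   // requested memory in bytes
	Tier        QoSTier // requested QoS tier
}

// Driver is the boundary the kubelet would call into; the scheduler could
// initially ignore Tier entirely, as described above.
type Driver interface {
	CreateContainer(name string, req ResourceRequest) error
}
```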
Another example where we'd like differentiated QoS: Pods/containers with specified memory limits should receive preferential treatment compared to "best effort" pods/containers that don't, such as by setting their OOM adjust scores. /cc @dchen1107
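For the OOM-score case, a minimal sketch of the mechanism (not the actual kubelet code; the function name and score values below are made up) could look like this:

```go
// Sketch: give "best effort" containers a higher oom_score_adj than containers
// with explicit memory limits, so the kernel's OOM killer prefers to reclaim
// best-effort work first.
package main

import (
	"fmt"
	"os"
)

const (
	limitedOOMScoreAdj    = -500 // containers with memory limits: protect them
	bestEffortOOMScoreAdj = 1000 // containers without limits: first to be killed
)

// applyOOMScoreAdj writes the chosen score to /proc/<pid>/oom_score_adj.
func applyOOMScoreAdj(pid int, hasMemoryLimit bool) error {
	score := bestEffortOOMScoreAdj
	if hasMemoryLimit {
		score = limitedOOMScoreAdj
	}
	path := fmt.Sprintf("/proc/%d/oom_score_adj", pid)
	return os.WriteFile(path, []byte(fmt.Sprintf("%d", score)), 0644)
}
```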
Current thinking for basic tiers:
We could also use core allocation, oom score adjustment, and other mechanisms to strengthen this basic protection.
@bgrant0607 @derekwaynecarr would there be quota at each tier?
@erictune Yes.
We will need to name the tiers/priorities so they can be referred to in the quota file.
Is this post-1.0? I want to know if I need to reserve time to handle quota-related fallout.
Yes, post-1.0.
--cgroup_parent is in: moby/moby#11428
Shall we go ahead with changes to differentiate between jobs with and without limits? Essentially, introducing tier-2 guarantees for jobs with unspecified limits?
+1 for having a limit for all jobs without limits. We need to reserve resources for system daemons.
Discussed offline. The minimal protection for system daemons we have now is good enough for v1.0. We'll take up the rest post-1.0.
Sounds good. Will wait for the proposal.
Proposal is in #11713.
Note that this issue is about performance/resource QoS, NOT availability/reliability/disruption QoS. They are related, but not necessarily coupled.
@NELCY @AnanyaKumar However, I don't think it is necessary to introduce tiers/priority as a first-class concept to handle the case @NELCY mentions. You could also handle that scenario using vertical pod autoscaling. With vertical pod autoscaling, you initially create a pod with some request and limit, and then a control loop periodically looks at each pod it is supposed to manage and decides whether to increase, decrease, or leave alone the pod's request and limit. A typical vertical autoscaling policy would be to look at historical usage, take, say, the 95th percentile, add some padding to that, and call that the new request and limit. The case @NELCY mentions could be implemented with the following autoscaling rule:
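The specific rule for @NELCY's case isn't preserved in this thread, but a minimal sketch of the 95th-percentile-plus-padding policy described above might look like the following; the function name and the padding/floor parameters are hypothetical:

```go
// Sketch of a vertical-autoscaling recommendation: take the 95th percentile of
// historical usage, add a padding fraction, and never go below a floor.
package autoscale

import "sort"

// recommend returns a new request/limit given historical usage samples,
// a padding fraction (e.g. 0.2 for 20%), and a minimum value.
func recommend(usage []float64, padding, floor float64) float64 {
	if len(usage) == 0 {
		return floor
	}
	sorted := append([]float64(nil), usage...)
	sort.Float64s(sorted)
	// Index of the 95th-percentile sample (nearest-rank method).
	idx := int(0.95 * float64(len(sorted)))
	if idx >= len(sorted) {
		idx = len(sorted) - 1
	}
	rec := sorted[idx] * (1 + padding)
	if rec < floor {
		rec = floor
	}
	return rec
}
```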
I think it is better to keep the kubelet's and scheduler's view of resources as simple as possible, and to layer more complex policies as control systems on top of those. This approach has several advantages.
@AnanyaKumar @erictune I like the idea of a control loop on top, but it doesn't always give us completely elastic behavior. In my previous experience, with a control loop there was always a debate about how frequently to run it, especially for compressible resources. But I agree it has several advantages and gives us greater flexibility. For now, I am happy with support for Burstable pods. Thanks.
Superseded by more recent issues.
Kubernetes v1.12.7. The problem still exists. Pods are randomly dying. I call it the "Happy New Year Tree" effect in a Kubernetes cluster. Which recent issue should I look at?
We have described QoS tiers in several presentations and the necessary underlying mechanisms are supported by lmctfy:
http://www.slideshare.net/dotCloud/eric-brewer-dockercon-keynote
http://www.slideshare.net/vmarmol/containers-google
http://www.linuxplumbersconf.org/2011/ocw/system/presentations/705/original/Control_Loops_In_Userspace.pdf
https://github.com/google/lmctfy
Borg: https://research.google.com/pubs/pub43438.html
We should expose QoS tiers through the Kubernetes pod API.
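One hedged sketch of what exposing tiers through the pod API could look like: derive the tier from the resource fields the user already sets rather than adding a separate knob. The types below are simplified stand-ins, not the real Kubernetes API objects, though they roughly mirror the Guaranteed/Burstable/BestEffort classes Kubernetes later adopted:

```go
// Sketch: classify a container's QoS tier from its requests and limits.
package api

type Resources struct {
	RequestMilliCPU int64
	LimitMilliCPU   int64
	RequestMemory   int64
	LimitMemory     int64
}

type QoSTier string

const (
	Guaranteed QoSTier = "Guaranteed" // limits set and equal to requests
	Burstable  QoSTier = "Burstable"  // requests set below limits
	BestEffort QoSTier = "BestEffort" // nothing specified
)

// TierFor maps resource settings to a QoS tier.
func TierFor(r Resources) QoSTier {
	switch {
	case r.LimitMilliCPU == 0 && r.LimitMemory == 0 &&
		r.RequestMilliCPU == 0 && r.RequestMemory == 0:
		return BestEffort
	case r.LimitMilliCPU == r.RequestMilliCPU && r.LimitMemory == r.RequestMemory:
		return Guaranteed
	default:
		return Burstable
	}
}
```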