
AWS: We should consider updating/reusing cluster-autoscaler to support AWS #11935

Closed
justinsb opened this issue Jul 28, 2015 · 58 comments
Labels
priority/backlog Higher priority than priority/awaiting-more-evidence. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.

Comments

@justinsb
Member

We have an autoscaling group for the minions; we should consider enabling auto-scaling based on, e.g., CPU or a custom metric we publish.

@erictune
Member

A group of us have been discussing node autoscaling this week, including @bgrant0607 @vmarmol @davidopp @jszczepkowski @piosz @gmarek @mwielgus @wojtek-t (probably forgetting some people)

@erictune
Member

One thing we talked about was maybe layering the system like this:

  • Pod horizontal autoscaler scales up pod count using CPU as a signal, and maybe later custom metrics, such as http request rate, http latency, etc.
  • Node (horizontal) autoscaler adds nodes when pods are pending due to the scheduler not being able to find a place in the cluster for the pod (failed PodFitsResources check in scheduler). This assumes that pods set reasonable CPU and memory limits.
    So the Node autoscaler wouldn't look at CPU directly; it hears about it indirectly through pods being pending (see the sketch below).
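
A rough sketch of that layering as a control loop, with all names (NodeGroup, countUnschedulablePods, runNodeAutoscaler) illustrative rather than actual Kubernetes APIs:

```go
// Illustrative sketch of the proposed node-autoscaler layer; all names here
// are hypothetical, not real Kubernetes APIs.
package sketch

import "time"

// NodeGroup abstracts a cloud instance group (an AWS ASG or a GCE MIG).
type NodeGroup interface {
	Size() (int, error)
	SetSize(n int) error
}

// countUnschedulablePods would ask the API server how many pods are pending
// because the scheduler failed the PodFitsResources check; stubbed out here.
func countUnschedulablePods() int { return 0 }

func runNodeAutoscaler(group NodeGroup) {
	for range time.Tick(30 * time.Second) {
		if countUnschedulablePods() == 0 {
			continue // enough capacity; nothing to do
		}
		size, err := group.Size()
		if err != nil {
			continue
		}
		// Add one node; a real implementation would estimate how many nodes
		// the pending pods need and respect a configured maximum.
		_ = group.SetSize(size + 1)
	}
}
```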

@justinsb
Member Author

That makes a lot of sense to me. I would love to be involved in any discussions.

AWS auto-scaling groups (and I believe Google MIGs via autoscalers) allow for a quick-and-dirty version of this. Your approach is infinitely better, though I suspect it will take a little longer!

The fact that the scheduler will avoid overloading the cluster makes auto-scaling externally much less useful, so we would be in custom metric territory. Even then, I think that having the master node manage the instances will be a much better experience.

Maybe we could promote this interface out of pkg/cloudprovider/aws (currently used only for e2e tests):
https://github.com/GoogleCloudPlatform/kubernetes/blob/8d5a6b063c68b50e9e2e481c04c4cfec4fa57bde/pkg/cloudprovider/aws/aws.go#L147-L154
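
For reference, that interface is roughly the following shape (paraphrased from the linked revision, so treat the exact names as approximate):

```go
// Paraphrased from pkg/cloudprovider/aws at the linked revision; exact
// names and signatures are approximate.
type InstanceGroups interface {
	// ResizeInstanceGroup sets the instance group to a fixed size.
	ResizeInstanceGroup(instanceGroupName string, size int) error
	// DescribeInstanceGroup queries the cloud provider for the group's state.
	DescribeInstanceGroup(instanceGroupName string) (InstanceGroupInfo, error)
}
```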

@justinsb justinsb self-assigned this Jul 30, 2015
@erictune erictune added the sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. label Aug 5, 2015
@justinsb justinsb removed their assignment Aug 12, 2015
@mbforbes mbforbes added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Aug 16, 2015
@ecowan

ecowan commented Nov 19, 2015

Hi everyone, I too am very interested in seeing progress on this front. I would really appreciate it if someone could point me to any resources / pull requests that have been done. Thanks!

@satheessh

+1

@rafaljanicki

+1

@valery-zhurbenko

+1

@piosz
Member

piosz commented Feb 15, 2016

If anyone would like to integrate Kubernetes with an AWS autoscaler, I'm happy to share our experience integrating Kubernetes with the GCE autoscaler.

cc @fgrzadkowski @mwielgus

@piosz piosz added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Feb 15, 2016
@sstarcher

@piosz I would be interested in hearing about your experience with the Kubernetes GCE autoscaler, and I may be interested in helping with this feature.

@miguelfrde
Contributor

@piosz I would be interested in hearing about your experience and helping with this feature as well.

@jimmycuadra
Contributor

@piosz Yes, please! Very interested in this.

@dengshuan

Is there a schedule for this feature? Or is there any more detailed discussion about it?

@mwielgus
Contributor

For 1.3 we have a plan to revisit cluster autoscaling in Kubernetes and make it more user-friendly. At this moment we are discussing our 1.3 priorities and project assignments internally at Google. We will let you know once we reach agreement on the scope of the improvements that Google can deliver and on the integration plans for other cloud providers (we will definitely need community help there).

@sstarcher

Our AWS scaling strategy for Kubernetes currently has 3 parts:

  • Add instances when pods are Pending (a rough sketch of the ASG call follows this list)
  • Remove instances that are not running pods
  • A change to the scheduler to pack our load instead of spreading it
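
For the first part, a minimal sketch of bumping an ASG's desired capacity with aws-sdk-go; the group name "k8s-minions" is a placeholder, and credentials are assumed to come from the default chain:

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/autoscaling"
)

func main() {
	sess := session.Must(session.NewSession())
	svc := autoscaling.New(sess)

	// Raise the desired capacity; a real controller would read the current
	// size first and clamp to the group's configured maximum.
	_, err := svc.SetDesiredCapacity(&autoscaling.SetDesiredCapacityInput{
		AutoScalingGroupName: aws.String("k8s-minions"), // placeholder name
		DesiredCapacity:      aws.Int64(4),
		HonorCooldown:        aws.Bool(true),
	})
	if err != nil {
		log.Fatal(err)
	}
}
```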

@justinsb justinsb added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Apr 1, 2016
@justinsb justinsb added this to the v1.3 milestone Apr 1, 2016
@apobbati

@pbitty Have you made any progress on this issue? I'd like to help any way I can.

@philk

philk commented Aug 24, 2016

#1377 might be what you're looking for


@bjoernhaeuser

@philk I think the mentioned PR is not what we are looking for. Is there a typo or something similar?

@andrewsykim
Member

kubernetes-retired/contrib#1311 is probably the link you are looking for. It references other PRs that have been opened regarding the cluster autoscaler for AWS.

@philk

philk commented Aug 24, 2016

Oh, yeah, I was on mobile and didn't realize which repo I was in. kubernetes-retired/contrib#1377 was what I meant. (Though #1311 above is useful too.)

@aliakhtar

What's the status of this feature? I came across this blog: http://blog.kubernetes.io/2016/07/autoscaling-in-kubernetes.html which said AWS autoscaling would be coming in 1.3. The current stable version is 1.3.6, but I can't find any info on this.

The AWS getting started doc says the max / desired instances in the AWS auto scaling group can be set, but do new AWS instances auto-register themselves?

@fgrzadkowski
Contributor

This blog post was released after 1.3 and said that AWS support would be ready soon. AFAIK that's already the case.

@mwielgus Can you please verify? Are there instructions for how to set it up? Have we released the image? Does it require Kubernetes 1.4, or is it just a matter of starting a different add-on?

@andrewsykim
Member

There's a README here. I don't think an official image was made so you would have to fork the contrib repo and build/push the image yourself for now.

@btdlin

btdlin commented Sep 26, 2016

New to the thread, trying to set up auto-scaling with k8s on AWS. Is this supported now? I checked the README, but I'm not sure exactly what needs to be done to build/push the image. Any update would be really appreciated. Thanks.

@andrewsykim
Member

@btdlin you have to build your own Docker image at whatever revision added cluster autoscaler support for AWS and push it to your own registry. If you don't want to do that, my company has published a public image for our own use cases which has AWS support: wattpad/cluster-autoscaler:v1.1.

@jimmycuadra
Contributor

Is there going to be an official image for the autoscaler? Why make people build it for themselves?

@btdlin

btdlin commented Sep 26, 2016

Thanks @andrewsykim. Looks like v1.4 was just released a few hours ago; do we know if the AWS autoscaler is included in v1.4?

@andrewsykim
Member

@jimmycuadra yes, I believe there is an official Docker image already; we just didn't know if the published one supported AWS as a cloud provider, so we built our own.

@fgrzadkowski
Contributor

@mwielgus Can we make sure that the cluster autoscaler image is released to an official repo? And I think we should close this issue now, as we support AWS :)

@danbeaulieu

@fgrzadkowski Hi, I am very much interested in this feature but I find the lack of documentation to be an issue. The README leaves a bit to be desired.

  • How are instances scaled in? I.e., is there any rhyme or reason to which instance is picked?
  • Is it possible to have heterogeneous instance types in the cluster?
  • What metrics can I use to scale on? CPU usage? Container count? Etc.

I am a heavy AWS user but new to Kubernetes if that helps understand the audience.

@jimmycuadra
Contributor

Once there is an official image for it, let's make sure the docs for the autoscaler mention where it is!

@fgrzadkowski
Contributor

We already have a PR in flight for better documentation - kubernetes-retired/contrib#1731

@mwielgus I think that to improve documentation we will also need:

@mwielgus Can we close this issue as fixed?

@andyxning
Member

@erictune @sstarcher Does monitoring Pending pods mean that we can use the InsufficientCPU or InsufficientMemory events to get the same result, adding new nodes to the cluster based on those events?

These two event types are emitted when pods cannot be scheduled because the required resources (CPU/memory) cannot be fulfilled.

@fgrzadkowski
Contributor

Quick comment: events were not designed to be an API that other components depend on. That's why we added the Scheduled pod condition with reason Unschedulable.

@andyxning
Member

andyxning commented Nov 22, 2016

@fgrzadkowski IIUC, you mean that events are not reliable and are not designed to be depended on for usage like this. The most reliable way is to query the pod info and check the Scheduled condition of the pod status.

After reading the source code, it seems that the scheduler emits a FailedScheduling event before updating the pod status.

@fgrzadkowski
Contributor

The scheduler will emit events, but they are not considered part of the API for other components.

Yes, you should just check the pod condition, which is part of PodStatus.
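
A minimal sketch of that check using present-day client-go (clientset construction omitted; the function name is illustrative):

```go
package sketch

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// unschedulablePods returns pods whose PodScheduled condition is False with
// reason Unschedulable, i.e. the scheduler could not find a node for them.
func unschedulablePods(ctx context.Context, client kubernetes.Interface) ([]corev1.Pod, error) {
	pods, err := client.CoreV1().Pods(metav1.NamespaceAll).List(ctx, metav1.ListOptions{
		// Only pods that have not been bound to a node yet.
		FieldSelector: "status.phase=Pending",
	})
	if err != nil {
		return nil, err
	}
	var out []corev1.Pod
	for _, pod := range pods.Items {
		for _, cond := range pod.Status.Conditions {
			if cond.Type == corev1.PodScheduled &&
				cond.Status == corev1.ConditionFalse &&
				cond.Reason == corev1.PodReasonUnschedulable {
				out = append(out, pod)
			}
		}
	}
	return out, nil
}
```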

@motymichaely

Hey team, is there a k8s version in which this feature is aimed to be released? Any suggestions for implementing this with an AWS ASG + custom metrics?

@mwielgus
Contributor

The current version of Cluster Autoscaler (0.4.0) supports AWS ASGs. Closing the issue.

@mwielgus
Contributor

BTW, Cluster Autoscaler is not driven by metrics but rather by the real need for a new node because some pods cannot be scheduled.
