
AWS: We should run the master in an autoscaling group of size 1 #11934

Closed
justinsb opened this issue Jul 28, 2015 · 16 comments
Assignees
justinsb
Labels
  • priority/awaiting-more-evidence: Lowest priority. Possibly useful, but not yet enough support to actually get it done.
  • sig/autoscaling: Categorizes an issue or PR as relevant to SIG Autoscaling.
  • sig/cluster-lifecycle: Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.

Comments

@justinsb
Member

This will provide automatic relaunch in case of failure.
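
For illustration, a rough boto3 sketch of the shape of this; all names, IDs, and the region below are placeholders, not anything that exists today:

```python
# Sketch only: a launch configuration plus an ASG pinned to
# min = max = 1, so a failed master is automatically replaced.
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

autoscaling.create_launch_configuration(
    LaunchConfigurationName="k8s-master-lc",   # placeholder name
    ImageId="ami-00000000",                    # placeholder master AMI
    InstanceType="m3.medium",
)

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="k8s-master-asg",
    LaunchConfigurationName="k8s-master-lc",
    MinSize=1,
    MaxSize=1,
    DesiredCapacity=1,
    # Pin to a single AZ so a replacement instance can reattach the
    # master's EBS volume (EBS volumes are AZ-local).
    AvailabilityZones=["us-east-1a"],
)
```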

@roberthbailey
Contributor

How does AWS handle mounting persistent disks to instances in an autoscaling group? Also, what about health checks (you also want to re-launch the VM if the VM is running but the apiserver is down)?
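
For the second half, one option might be an ELB health check against the apiserver, so the ASG replaces an instance whose VM is up but whose apiserver is down. A rough boto3 sketch; the ELB and ASG names are made up, and this assumes a classic ELB already fronts the master:

```python
# Sketch only: make the ASG fail instances on apiserver health,
# not just EC2 instance status checks.
import boto3

elb = boto3.client("elb", region_name="us-east-1")
autoscaling = boto3.client("autoscaling", region_name="us-east-1")

elb.configure_health_check(
    LoadBalancerName="k8s-master-elb",   # placeholder, assumed to exist
    HealthCheck={
        "Target": "HTTPS:443/healthz",   # apiserver health endpoint
        "Interval": 10,
        "Timeout": 5,
        "UnhealthyThreshold": 2,
        "HealthyThreshold": 2,
    },
)

autoscaling.attach_load_balancers(
    AutoScalingGroupName="k8s-master-asg",
    LoadBalancerNames=["k8s-master-elb"],
)

autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="k8s-master-asg",
    HealthCheckType="ELB",          # replace on failed ELB health checks
    HealthCheckGracePeriod=300,     # give the master time to boot first
)
```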

@roberthbailey added the area/platform/aws and sig/cluster-lifecycle labels on Jul 28, 2015
@jboelter

Are the files/configuration that need to survive termination in a known location?

We could create an EBS volume and mount it in the master instance. Alternatively, I think the same idea would work, but it would need to be the boot volume.

@iterion
Contributor

iterion commented Jul 29, 2015

@jboelter We put all of the config that needs to survive on an EBS volume that is mounted to the master when it is initially created (not the boot volume, but a second disk that has the essential info placed on it).

@roberthbailey We can mount a blank disk or a snapshot of a disk, but I don't think there is any way for the ASG to know to remount the disk that was used previously.

For this to come back up with the correct data we could run a script when the instance starts. That script would make some AWS API calls to try to find an existing EBS volume for the master and remount it. @justinsb might have some better solution in mind though :)
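
Roughly something like this (a sketch only; the tag keys are invented for illustration):

```python
# Rough sketch of a boot-time script: find the cluster's tagged master
# volume in this instance's AZ and reattach it.
import urllib.request
import boto3

def metadata(path):
    # EC2 instance metadata service
    url = "http://169.254.169.254/latest/meta-data/" + path
    return urllib.request.urlopen(url).read().decode()

instance_id = metadata("instance-id")
az = metadata("placement/availability-zone")   # e.g. "us-east-1a"

ec2 = boto3.client("ec2", region_name=az[:-1])

volumes = ec2.describe_volumes(Filters=[
    {"Name": "tag:KubernetesCluster", "Values": ["my-cluster"]},  # invented tags
    {"Name": "tag:k8s.io/role", "Values": ["master"]},
    {"Name": "availability-zone", "Values": [az]},
])["Volumes"]

assert len(volumes) == 1, "expected exactly one tagged master volume"

ec2.attach_volume(
    VolumeId=volumes[0]["VolumeId"],
    InstanceId=instance_id,
    Device="/dev/xvdf",
)
ec2.get_waiter("volume_in_use").wait(VolumeIds=[volumes[0]["VolumeId"]])
# ...then mount the filesystem and start the master components.
```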

@jboelter

@iterion perfect -- the ASG has an associated LaunchConfiguration that specifies the details. We should be able to reference a known volume id created beforehand. This assumes there are no race conditions where the volume is still in use after termination while the new instance is being created.

Edit: It appears that the AutoScaling EBS type doesn't allow for a volume id (which would only make sense for an ASG of size 1) -- mounting with an init script may be the way to go. We should still be able to use a well-known volume id, though.

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-as-launchconfig-blockdev-template.html

@iterion
Contributor

iterion commented Jul 29, 2015

@jboelter Interesting, I can't find where to specify the volume id when creating a launch configuration; perhaps I'm looking in the wrong place. It looks as if you can specify a BlockDeviceMapping, and on that mapping there is a way to configure an EBS volume, but it only lets you specify a snapshot id.

FYI - I'm looking here: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-as-launchconfig-blockdev-mapping.html#cfn-as-launchconfig-blockdev-mapping-ebs
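
Sketched in boto3 terms for illustration (placeholder names): the Ebs entry accepts a SnapshotId, but there is no key at all for an existing volume id.

```python
# Sketch: the EBS entry in a launch configuration's block device
# mapping takes a snapshot id plus size/type, not a volume id --
# which is why an init script is needed to reattach a known volume.
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

autoscaling.create_launch_configuration(
    LaunchConfigurationName="k8s-master-lc",     # placeholder
    ImageId="ami-00000000",                      # placeholder
    InstanceType="m3.medium",
    BlockDeviceMappings=[{
        "DeviceName": "/dev/xvdf",
        "Ebs": {
            "SnapshotId": "snap-00000000",       # snapshots: yes...
            "VolumeSize": 20,
            "DeleteOnTermination": False,
            # "VolumeId": ...                    # ...but no such key exists
        },
    }],
)
```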

@jboelter

@iterion yeah, just noticed the same and edited my note above as you posted

@iterion
Contributor

iterion commented Jul 29, 2015

Bummer. Perhaps we could tag the ASG or launch configuration with the volume id that was used? Alternatively, we could tag the EBS volume with something that identifies it as the master disk for that cluster. We run the risk of having multiple disks with the same tags, though.

@justinsb
Member Author

I'm going to make an attempt at this.

I am planning on using the approach of tagging the volume and then trying to mount it as part of instance boot.
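
For illustration, the tagging side might look roughly like this (placeholder tag keys, matching the lookup sketch earlier in the thread):

```python
# Sketch: create the persistent master volume once, and tag it so
# the boot-time lookup script can find it. All values are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",   # must match the ASG's AZ
    Size=20,
    VolumeType="gp2",
)

ec2.create_tags(
    Resources=[volume["VolumeId"]],
    Tags=[
        {"Key": "KubernetesCluster", "Value": "my-cluster"},
        {"Key": "k8s.io/role", "Value": "master"},
    ],
)
```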

@mbforbes added the priority/awaiting-more-evidence and sig/autoscaling labels on Aug 16, 2015
@justinsb
Member Author

Rather than have a separate process or script that discovers the volume, mounts it, and then starts our processes, I am experimenting with using the kubelet for this:
justinsb@334ad49

Advantages:

  • we could easily have hot-failover machines (i.e. run an auto-scaling group with multiple machines). Mounting a volume is a simple way to do leader election on many clouds/environments (see the sketch after the shortcomings list below).

Shortcomings:

  • this requires passing an explicit volume ID in, but I hope that in the future we will be able to specify volumes using something like k8s selectors & labels (Specify PersistentVolumeClaimSource by Selector, not Name #9712).
  • this requires a volume per process. This may not be a bad thing: better isolation, and volumes are pretty cheap (on AWS & GCE at least). We could implement volumes on volumes (a subdirectory on a volume, which k8s could copy/move around).
  • because of the above, there is no guarantee that we will launch everything on the same machine in a multi-machine environment. This may require some tweaks particularly during bootstrapping, and we would prefer minimal latency to etcd.
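
To make the leader-election point in the advantages list concrete, here is a sketch of volume-attach-as-lock. The VolumeInUse error code is what EC2 returns when a volume is already attached; the rest of the names are hypothetical:

```python
# Sketch: EC2 only allows one attachment per volume, so whichever
# instance attaches first "wins". The loser gets VolumeInUse and can
# keep retrying as a hot standby.
import boto3
from botocore.exceptions import ClientError

ec2 = boto3.client("ec2", region_name="us-east-1")

def try_become_master(volume_id, instance_id):
    try:
        ec2.attach_volume(VolumeId=volume_id,
                          InstanceId=instance_id,
                          Device="/dev/xvdf")
        return True                       # we hold the volume: we lead
    except ClientError as e:
        if e.response["Error"]["Code"] == "VolumeInUse":
            return False                  # another instance is master
        raise
```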

@pikeas

pikeas commented Nov 12, 2015

+1. On AWS the cluster should come up with an ASG in front of the master for self-healing (in conjunction with the master using an EIP; I can't seem to find the issue # at the moment), or be configured with multiple masters (preferably still behind an ASG!).

@justinsb
Member Author

Good news: I have this working on a branch. Bad news: the diff is pretty substantial. I am cherry-picking smaller PRs across so that the remaining changes become palatable!

@justinsb self-assigned this on Nov 12, 2015
@jwerak

jwerak commented May 12, 2016

Do you have a list of things that need to be restored, other than etcd?

@namliz

namliz commented Aug 18, 2016

Is it plausible to split out etcd into its own autoscaling group?
If so, you could just scale masters and the etcd cluster independently and there's no need to persist anything.

@justinsb
Member Author

This is implemented in kops. As kube-up is in maintenance mode, it won't be implemented there.

@Zilman it's plausible, but then the etcd ASG becomes the challenging one!

@namliz

namliz commented Aug 18, 2016

@justinsb: well, if you have an etcd ASG of size 3, it seems to me that you don't really need to persist anything, as at least one etcd instance is guaranteed to stay up.

@erutherford

A 3-node etcd cluster can't operate with fewer than 2 nodes running. If you lose more than one node's data, you're restoring from backups.

Also, without a runtime reconfiguration, your etcd member list is fixed.
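
For reference, the quorum arithmetic behind that:

```python
# etcd needs a majority of members: quorum(n) = n // 2 + 1.
# For n = 3 that is 2, so losing 2 of 3 members stalls the cluster
# even though 1 instance is still up.
def quorum(n):
    return n // 2 + 1

assert quorum(3) == 2
assert quorum(5) == 3
```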
