Cluster versioning #4855

Closed
2 of 4 tasks
zmerlynn opened this issue Feb 26, 2015 · 30 comments
Labels: area/upgrade, kind/feature, lifecycle/frozen, priority/important-soon, sig/api-machinery, sig/architecture, sig/cluster-lifecycle

Comments

@zmerlynn
Member

This issue is a proposal and collection of work-items for the cluster versioning mechanics for 1.0. It is meant to contain concepts from #2524 and decisions from the 2015/02 Kubernetes Meet-up as to what to encapsulate/cut for 1.0:

Versioning requirements for kubelet:

  • The kubelet version tuple must be reported to the apiserver for the cross-product of, at least, (kubelet, docker, kernel). Since, as we'll see in the Upgrade section, these versions are bundled, the tuple itself may be encodable linearly (e.g. "kubelet image 97" for a given cloud provider / node image). (Kubelet publish node components' version #5948)
  • It should be easy to query the current versions of all kubelets from the API.
  • apiserver must speak only the capabilities of the least capable kubelet software version (the first part of the tuple)
  • apiserver of version n.x must be accessible to a kubelet of version n.y if x>=y. When in doubt, the versioning policy trumps this issue as to what API versions are required to interoperate, except that for 1.0 master components are always allowed to assume they're ahead of kubelet.
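To make the last two bullets concrete, here is a minimal sketch and not an actual Kubernetes API. `NodeVersion`, `skew_is_supported`, and `least_capable_kubelet` are hypothetical names, and versions are assumed to be simple `(major, minor)` pairs.

```python
from typing import Iterable, NamedTuple, Tuple

Version = Tuple[int, int]  # assumed (major, minor) encoding, e.g. (1, 0)


class NodeVersion(NamedTuple):
    """Hypothetical version tuple a kubelet might report."""
    kubelet: Version
    docker: str
    kernel: str


def skew_is_supported(apiserver: Version, kubelet: Version) -> bool:
    """True when the rule "apiserver n.x serves kubelet n.y if x >= y" applies."""
    same_major = apiserver[0] == kubelet[0]
    return same_major and apiserver[1] >= kubelet[1]


def least_capable_kubelet(nodes: Iterable[NodeVersion]) -> Version:
    """Lowest kubelet version among the nodes (the first part of the tuple).

    Raises ValueError if no nodes have reported.
    """
    return min(node.kubelet for node in nodes)
```

Under this sketch, the apiserver would restrict itself to whatever `least_capable_kubelet` returns until every node has been upgraded.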
@zmerlynn added the priority/important-soon, area/hosting, and sig/cluster-lifecycle labels on Feb 26, 2015
@zmerlynn added this to the v1.0 milestone on Feb 26, 2015
@rsokolowski
Contributor

Hi @zmerlynn, do you need any help with this issue? If so, which work items could I work on? Thanks!

@roberthbailey
Contributor

This is a P1 and needs an owner. @zmerlynn, if you are planning to drive it, please assign it to yourself; otherwise it sounds like @rsokolowski is interested in starting on it.

@zmerlynn
Member Author

I think a chunk of this work is actually on Dawn, but let me figure out how to piece it out.

@zmerlynn
Member Author

cc @dchen1107

I think the first bullet, reporting the actual version somewhere, is probably somewhere on the node team. It seems like it could hook in with the status update mechanism for #4562 so that we're just doing one status update, but we don't necessarily need to update that often.

The second bullet (keeping track of all the versions) is ostensibly the NodeController.

The third bullet, having apiserver only speak the minimum, I'm not sure where it fits. Ideally, in a rolling upgrade scenario, you'd have some interconnection from NodeController to the apiserver that specifies the minimum, and when the last node reports in, an event fires back to apiserver. I'm not sure what follows this pattern right now.

The last bullet is waiting on the merge of #4833 and is otherwise, I think, a testing and policy bullet. That is complicated in its own right, but doesn't need a lot of upfront design.
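For the third bullet, the "event fires when the last node reports in" pattern could look roughly like the sketch below. This is only an illustration under assumptions: `MinVersionTracker` and its callback are hypothetical names, not existing NodeController or apiserver code, and versions are assumed to be `(major, minor)` pairs.

```python
from typing import Callable, Dict, Optional, Tuple

Version = Tuple[int, int]  # assumed (major, minor) encoding


class MinVersionTracker:
    """Hypothetical tracker: remembers each node's kubelet version and
    notifies a listener whenever the cluster-wide minimum changes."""

    def __init__(self, on_minimum_change: Callable[[Version], None]) -> None:
        self._versions: Dict[str, Version] = {}
        self._minimum: Optional[Version] = None
        self._on_minimum_change = on_minimum_change

    def report(self, node: str, version: Version) -> None:
        """Record a node's status update; fire the callback if the minimum moved."""
        self._versions[node] = version
        new_minimum = min(self._versions.values())
        if new_minimum != self._minimum:
            self._minimum = new_minimum
            self._on_minimum_change(new_minimum)
```

In a rolling upgrade from 1.0 to 1.1, the callback would stay quiet while nodes upgrade one by one and fire once when the last 1.0 node reports 1.1.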

@alex-mohr changed the title from "Cluster Versioning and Upgrade in V1.0" to "upgrade: Cluster Versioning and Upgrade in V1.0" on Mar 19, 2015
@mbforbes
Contributor

This is awesome, @zmerlynn—thanks for nailing down the version requirements and writing out the 1.0 upgrade plan in understandable English.

I'm going to be removing the upgrade parts of this to shove them in the other rollups (master #6075 and node #6079). That way, this can be specifically the versioning rollup and we can split the work nicely into three separate issues. I think I've covered everything upgrade-related in those, but I really liked the clear description of upgrades for 1.0 you wrote in this, so I'm going to transfer that to one or both of them.

Let me know if I missed something in this transition, and sorry if I confused anyone's issue linking by doing this!

@mbforbes changed the title from "upgrade: Cluster Versioning and Upgrade in V1.0" to "Cluster versioning for V1.0" on Mar 28, 2015
@mbforbes
Contributor

  • assigning @zmerlynn as the overall owner to drive, even if individual work items are handled across teams

@bgrant0607
Member

Field gate proposal: https://docs.google.com/document/d/1wuoSqHkeT51mQQ7dIFhUKrdi3-1wbKrNWeIL4cKb9zU/edit#

Note also that the concept of "profiles" is being discussed, for multiple purposes.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Jan 2, 2018
@roberthbailey
Contributor

@zmerlynn - can you triage this issue?

@bgrant0607
Member

/remove-lifecycle stale
/lifecycle frozen

I'd like to keep this open. We have a number of challenges around cluster lifecycle that I'd like to consider holistically:

@k8s-ci-robot added the lifecycle/frozen label and removed the lifecycle/stale label on Jan 22, 2018
@bgrant0607
Member

cc @jagosan

@bgrant0607
Member

bgrant0607 commented Jan 26, 2018

Operations:

  • Turn up
  • Upgrade
  • Downgrade
  • Teardown

Strawman upgrade sequence:

  1. Upgrade masters (apiserver, controller manager, scheduler)
  2. Enable new APIs
  3. Upgrade addons (caveats: don't remove old DaemonSets, maybe support API version discovery)
  4. Disable old APIs
  5. Upgrade nodes

Default storage versions can't be updated until after step 1.
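The strawman ordering above can be written down as a small checker. This is a sketch only: the step names, `next_step`, and `can_update_storage_versions` are hypothetical labels for the five steps, not existing tooling.

```python
from typing import List, Optional, Sequence

# Hypothetical identifiers for the five strawman steps, in order.
UPGRADE_SEQUENCE: List[str] = [
    "upgrade_masters",   # apiserver, controller manager, scheduler
    "enable_new_apis",
    "upgrade_addons",
    "disable_old_apis",
    "upgrade_nodes",
]


def next_step(completed: Sequence[str]) -> Optional[str]:
    """Return the next step, or None when the upgrade is finished.

    Raises ValueError if the completed steps are not a prefix of the sequence.
    """
    if list(completed) != UPGRADE_SEQUENCE[:len(completed)]:
        raise ValueError("steps were completed out of order")
    if len(completed) == len(UPGRADE_SEQUENCE):
        return None
    return UPGRADE_SEQUENCE[len(completed)]


def can_update_storage_versions(completed: Sequence[str]) -> bool:
    """Default storage versions may only change once the masters are upgraded."""
    return "upgrade_masters" in completed
```

A driver would call `next_step` with the steps finished so far and refuse to proceed on a `ValueError`.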

@krmayankk

@bgrant0607 Why is this frozen? It seems it was originally created for 1.0 in 2015, but it has a lot of relevant details that could become a documentation link or KEP for how clusters should be upgraded. I'm not sure I have seen any such documentation.

@bgrant0607
Member

@krmayankk It's fine to close this issue, though AFAIK, neither the upgrade/downgrade sequencing, nor lifecycle stages, nor teardown operations, nor the other proposals mentioned above (#4855 (comment)) have been implemented. More specific issues, such as #54522 have been filed.

/close

@k8s-ci-robot
Contributor

@bgrant0607: Closing this issue.

