PreStart and PostStop event hooks #140

Closed
bgrant0607 opened this issue Jun 17, 2014 · 26 comments
Labels
area/docker area/kubelet kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@bgrant0607
Member

Many systems support event hooks for extensions. A few examples:
https://developers.google.com/appengine/docs/java/javadoc/com/google/appengine/api/LifecycleManager
http://developer.android.com/guide/components/activities.html
http://upstart.ubuntu.com/cookbook/#event
https://coreos.com/docs/launching-containers/launching/getting-started-with-systemd/
http://elasticbox.com/documentation/configuring-and-managing-boxes/start-stop-and-upgrade-boxes/
http://git-scm.com/docs/githooks.html

docker stop and restart currently send SIGTERM followed by SIGKILL, similar to many other systems (e.g., Heroku: https://devcenter.heroku.com/articles/dynos#graceful-shutdown-with-sigterm). This gives applications an opportunity to shut down cleanly, but it cannot communicate the grace period duration or the termination reason, and it doesn't directly support notifying other processes or services.
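For reference, here is a minimal Python sketch of what an application can do today with the bare SIGTERM. The names `serve`, `do_work`, and `cleanup` are hypothetical and only illustrate the pattern. Note that the handler learns neither the grace period nor the reason, which is the gap described above.

```python
import signal
import threading

# Set when SIGTERM arrives; the main loop polls it to begin a clean shutdown.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Only record the request; the real cleanup happens outside the handler."""
    shutdown_requested.set()


# Must be called from the main thread of the process.
signal.signal(signal.SIGTERM, handle_sigterm)


def serve(do_work, cleanup, poll_interval=0.1):
    """Call do_work() repeatedly until SIGTERM is seen, then call cleanup() once."""
    while not shutdown_requested.is_set():
        do_work()
        # Sleep between iterations, but wake up early if shutdown is requested.
        shutdown_requested.wait(poll_interval)
    cleanup()
```

Whatever `cleanup` does has to finish before the SIGKILL arrives, and nothing in the signal itself says how long that is.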

As described in the [liveness probe issue](https://github.com//issues/66), it would be useful to support multiple types of hook execution/notification mechanisms. It would also be useful to pass arguments from clients, such as "reason" (e.g., "cancel", "restart", "reload", "resize", "reboot", "move", "host_update", "probe_failure"). Another way "reason" could be handled is with user-defined events.

In addition to pre-termination notification, we should define other lifecycle hook points, probably at least pre- and post- start and terminate.

It would be useful for post-terminate to be passed the [termination reason](https://github.com//issues/137), which could either be successful completion, a client-provided stop reason (see above), or detailed failure reason (exit, signal, OOM, container creation error, docker crash, machine crash, lost/ghost).

If the application generated an assertion failure or exception message, a post-termination hook could copy it to [/run/status.txt](https://github.com//issues/139).

It would also be useful to be able to control [restart behavior](https://github.com//issues/127) from a hook. We'd need a convenient way to carry over state from a previous execution. The simplest starting point would be for the user to keep it in a [volume](https://github.com//issues/97).

@bgrant0607
Member Author

Docker issues requesting postcreate hooks:
moby/moby#3317
moby/moby#252

One of these mentioned that Docker planned to support hooks for every event, but that was quite a while ago, so I'm not sure whether that's still planned.

@bgrant0607
Member Author

Just got support from tianon for runin/exec/enter: moby/moby#1228
So we can use that to execute hooks inside containers.

@bgrant0607
Member Author

I investigated Docker's event stream. It provides container id and event type (e.g., start, stop). It doesn't provide further details about the events. Also, it's obviously asynchronous with respect to the events. At least pre-start and post-termination event hook commands would be most useful if executed synchronously, inline with container execution.

Docker restart allows restarting the same command in the same container with the same container id and filesystem, even after the death of the previous process. There's no way to change the command executed AFAICT, however. If the forthcoming "runin"/"exec" allowed execution in dead containers, we could maybe use it for post-termination hooks. Pre-start hook commands look ugly without Docker support. No response yet to my docker-dev question about whether Docker is actually planning to add hook support.

We could override the container entrypoint with the pre-start hook command and then use "runin" to execute the real entrypoint, but the container's status, wait, restart, etc. would be broken. Creating a new container image that included the pre-start command, actual entrypoint, and post-termination hook might work, but we'd need to carefully propagate arguments, signals, exit status, etc.
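To make the wrapper idea concrete, here is a rough Python sketch of an entrypoint wrapper that runs a pre-start command, then the real entrypoint, then a post-termination command. `run_with_hooks` and its parameters are hypothetical names. Appending the exit status as the last argument of the post-termination command is an assumption made for illustration, not something Docker provides. Signal forwarding is deliberately left out here, so this covers only the argument and exit-status part of the problem.

```python
import subprocess


def run_with_hooks(entrypoint, pre_start=None, post_stop=None):
    """Run pre_start, then entrypoint, then post_stop.

    Each argument is an argv list. Returns the entrypoint's exit status,
    or the pre_start status if pre_start fails (the entrypoint is then
    never started).
    """
    if pre_start is not None:
        pre = subprocess.run(pre_start)
        if pre.returncode != 0:
            return pre.returncode

    result = subprocess.run(entrypoint)

    if post_stop is not None:
        # Assumption: the hook receives the exit status as its final argument.
        # On POSIX a negative value means the entrypoint was killed by a signal.
        subprocess.run(post_stop + [str(result.returncode)])

    # Propagate the entrypoint's status, not the hook's.
    return result.returncode
```

A wrapper script would end with something like `sys.exit(run_with_hooks(sys.argv[1:], ...))`, though a negative signal status would need translating before being used as an exit code.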

The main reason for a post-start hook would be consistency, but it also might be convenient for start actions that don't block the start of the container's entrypoint, such as registration in a third-party discovery service, pushing events to pubsub, etc. "runin" should just work, though there may be a race if the application immediately terminated. Probably it would be useful to serialize execution with respect to later hooks on the same entity.

If we wrapped the application we could intercept SIGTERM in order to execute the pre-termination hook, and just pass the signal on to the application if no hook were specified. Again, we probably want to serialize with respect to the post-termination hook. It might not be super-useful to execute the pre-termination hook when the container stop wasn't planned, but I'm not sure whether it's more natural to execute it prior to the post-termination hook regardless, or whether it would be annoying for the pre-termination hook to execute when the application was already dead. OTOH, that case would probably need to be handled anyway, since the application could die at any time, including concurrently with the start of the pre-termination hook.
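A sketch of that SIGTERM interception in Python, under the assumption that the wrapper is the process receiving the signal. `run_with_pre_stop` is a hypothetical name. The hook runs from the main loop rather than from inside the signal handler, and the child is re-checked after the hook because it may have died while the hook was running.

```python
import signal
import subprocess
import threading


def run_with_pre_stop(entrypoint, pre_stop=None, poll_interval=0.05):
    """Start entrypoint; on SIGTERM run pre_stop (if any), then forward SIGTERM.

    Returns the child's exit status (negative if killed by a signal on POSIX).
    """
    term_requested = threading.Event()
    # The handler only records the request; the hook runs in the loop below.
    previous = signal.signal(signal.SIGTERM, lambda signum, frame: term_requested.set())
    try:
        child = subprocess.Popen(entrypoint)
        forwarded = False
        while True:
            try:
                return child.wait(timeout=poll_interval)
            except subprocess.TimeoutExpired:
                pass  # Child was still running at the last check.
            if term_requested.is_set() and not forwarded:
                if pre_stop is not None:
                    subprocess.run(pre_stop)
                # The application may have exited while the hook ran.
                if child.poll() is None:
                    child.send_signal(signal.SIGTERM)
                forwarded = True
    finally:
        # Restore whatever SIGTERM handling was in place before.
        signal.signal(signal.SIGTERM, previous)
```

With no `pre_stop`, the signal is simply passed through. If the application is already dead when SIGTERM arrives, the loop has already returned and the hook never runs, which is one possible answer to the question above rather than the only reasonable one.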

Asynchronous webhooks would be comparatively easy to support. The most difficult issue is what to do about auth. If generated by a command in the container, presumably they could authenticate as the container, so it looks appealing to only support command hooks. However, there are situations where we wouldn't have an obvious container to execute commands in, such as for pod lifecycle hooks or, even more obviously, replicationController and service lifecycle event hooks.

@smarterclayton
Contributor

> We could override the container entrypoint with the pre-start hook command and then use "runin" to execute the real entrypoint, but the container's status, wait, restart, etc. would be broken. Creating a new container image that included the pre-start command, actual entrypoint, and post-termination hook might work, but we'd need to carefully propagate arguments, signals, exit status, etc.

We tried to make this work, but it felt wrong on all sorts of levels, and ultimately we felt there were two cases we wanted to handle:

  • hooks that either need the container context (so executing outside the process namespace would be pointless) or would not be left internally inconsistent if interrupted by container shutdown. Pre-termination is a good example.
  • hooks that should run outside of a container, because they need to continue to run even if a container fails. Deploy across multiple containers is a good example, as is post-termination.

I don't think you need to subclass the entrypoint in the image - for the former, you could use "docker exec in" (add process to namespace) and for the latter, you can use restart-once containers (outside or inside a pod).

There's some crossover here between "event hooks" and "intent/application hooks" - reacting to changes to the infrastructure, vs reacting to user intent across multiple pods. The "intent hooks" tend to be things that I think of as wanting the auth/orchestration parts of the downward-facing API (I want to wait for this label query to reach X instances, and then update another replication controller to go to Y), as well as being more like regular jobs in their own right.

@thockin
Member

thockin commented Oct 7, 2014

Is there more to do here, or should we close this and open new issues?

@bgrant0607
Member Author

I was waiting for #1445 to go in before closing this one.

@bgrant0607 bgrant0607 changed the title Event hooks PreStart and PostStop event hooks Oct 8, 2014
@bgrant0607 bgrant0607 removed this from the v0.7 milestone Oct 8, 2014
@bgrant0607
Member Author

Removed from milestone and repurposed for PreStart and PostStop hooks, which I've always thought would be the most useful hooks to provide, for reasons discussed above and in several other issues.

Hooks are being discussed for Docker:
moby/moby#6982

Our users' hooks should execute within the container context. The case for PreStart executing within the container context is to initialize the filesystem, and the case for PostStop is extracting data from the filesystem.

What's attractive about Docker executing the hooks in the host context is that it would provide a hook for the management layer -- Kubelet in our case. The people who advocated executing them in the host context said that nsinit/exec could then be used to enter the container in the hook. That SGTM, since it would give us more control (e.g., allowing us to use Docker container restarts). It would still require the same changes to libcontainer and Docker, however, to decouple container lifetime from process lifetime.

/cc @vishh

@bgrant0607 bgrant0607 added the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Dec 3, 2014
@dchen1107 dchen1107 added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Feb 4, 2015
@bgrant0607
Member Author

Proof-of-concept for prestart here: #3067 (comment)

wking added a commit to wking/cri-o that referenced this issue Feb 15, 2018
The underlying github.com/containers/storage Store interface exposes
both CreateContainer/DeleteContainer and Mount/Unmount, but in CRI-O
Mount always followed shortly after CreateContainer and Unmount always
preceded DeleteContainer.  This commit removes the unused distinction.

The only user-visible change is that Unmount used to be called at
stop-time (e.g. in StopPodSandbox and ContainerStop), but now Unmount
happens via DeleteContainer at remove-time (e.g. in RemovePodSandbox
and ContainerServer.Remove).  That early removal is a CRI-O
optimization dating back to at least c88bc13 (Implement kpod rm,
2017-08-30, cri-o#839).  But the early removal might break eventual
post-stop event hooks [1,2].  And if we don't need to support
post-stop event hooks (the closest Kubernetes has now is pre-stop
hooks [3,4]), we can trigger a full removal (not just an unmount)
immediately after a successful stop.

[1]: kubernetes/kubernetes#140
[2]: https://github.com/kubernetes/kubernetes/blob/release-1.5/docs/proposals/container-runtime-interface-v1.md#container-lifecycle-hooks
[3]: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.9/#lifecycle-v1-core
[4]: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks

Signed-off-by: W. Trevor King <wking@tremily.us>
@spiffxp spiffxp removed the triaged label Mar 16, 2018
@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. and removed kind/enhancement labels Jun 5, 2018
seans3 pushed a commit to seans3/kubernetes that referenced this issue Apr 10, 2019
marun added a commit to marun/kubernetes that referenced this issue Jun 4, 2020
@n4j
Member

n4j commented Jun 24, 2021

@bgrant0607 This issue has been open for a long time. Is it OK if we close it?

@timchenxiaoyu
Contributor

How can a prestart hook be implemented?

@adityajoshi12

adityajoshi12 commented May 23, 2022

There are no updates on the prestart hook. You can use an init container as an alternative.

@sftim
Contributor

sftim commented Dec 21, 2022

Are init containers a suitable replacement for preStart?

For Jobs, you can also use a pod failure policy to abort an entire Pod when a specific container fails. Perhaps extending that to other workload controllers, plus the existing init container support, is enough?

@thockin thockin closed this as completed Dec 21, 2022
sttts pushed a commit to sttts/kubernetes that referenced this issue Sep 13, 2023
@debu99

debu99 commented Nov 26, 2023

I would prefer to make the pod zone-aware by setting an environment variable in preStart, but an init container can only share a volume.

linxiulei pushed a commit to linxiulei/kubernetes that referenced this issue Jan 18, 2024