
Proposal: add pod lifecycle event generator for kubelet #12802

Merged
merged 1 commit into kubernetes:master from yujuhong:kubelet_proposal on Feb 2, 2016

Conversation

yujuhong
Contributor

No description provided.

@k8s-bot

k8s-bot commented Aug 17, 2015

GCE e2e build/test passed for commit 96d3e5462e5d292c76efcea9fad174790323a0f1.

@k8s-bot

k8s-bot commented Aug 17, 2015

GCE e2e build/test passed for commit bf40ff2d4d6f8ded560161ead6027118e18204a8.

@brendandburns
Contributor

cc @brendandburns

@k8s-bot

k8s-bot commented Aug 17, 2015

GCE e2e build/test passed for commit 00fcbb013a76f4b689c11fce236be0cf61337134.

@k8s-bot

k8s-bot commented Aug 17, 2015

GCE e2e build/test passed for commit 6d21bdcaba1b5b3d2b90f20af23ee3b931924a06.

@k8s-bot

k8s-bot commented Aug 17, 2015

GCE e2e build/test failed for commit 763719084c5c777e746b5afaea51ffe87594681d.

@k8s-bot

k8s-bot commented Aug 17, 2015

GCE e2e build/test passed for commit 919dbf60516ebbd831a1a0d38e2dc99bce63d345.

- Probing: Now that pod workers only wake up on events, we need to rethink how liveness/readiness probing should be implemented. One solution is to create a separate probing component that can be plugged into PLEG to generate corresponding pod lifecycle events.

Contributor

This is cool. I like the idea of a plugin to present the probe outcomes as events.
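
A rough sketch of what such a probing plugin could look like. The `ProbeResult` type, `probeAdapter`, and the "ContainerUnhealthy" event type are illustrative assumptions, not part of the proposal or the kubelet API:

```go
package pleg

// ProbeResult is an assumed shape for the outcome of a single
// liveness/readiness probe run; it is not a real kubelet type.
type ProbeResult struct {
	PodID       string
	ContainerID string
	Healthy     bool
}

// PodLifecycleEvent mirrors the event shape discussed in this proposal;
// the field names here are illustrative.
type PodLifecycleEvent struct {
	PodID string
	Type  string // e.g. "ContainerUnhealthy" (a hypothetical event type)
	Data  interface{}
}

// probeAdapter converts probe outcomes into lifecycle events so that pod
// workers can be woken up through the same channel as other PLEG events.
type probeAdapter struct {
	events chan<- *PodLifecycleEvent
}

// OnProbeResult emits an event only for unhealthy results, since a healthy
// probe does not require the pod worker to do anything.
func (a *probeAdapter) OnProbeResult(r ProbeResult) {
	if r.Healthy {
		return
	}
	a.events <- &PodLifecycleEvent{
		PodID: r.PodID,
		Type:  "ContainerUnhealthy",
		Data:  r.ContainerID,
	}
}
```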

@ironcladlou
Contributor

This would also enhance the apparent responsiveness of UIs in some cases (openshift/origin#3781). As just one example in OpenShift, a deployment's outcome is tied closely to the status of a Pod, and by proxy its containers; but no matter how fast the deployment container returns, the apparent running time can be inflated artificially by up to 10 seconds depending on where we land in the pod polling.


### Bootstrapping/restart

Upon restart, PLEG needs to perform a reslit to retrieve all container

Member

Please indicate that this is the kubelet restart case here.

Member

s/reslit/relist

Member

Docker restart is another case we have to detect and handle here too. You need to detect when the event stream disconnects. When docker is starting up, it might send you tons of events for old containers.
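
A minimal sketch of how a relist-plus-watch loop could cover both the kubelet-restart and docker-restart cases; the `containerEventWatcher` interface and its `Relist`/`Watch` methods are assumptions for illustration, not an existing kubelet or docker API:

```go
package pleg

import (
	"log"
	"time"
)

// containerEventWatcher is an assumed, minimal interface over whatever
// backs the event stream (docker events, cadvisor, ...).
type containerEventWatcher interface {
	// Relist returns the IDs of all containers currently known to the runtime.
	Relist() ([]string, error)
	// Watch delivers raw events to ch and returns when the stream disconnects.
	Watch(ch chan<- string) error
}

// run relists on startup (the kubelet-restart case), then tails the event
// stream; if the stream disconnects (e.g. a docker restart), it relists
// again before resubscribing, so any flood of stale events for old
// containers can be reconciled against a fresh container list.
func run(w containerEventWatcher, events chan<- string) {
	for {
		containers, err := w.Relist()
		if err != nil {
			log.Printf("relist failed: %v; retrying", err)
			time.Sleep(time.Second)
			continue
		}
		log.Printf("relist found %d containers", len(containers))

		if err := w.Watch(events); err != nil {
			log.Printf("event stream disconnected: %v; relisting", err)
		}
	}
}
```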

@dchen1107
Member

@yujuhong On the way back home, I realized that cadvisor only supports 4 types of events: containerCreation, containerDeletion, OOM, and OOMKilled, and none of them are the key events you want or sufficient for you to make any decision:

  • containerCreation doesn't mean the container actually started.
  • containerDeletion is only fired when container GC removes a dead container.
  • The only termination event you have is OOMKilled, fired when a container reaches its cgroup's memory limit.
  • When a system OOM kills a random process, you won't have container information. This is a known issue in cAdvisor, and I don't think it is easy to fix.
  • For other failures that cause a container to terminate, no event is generated.

From the above, you can see that we pretty much cannot rely on cAdvisor's event stream at all, and I don't think there is an easy way to add those missing events to cadvisor. Here is the code that defines the events:
https://github.com/google/cadvisor/blob/2a022a4a74f4f994c1ae7684de7655aaa0a898ba/info/v1/container.go#L521

Let's just stick with docker events for now. For the rkt case, we could rely on the relist operation until they have event stream support.

cc/ @kubernetes/goog-node
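
To make the gap concrete, here is a hedged sketch of what a mapping from these four cadvisor event types to pod lifecycle events might look like. The string literals and the `toPodLifecycleEvent` helper are illustrative assumptions; the authoritative event definitions are in the cadvisor file linked above:

```go
package pleg

// These literals correspond to the four cadvisor event types listed above;
// the authoritative constants live in the linked cadvisor source, so treat
// these as an approximation for illustration only.
const (
	cadvisorContainerCreation = "containerCreation"
	cadvisorContainerDeletion = "containerDeletion"
	cadvisorOOM               = "oom"
	cadvisorOOMKill           = "oomKill"
)

// toPodLifecycleEvent shows how lossy the mapping is: creation/deletion only
// roughly correspond to container start/stop, and OOM kills cover just one of
// the many ways a container can die, which is the gap discussed above.
func toPodLifecycleEvent(cadvisorEventType string) (eventType string, ok bool) {
	switch cadvisorEventType {
	case cadvisorContainerCreation:
		return "ContainerStarted", true
	case cadvisorContainerDeletion:
		return "ContainerStopped", true
	case cadvisorOOMKill:
		return "ContainerStopped", true
	default:
		// A plain OOM (no kill) or any other failure has no usable
		// pod lifecycle equivalent here.
		return "", false
	}
}
```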

@dchen1107 added the sig/node label on Aug 20, 2015
@thereallukl

Adding such a strong dependency on cadvisor to provide lifecycle information might block adding new container technologies in the future (one might want to add, for example, lxc or qemu), and making those drivers depend on something that is just a metering component does not sound very good.

That being said, I prefer option 1 as the source of PLEG events (each runtime provides events by itself).

@vishh
Contributor

vishh commented Aug 20, 2015

I believe the intention is to abstract out container runtimes using the cadvisor interface. Since cadvisor is already trying to handle container runtimes as first-class entities, it seems natural to use it as an abstraction layer.
+1 to @dchen1107's arguments against using raw cadvisor events. Maybe we can extend cadvisor to support docker events?

@yujuhong
Contributor Author

This would also enhance the apparent responsiveness of UIs in some cases (openshift/origin#3781). As just one example in OpenShift, a deployment's outcome is tied closely to the status of a Pod, and by proxy its containers; but no matter how fast the deployment container returns, the apparent running time can be inflated artificially by up to 10 seconds depending on where we land in the pod polling.

This should definitely improve the response latency provided that kubelet is not already overwhelmed.

@dchen1107
Member

Maybe we can extend cadvisor to support docker events?

I am not sure this is a good idea, though I could be convinced. Docker is one implementation of a container runtime; rkt is another. To extend cadvisor to support docker events, we would have to complicate cadvisor to make it composable too.

@vishh
Contributor

vishh commented Aug 20, 2015

We will have to abstract out events from multiple container runtimes in kubelet anyways. I assume your argument is whether cAdvisor is the right abstraction layer?


@yujuhong
Contributor Author

Some clarifications since we (me, @vishh and @dchen1107) discussed this offline :-)

@yujuhong On the way back home, I realized that cadvisor only supports 4 types of events: containerCreation, containerDeletion, OOM, and OOMKilled, and none of them are the key events you want or sufficient for you to make any decision:

containerCreation doesn't mean the container actually started.

containerCreation means that we tried to start a container. It may fail to start for other reasons (a bad image, etc.), but we'll receive a containerDeletion in those cases.

@vishh brought up a valid point that this would make kubelet interpret "container starts" differently than it does today. I agree with that, but personally I think this does not affect how kubelet works internally and is not that critical.

containerDeletion is only fired when container GC removes a dead container.
The only termination event you have is OOMKilled, fired when a container reaches its cgroup's memory limit.
When a system OOM kills a random process, you won't have container information. This is a known issue in cAdvisor, and I don't think it is easy to fix.
For other failures that cause a container to terminate, no event is generated.

The conclusion from the discussion is that cAdvisor would send out a Deletion event when a container terminates, because its cgroups are cleaned up once pid 1 exits (thanks @vishh for pointing that out). I can verify this since I have tested my prototype by running docker stop on containers; kubelet reacts to that immediately.

From the above, you can see that we pretty much cannot rely on cAdvisor's event stream at all, and I don't think there is an easy way to add those missing events to cadvisor. Here is the code that defines the events:
https://github.com/google/cadvisor/blob/2a022a4a74f4f994c1ae7684de7655aaa0a898ba/info/v1/container.go#L521

Let's just stick with docker events for now. For the rkt case, we could rely on the relist operation until they have event stream support.

@yujuhong
Contributor Author

We will have to abstract out events from multiple container runtimes in kubelet anyways. I assume your argument is whether cAdvisor is the right abstraction layer?

I agree that we need an abstraction for events across different container runtimes.

cadvisor provides the bare-minimum event stream for kubelet to work, and we can move faster with it for now. That's why I chose cadvisor as a starting point. I think it's worth giving it a try since it should be easy to swap out the container event watcher.

That said, if the consensus is to use docker events, I'd be okay with that.

@yifan-gu
Contributor

I think we need to define the event stream interface in kubelet. Whether we implement it using cadvisor or docker event stream should not matter.

ContainerStopped PodLifeCycleEventType = "ContainerStopped"
NetworkSetupCompleted PodLifeCycleEventType = "NetworkSetupCompleted"
NetworkFailed PodLifeCycleEventType = "NetworkFailed"
)

Regarding issue #10288, I think you should add another high-level PodLifeCycleEventType, "ContainerDied"; otherwise we will have the same problem, even with your new system.
Considering what events docker is capable of firing (cf. https://docs.docker.com/reference/api/docker_remote_api_v1.17/#monitor-docker-s-events), this PodLifeCycleEventType won't be inexact.

Contributor Author

@ejemba this list is not complete, but it includes all the types we use for now. We can augment this whenever we need :)
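
A sketch of what such a runtime-agnostic event interface might look like, combining the types visible in the diff excerpt above with the "ContainerDied" type suggested in this thread. The `PodLifecycleEventGenerator` interface and field names are assumptions for illustration, not the final design:

```go
package pleg

// PodLifeCycleEventType enumerates runtime-agnostic events. This list is a
// sketch rather than the final set; it can be augmented as needed.
type PodLifeCycleEventType string

const (
	ContainerStarted      PodLifeCycleEventType = "ContainerStarted"
	ContainerStopped      PodLifeCycleEventType = "ContainerStopped"
	ContainerDied         PodLifeCycleEventType = "ContainerDied"
	NetworkSetupCompleted PodLifeCycleEventType = "NetworkSetupCompleted"
	NetworkFailed         PodLifeCycleEventType = "NetworkFailed"
)

// PodLifecycleEvent is the payload a generator would hand to pod workers.
type PodLifecycleEvent struct {
	// PodID identifies the pod the event concerns.
	PodID string
	Type  PodLifeCycleEventType
	// Data carries runtime-specific detail, e.g. a container ID.
	Data interface{}
}

// PodLifecycleEventGenerator is the interface the kubelet would consume; a
// docker-events-backed or cadvisor-backed implementation (or a pure
// relist-based one, as discussed for rkt) can sit behind it without the
// rest of the kubelet caring which one is in use.
type PodLifecycleEventGenerator interface {
	Start()
	Watch() chan *PodLifecycleEvent
}
```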

@zhengguoyong
Contributor

👍

@dchen1107
Member

@yujuhong Can you update this based on our current roadmap, design, and implementation?

@k8s-bot

k8s-bot commented Dec 2, 2015

GCE e2e test build/test passed for commit 919dbf60516ebbd831a1a0d38e2dc99bce63d345.

@k8s-bot

k8s-bot commented Jan 12, 2016

GCE e2e build/test failed for commit 4faba2a3957276992366ecf02969ff57cb7f87af.

@yujuhong force-pushed the kubelet_proposal branch 2 times, most recently from 501ced3 to f3f3d6f, on January 12, 2016 at 18:16
@yujuhong
Contributor Author

@yujuhong Can you update this based on our current roadmap, design, and implementation?

@dchen1107, I updated the proposal. Please take a look, thanks!

@k8s-bot

k8s-bot commented Jan 12, 2016

GCE e2e test build/test passed for commit 501ced30a43a23e34227b5868affce0447d01a80.

@k8s-bot

k8s-bot commented Jan 12, 2016

GCE e2e test build/test passed for commit b8b532b.

@eparis
Contributor

eparis commented Feb 1, 2016

@k8s-bot test this issue: #IGNORE

Tests have been pending for 24 hours

@k8s-bot

k8s-bot commented Feb 2, 2016

GCE e2e test build/test passed for commit b8b532b.

@dchen1107
Member

LGTM

@dchen1107 added the lgtm label on Feb 2, 2016
@k8s-github-robot

Automatic merge from submit-queue

k8s-github-robot pushed a commit that referenced this pull request Feb 2, 2016
@k8s-github-robot merged commit 635cf58 into kubernetes:master on Feb 2, 2016
@yujuhong deleted the kubelet_proposal branch on February 23, 2016 at 19:46
xingzhou pushed a commit to xingzhou/kubernetes that referenced this pull request Dec 15, 2016
Labels
kind/design Categorizes issue or PR as related to design.
lgtm "Looks good to me", indicates that a PR is ready to be merged.
sig/node Categorizes an issue or PR as relevant to SIG Node.
size/L Denotes a PR that changes 100-499 lines, ignoring generated files.