
Evaluate the benefits of adopting docker event stream #16831

Closed
yujuhong opened this issue Nov 5, 2015 · 13 comments
Labels
lifecycle/rotten · priority/backlog · sig/node · sig/scalability

Comments

@yujuhong
Contributor

yujuhong commented Nov 5, 2015

In #12540, the docker event stream was proposed as part of the pod lifecycle event generator (PLEG) to reduce the number of docker operations.

However, even without adopting the docker event stream, we can still implement a PLEG solely by periodically relisting the containers. The generic PLEG in #13571 is an example.
This would still improve the average resource usage of kubelet and docker, because only one goroutine queries docker at a higher frequency (as opposed to all pod workers). The drawback, on the other hand, is that kubelet's reaction time to container events would be equal to or greater than the relist period. We should try to understand the limits of pure relisting by running some micro-benchmarks against docker, so that we know how much benefit the container event stream can bring us (a rough benchmark sketch follows the lists below).

Metrics:

  • latency
  • cpu usage percentiles

Parameters to vary:

  • op: list, inspect
  • period
  • number of containers
  • GCE machine size
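As a starting point, here is a rough sketch of such a micro-benchmark, assuming the Docker Go client (github.com/docker/docker/client); the option types have moved between packages across client versions, and the iteration count and period are placeholders for the parameters listed above:

```go
// Rough relist-latency micro-benchmark (a sketch, not a definitive harness).
// Measures the cost of the `docker ps -a` equivalent that a relist-based
// PLEG would pay every period. Run against daemons with varying numbers of
// containers, and on different GCE machine sizes, to fill in the matrix above.
package main

import (
	"context"
	"fmt"
	"sort"
	"time"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		panic(err)
	}

	const iterations = 100    // number of samples
	period := 1 * time.Second // the relist period; vary this
	latencies := make([]time.Duration, 0, iterations)

	for i := 0; i < iterations; i++ {
		start := time.Now()
		// The "list" op; swap in ContainerInspect to benchmark "inspect".
		if _, err := cli.ContainerList(context.Background(), types.ContainerListOptions{All: true}); err != nil {
			panic(err)
		}
		latencies = append(latencies, time.Since(start))
		time.Sleep(period)
	}

	sort.Slice(latencies, func(i, j int) bool { return latencies[i] < latencies[j] })
	fmt.Printf("p50=%v p90=%v p99=%v\n",
		latencies[iterations*50/100], latencies[iterations*90/100], latencies[iterations*99/100])
}
```

CPU usage percentiles for the docker daemon would be sampled externally (e.g. from cgroup stats) while this runs.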

The next step would be benchmarking the docker event stream.

/cc @kubernetes/goog-node

@yujuhong yujuhong added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Nov 5, 2015
@Random-Liu
Member

OK, will do! This is indeed necessary for future work. :)

@dchen1107 dchen1107 added this to the v1.2-candidate milestone Nov 5, 2015
@vishh
Contributor

vishh commented Nov 5, 2015

+1 for starting with the Generic PLEG. One thing to consider is the resource overhead of periodic relisting on the docker daemon.


@yujuhong
Contributor Author

Now that we've added both a generic PLEG and a cache, we should re-evaluate the benefit of adopting the docker event stream and whether we should prioritize it.


@MHBauer
Contributor

MHBauer commented May 17, 2016

I am familiar with the docker side of events. With a little guidance I can help do the work in the kube code.

@timothysc
Member

/cc @rrati fyi to keep on the radar.

@Random-Liu
Member

Random-Liu commented May 19, 2016

@MHBauer We are redefining the container runtime interface now (#22964).
At least for now, events are not in our short-term plan, because not all runtimes support an event stream yet (e.g. rkt). And according to our measurements, the event stream does improve performance, but the current polling-based implementation is enough for now.

/cc @yujuhong

@yujuhong
Contributor Author

@MHBauer that's great.

As @Random-Liu pointed out, the event stream would improve performance and resource usage for kubelet, but since 1) not all runtimes support this and 2) the benefits weren't significant enough, we didn't prioritize switching to the event stream.

The problem is a little bit more complicated than simply interfacing with the docker event stream. If you are interested, here is some background.

Kubelet has a Pod Lifecycle Event Generator (PLEG) that is compatible with all runtimes. PLEG basically does two things:

  1. It periodically polls docker (via docker ps -a) and compares the current container list with the last one. If there is any change, it generates an event.
  2. In addition to generating an event, it also issues docker inspect calls and updates an in-memory pod status cache (a rough sketch follows this list).
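Roughly, the relist looks like the following sketch. The Runtime, Cache, and related types are hypothetical stand-ins for illustration, not the actual kubelet interfaces:

```go
// Sketch of a generic, relist-based PLEG (hypothetical types throughout).
package pleg

import "time"

type ContainerState string

type PodStatus struct{ PodID string }

type Runtime interface {
	ListAll() map[string]ContainerState // one `docker ps -a` covering all pods
	Inspect(id string) PodStatus        // `docker inspect <container>`
}

type Cache interface {
	Set(podID string, status PodStatus, timestamp time.Time)
}

type Event struct {
	PodID string
	State ContainerState
}

type pleg struct {
	runtime Runtime
	cache   Cache
	last    map[string]ContainerState // container ID -> state at the last relist
	events  chan Event
}

// relist does both duties: diff against the previous relist and emit
// events (1), then inspect changed containers and refresh the cache (2).
// Handling of deleted containers is omitted for brevity.
func (p *pleg) relist() {
	timestamp := time.Now() // stamps cache freshness; the pod workers rely on this
	current := p.runtime.ListAll()

	for id, state := range current {
		if p.last[id] != state {
			status := p.runtime.Inspect(id)
			p.cache.Set(status.PodID, status, timestamp)
			p.events <- Event{PodID: status.PodID, State: state}
		}
	}
	p.last = current
}
```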

Kubelet has a per-pod goroutine (aka a pod worker) checking a pod periodically. Every time a worker syncs a pod, it gets the current status from the pod status cache to avoid hitting docker directly. Before the next sync can start, the worker will need to wait until all the side-effects (i.e., events) from the previous sync have been observed by PLEG and the in-memory cache has been updated accordingly. This is important because otherwise the worker may create the same container again! To enforce this rule, PLEG records a timestamp on each relisting, to indicate how fresh the cache is. The worker will block until the cache is newer than its last sync.
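In code, the blocking looks roughly like this (continuing the hypothetical types from the sketch above; the real kubelet cache exposes a similar blocking lookup, though the details differ):

```go
// Worker-side freshness check (sketch; hypothetical types).
type freshCache interface {
	// GetNewerThan blocks until the cached status for podID carries a
	// relist timestamp newer than minTime, then returns it.
	GetNewerThan(podID string, minTime time.Time) PodStatus
}

type podWorker struct {
	podID        string
	cache        freshCache
	lastSyncTime time.Time
}

func (w *podWorker) syncPod() {
	// Wait until PLEG has observed all side effects of the previous sync;
	// acting on a stale cache could, e.g., create the same container twice.
	status := w.cache.GetNewerThan(w.podID, w.lastSyncTime)
	w.lastSyncTime = time.Now()
	_ = status // reconcile the pod against `status` here
}
```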

If we want to switch to the docker event stream today, we'll need to:

  1. add a docker-specific PLEG.
  2. handle the worker syncing problem.

(2) is required because if kubelet doesn't receive any events over a period of time, it can't tell whether there were truly no events or the events were simply delayed. This is not a problem for docker ps -a, where each relist carries an absolute timestamp showing how fresh the cache is. The issue can be handled by recording expected events, which requires modifying a non-trivial amount of kubelet code.
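To illustrate, here is a sketch of what a docker-specific PLEG loop might look like, assuming the Docker Go client's Events API (the option and message types vary across client versions). Note how silence on the channel is ambiguous, so an infrequent relist is still needed to re-stamp cache freshness:

```go
// Sketch of an event-driven PLEG loop with a relist fallback.
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		panic(err)
	}

	msgs, errs := cli.Events(context.Background(), types.EventsOptions{})
	relist := time.NewTicker(30 * time.Second) // infrequent fallback relist

	for {
		select {
		case m := <-msgs:
			// React immediately: latency is bounded by event delivery,
			// not by the relist period.
			fmt.Printf("event: %s %s %s\n", m.Type, m.Action, m.Actor.ID)
		case err := <-errs:
			fmt.Println("event stream error; must relist to resync:", err)
		case <-relist.C:
			// No events does not prove no changes: the stream may be
			// delayed or broken. Only a full relist yields an absolute
			// timestamp that proves how fresh the cache is.
			fmt.Println("periodic relist to re-stamp cache freshness")
		}
	}
}
```

Recording expected events would replace the coarse relist fallback with per-sync bookkeeping, which is where the non-trivial kubelet changes come in.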

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 26, 2017
@yujuhong yujuhong added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 27, 2017
@yujuhong
Contributor Author

This issue can be adapted to evaluate whether we need to support an event stream in CRI. It is very low priority right now unless a clear need can be identified.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 27, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 26, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
