
Evaluate the benefits of adopting docker event stream #16831

Closed
yujuhong opened this issue Nov 5, 2015 · 13 comments
Labels
lifecycle/rotten · priority/backlog · sig/node · sig/scalability

Comments

@yujuhong
Contributor

yujuhong commented Nov 5, 2015

In #12540, the docker event stream was proposed as part of the pod lifecycle event generator (PLEG) to reduce the number of docker operations.

However, even without adopting the docker event stream, we can still implement a PLEG solely by periodically relisting the containers. The generic PLEG in #13571 is an example.
This would still improve the average resource usage of kubelet and docker, because only one goroutine queries docker at a higher frequency (as opposed to all pod workers). The drawback, on the other hand, is that kubelet's reaction time to container events would be equal to or greater than the relist period. We should try to understand the limits of pure relisting by running some micro-benchmarks against docker, so that we know how much benefit the container event stream can bring us (a rough benchmark sketch follows the lists below).

Metrics:

  • latency
  • cpu usage percentiles

Parameters to vary:

  • op: list, inspect
  • period
  • number of containers
  • GCE machine size
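As a starting point, here is a rough sketch of such a micro-benchmark, assuming the Docker Go client (github.com/docker/docker/client); the option types have moved between packages across client versions, and the iteration count and period are placeholders for the parameters listed above:

```go
// Rough relist-latency micro-benchmark (a sketch, not a definitive harness).
// Measures the cost of the `docker ps -a` equivalent that a relist-based
// PLEG would pay every period. Run against daemons with varying numbers of
// containers, and on different GCE machine sizes, to fill in the matrix above.
package main

import (
	"context"
	"fmt"
	"sort"
	"time"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		panic(err)
	}

	const iterations = 100    // number of samples
	period := 1 * time.Second // the relist period; vary this
	latencies := make([]time.Duration, 0, iterations)

	for i := 0; i < iterations; i++ {
		start := time.Now()
		// The "list" op; swap in ContainerInspect to benchmark "inspect".
		if _, err := cli.ContainerList(context.Background(), types.ContainerListOptions{All: true}); err != nil {
			panic(err)
		}
		latencies = append(latencies, time.Since(start))
		time.Sleep(period)
	}

	sort.Slice(latencies, func(i, j int) bool { return latencies[i] < latencies[j] })
	fmt.Printf("p50=%v p90=%v p99=%v\n",
		latencies[iterations*50/100], latencies[iterations*90/100], latencies[iterations*99/100])
}
```

CPU usage percentiles for the docker daemon would be sampled externally (e.g. from cgroup stats) while this runs.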

The next step would be benchmarking the docker event stream.

/cc @kubernetes/goog-node

@yujuhong yujuhong added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Nov 5, 2015
@Random-Liu
Member

OK, will do! This is indeed necessary for future work. :)

@dchen1107 dchen1107 added this to the v1.2-candidate milestone Nov 5, 2015
@vishh
Contributor

vishh commented Nov 5, 2015

+1 for starting with the Generic PLEG. One thing to consider is the resource overhead of periodic relisting on the docker daemon.


@yujuhong
Contributor Author

Now that we've added both a generic PLEG and a cache, we should re-evaluate the benefit of adopting the docker event stream and whether we should prioritize it.


@MHBauer
Contributor

MHBauer commented May 17, 2016

I am familiar with the docker side of events. With a little guidance I can help do the work in the kube code.

@timothysc
Member

/cc @rrati fyi to keep on the radar.

@Random-Liu
Member

Random-Liu commented May 19, 2016

@MHBauer We are redefining the container runtime interface now (#22964).
At least for now, events are not in our short-term plan, because not all runtimes support an event stream yet (e.g. rkt). And according to our measurements, the event stream does improve performance, but the current polling-based implementation is enough for now.

/cc @yujuhong

@yujuhong
Contributor Author

@MHBauer that's great.

As @Random-Liu pointed out, the event stream would improve performance and resource usage for kubelet, but since 1) not all runtimes support this and 2) the benefits weren't significant enough, we didn't prioritize switching to the event stream.

The problem is a little bit more complicated than simply interfacing with the docker event stream. If you are interested, here is some background.

Kubelet has a Pod Lifecycle Event Generator (PLEG) that is compatible with all runtimes. PLEG basically does two things:

  1. It periodically polls docker (via docker ps -a) and compares the current container list with the last one. If there is any change, it generates an event.
  2. In addition to generating an event, it also issues docker inspect calls and updates an in-memory pod status cache (a rough sketch follows this list).
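Roughly, the relist looks like the following sketch. The Runtime, Cache, and related types are hypothetical stand-ins for illustration, not the actual kubelet interfaces:

```go
// Sketch of a generic, relist-based PLEG (hypothetical types throughout).
package pleg

import "time"

type ContainerState string

type PodStatus struct{ PodID string }

type Runtime interface {
	ListAll() map[string]ContainerState // one `docker ps -a` covering all pods
	Inspect(id string) PodStatus        // `docker inspect <container>`
}

type Cache interface {
	Set(podID string, status PodStatus, timestamp time.Time)
}

type Event struct {
	PodID string
	State ContainerState
}

type pleg struct {
	runtime Runtime
	cache   Cache
	last    map[string]ContainerState // container ID -> state at the last relist
	events  chan Event
}

// relist does both duties: diff against the previous relist and emit
// events (1), then inspect changed containers and refresh the cache (2).
// Handling of deleted containers is omitted for brevity.
func (p *pleg) relist() {
	timestamp := time.Now() // stamps cache freshness; the pod workers rely on this
	current := p.runtime.ListAll()

	for id, state := range current {
		if p.last[id] != state {
			status := p.runtime.Inspect(id)
			p.cache.Set(status.PodID, status, timestamp)
			p.events <- Event{PodID: status.PodID, State: state}
		}
	}
	p.last = current
}
```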

Kubelet has a per-pod goroutine (aka a pod worker) checking a pod periodically. Every time a worker syncs a pod, it gets the current status from the pod status cache to avoid hitting docker directly. Before the next sync can start, the worker will need to wait until all the side-effects (i.e., events) from the previous sync have been observed by PLEG and the in-memory cache has been updated accordingly. This is important because otherwise the worker may create the same container again! To enforce this rule, PLEG records a timestamp on each relisting, to indicate how fresh the cache is. The worker will block until the cache is newer than its last sync.
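In code, the blocking looks roughly like this (continuing the hypothetical types from the sketch above; the real kubelet cache exposes a similar blocking lookup, though the details differ):

```go
// Worker-side freshness check (sketch; hypothetical types).
type freshCache interface {
	// GetNewerThan blocks until the cached status for podID carries a
	// relist timestamp newer than minTime, then returns it.
	GetNewerThan(podID string, minTime time.Time) PodStatus
}

type podWorker struct {
	podID        string
	cache        freshCache
	lastSyncTime time.Time
}

func (w *podWorker) syncPod() {
	// Wait until PLEG has observed all side effects of the previous sync;
	// acting on a stale cache could, e.g., create the same container twice.
	status := w.cache.GetNewerThan(w.podID, w.lastSyncTime)
	w.lastSyncTime = time.Now()
	_ = status // reconcile the pod against `status` here
}
```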

If we want to switch to the docker event stream today, we'll need to:

  1. add a docker-specific PLEG.
  2. handle the worker syncing problem.

(2) is required because if kubelet doesn't receive any events over a period of time, it can't tell whether there were truly no events or the events were simply delayed. This is not a problem for docker ps -a, where each relist carries an absolute timestamp showing how fresh the cache is. The issue can be handled by recording expected events, which requires modifying a non-trivial amount of kubelet code.
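To illustrate, here is a sketch of what a docker-specific PLEG loop might look like, assuming the Docker Go client's Events API (the option and message types vary across client versions). Note how silence on the channel is ambiguous, so an infrequent relist is still needed to re-stamp cache freshness:

```go
// Sketch of an event-driven PLEG loop with a relist fallback.
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		panic(err)
	}

	msgs, errs := cli.Events(context.Background(), types.EventsOptions{})
	relist := time.NewTicker(30 * time.Second) // infrequent fallback relist

	for {
		select {
		case m := <-msgs:
			// React immediately: latency is bounded by event delivery,
			// not by the relist period.
			fmt.Printf("event: %s %s %s\n", m.Type, m.Action, m.Actor.ID)
		case err := <-errs:
			fmt.Println("event stream error; must relist to resync:", err)
		case <-relist.C:
			// No events does not prove no changes: the stream may be
			// delayed or broken. Only a full relist yields an absolute
			// timestamp that proves how fresh the cache is.
			fmt.Println("periodic relist to re-stamp cache freshness")
		}
	}
}
```

Recording expected events would replace the coarse relist fallback with per-sync bookkeeping, which is where the non-trivial kubelet changes come in.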

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 26, 2017
@yujuhong yujuhong added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 27, 2017
@yujuhong
Contributor Author

This issue can be adapted to evaluate whether we need to support an event stream in CRI. It is very low priority right now unless a clear need can be identified.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 27, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 26, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
