Should we use Docker events? #8756

Closed
justinsb opened this issue May 24, 2015 · 13 comments
Labels: priority/backlog, sig/node

Comments

@justinsb
Member

Docker events give us a streaming HTTP feed of events that happen on the Docker daemon:
http://gliderlabs.com/blog/2015/04/14/docker-events-explained/

I think we're currently just polling the Docker API.

If we used events, I think we could

  1. Reduce the delay between a container stopping and that stop being reported to k8s. (This is how I noticed this.)
  2. Poll Docker less often, thereby saving a lot of JSON work talking to Docker.
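
As a rough illustration of the event-driven approach, the sketch below consumes a stream of event records and picks out container stops. It assumes one JSON object per line carrying `status`, `id`, `from` and `time` fields, as described in the linked post. The function names and the sample data are hypothetical, and the in-memory stream stands in for the HTTP response body from the daemon's events endpoint.

```python
import io
import json
from typing import Iterator, List, TextIO


def iter_events(stream: TextIO) -> Iterator[dict]:
    """Yield one decoded event per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # skip blank lines between events
        yield json.loads(line)


def stopped_container_ids(stream: TextIO) -> List[str]:
    """Return the IDs of containers that reported a 'die' or 'stop' status."""
    return [
        event["id"]
        for event in iter_events(stream)
        if event.get("status") in ("die", "stop")
    ]


# Hypothetical sample data; a real consumer would read the HTTP stream instead.
sample = io.StringIO(
    '{"status": "start", "id": "abc123", "from": "busybox", "time": 1432425600}\n'
    '{"status": "die", "id": "abc123", "from": "busybox", "time": 1432425601}\n'
)
stopped = stopped_container_ids(sample)
```

With a consumer like this, a stop is seen as soon as the daemon emits the record rather than at the next poll.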
@justinsb
Member Author

(What I observed was that e.g. the e2e test for tmpfs creates a one-off, short-lived Docker container. That runs to completion within a second or two. But we have to wait ~15 seconds for kubelet to notice that it has finished, and post the completion status into k8s)
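
To make the arithmetic behind that wait concrete, here is a small sketch of detection delay under pure polling. The 15-second interval is an assumption chosen to match the observation above, not a confirmed kubelet setting, and `detection_time` is a hypothetical helper.

```python
import math


def detection_time(stop_time: float, poll_interval: float) -> float:
    """Time of the first poll tick at or after stop_time.

    Assumes polls happen at 0, T, 2T, ... where T is poll_interval.
    """
    return math.ceil(stop_time / poll_interval) * poll_interval


# A container that exits 2 seconds in is not noticed until the 15-second tick.
delay = detection_time(2.0, 15.0) - 2.0
```

In this model the worst-case delay approaches one full polling interval, regardless of how short-lived the container is.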

@justinsb
Member Author

The reason I think this is important is that it makes k8s look bad; I thought that k8s was slow to start up a container, but in fact k8s was just slow to tell me the container had started. But I guess this also slows down things dependent on the API, for example setting up a load-balancer to target those containers.

One maybe-for-V1 option would be to do a targeted poll on certain "important" events (container stop/start).

@lavalamp added the priority/backlog and sig/node labels May 24, 2015
@lavalamp added this to the v1.0-candidate milestone May 24, 2015
@lavalamp
Member

v1.0 candidate because our goal is < 5 seconds between creation & notify running (assuming images already pulled).

@lavalamp
Member

@jimmidyson
Member

Also see #8620: the docker daemon gets hammered with requests even on small installations. I assume moving to docker events will reduce resource consumption too.

@wojtek-t
Member

Thanks for reporting it. I think that generally what you suggested is a good approach. However, my feeling is that it's (hopefully) not needed to meet our v1.0 goals (i.e. 5 seconds is long enough).

@davidopp
Member

+1 to what @wojtek-t said. We should only do this now if it's necessary to meet our 1.0 performance goals.

@goltermann modified the milestones: v1.0-post, v1.0-candidate May 26, 2015
@yujuhong
Contributor

In general, I agree that using docker events would reduce kubelet's pod-management overhead quite a bit, as it currently relies heavily on listing and inspecting containers, either periodically or when triggered by updates.

> I thought that k8s was slow to start up a container, but in fact k8s was just slow to tell me the container had started

I think there are two different things here. Reporting that a container has terminated depends on how often kubelet polls, so using docker events would help. Starting containers is a different story, though. Kubelet watches for pod spec changes, so it should react promptly (i.e., by triggering a pod sync right away). At the end of each sync, kubelet generates the new pod status and sends the update. The polling period is not on the critical path here. However, if you have many pods, kubelet may send the pod status updates to the apiserver slowly over time because of the QPS limit.

For the v1.0 goals, which focus on reporting "running" pods, kubelet should be fine.
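
The QPS-limit effect mentioned above can be sketched with a simple token-bucket model. This is an assumption about how the limit behaves, not a description of kubelet's actual client code, and the numbers and the function name are hypothetical.

```python
def seconds_until_last_update(num_updates: int, qps: float, burst: int) -> float:
    """Estimate when the last of num_updates queued status updates is sent.

    Token-bucket model: the first `burst` updates go out immediately,
    and the rest are released at `qps` updates per second.
    """
    if num_updates <= burst:
        return 0.0
    return (num_updates - burst) / qps


# Hypothetical node: 100 pending status updates, 5 QPS, burst of 10.
wait = seconds_until_last_update(100, 5.0, 10)
```

Under this model the delay grows linearly with the number of pods, independent of the polling period.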

@bgrant0607 removed this from the v1.0-post milestone Jul 24, 2015
@fotinakis

In addition to the delays and the 4-5% min CPU usage [link], polling also seems to add a tremendous amount of data to the host docker daemon log /var/log/docker.log and, scarily, consume an equivalent amount of memory.

I recently truncated docker.log after it grew to 3.7GB and that surprisingly also freed 4GB of memory (!). I'm not sure what process is holding on to this data, whether it's the docker daemon itself, kubelet, or a logging daemon, but I can help track it down if needed. Would love any guesses if someone knows why that happens. (This is on GCE container-vms, which may not be rotating that log file—discussion here).

Docker events 👍 polling 👎

@jimmidyson
Member

@fotinakis Re: the memory problem. There is a bug in Docker 1.6 - see moby/moby#9139. Fixed in >= 1.7. Not sure if GCE container-vms run vanilla Docker or a custom build with backported fixes though.

@fotinakis

Cool, thanks @jimmidyson. I think container-vms is still running 1.6 so that makes sense.

@yujuhong
Contributor

We've changed the way kubelet polls docker significantly for v1.2, which should help with the delay and the CPU usage. #16831 was created to evaluate the benefits of adopting the docker event stream, and whether we should prioritize it.

@yujuhong
Contributor

yujuhong commented Jun 2, 2017

Closing in favor of #16831

@yujuhong closed this as completed Jun 2, 2017
9 participants