
Kubelet taking large amounts of cpu in v0.19.3 #10451

Closed
robin-anil opened this issue Jun 27, 2015 · 48 comments · Fixed by #15612
Comments

@robin-anil

Ever since we spun up a new cluster, the CPU has been slowly climbing. The cloud monitoring console shows this climb.

This shows the CPU usage increase over the course of 2 days.
screen shot 2015-06-27 at 11 49 53 am

After SSHing into the node, I see that kubelet is taking most of the CPU.

 PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND                                                                                                                                                                                                                                                         
 3002 root      20   0  177m  85m  20m S  20.3  5.0 315:19.21 kubelet                                                                                                                                                                                                                                                         
 3765 root      20   0  938m  81m  12m S   9.7  4.8  53:15.97 docker                                                                                                                                                                                                                                                          
 9796 root      20   0  885m  96m 9476 S   1.0  5.6   1:11.19 nodejs                                                                                                                                                                                                                                                          
 3046 root      20   0 80148  16m  11m S   0.3  0.9   5:59.90 kube-proxy           

Let me know if you would like traces to investigate this. I can easily restart the kubelet to solve this issue; however, I will wait to get you some traces before I do so.

@roberthbailey added the kind/support and sig/node labels on Jun 27, 2015
@roberthbailey
Contributor

@dchen1107 what traces should @r-tock collect before restarting the kubelet?

@robin-anil
Author

Is there a /profilez handler exposed in the kubelet binaries? I have gperftools installed on the nodes, so I could generate a pprof output if there is such a thing.

Here is what cAdvisor shows: a repeated, spiky CPU usage pattern.
cadvisor kubelet

@robin-anil
Author

After snooping around a bit more, I suspect the stats pushing is the CPU gulper. Using sysdig on one of the nodes, I see that the core-utilization spikes (10 seconds apart) coincide with the stats pushes.

sysdig -s128 -A -c stderr proc.name=kubelet
I0628 00:05:46.443448    3002 server.go:593] GET /stats/default/kube-dns-v3-aycfq/d27effd5-1b6e-11e5-9d9e-42010af09358/etcd: (12
I0628 00:05:46.473826    3002 server.go:593] GET /stats/default/home-page-zjcij/70e5621e-1b71-11e5-9d9e-42010af09358/home-page: 
I0628 00:05:46.545352    3002 server.go:593] GET /stats/default/fluentd-cloud-logging-gke-home-page-2015-06-25-b2dabe4a-node-q9v
I0628 00:05:46.688132    3002 server.go:593] GET /stats/default/kube-dns-v3-aycfq/d27effd5-1b6e-11e5-9d9e-42010af09358/kube2sky:
I0628 00:05:46.767916    3002 server.go:593] GET /stats/default/kube-dns-v3-aycfq/d27effd5-1b6e-11e5-9d9e-42010af09358/skydns: (
I0628 00:05:46.872797    3002 server.go:593] POST /stats/container/: (393.873037ms) 0 [[Go 1.1 package http] 10.228.1.3:43979]

I0628 00:05:57.045120    3002 server.go:593] GET /stats/default/home-page-zjcij/70e5621e-1b71-11e5-9d9e-42010af09358/home-page: 
I0628 00:05:57.094343    3002 server.go:593] GET /stats/default/kube-dns-v3-aycfq/d27effd5-1b6e-11e5-9d9e-42010af09358/etcd: (62
I0628 00:05:57.094838    3002 server.go:593] GET /stats/default/fluentd-cloud-logging-gke-home-page-2015-06-25-b2dabe4a-node-q9v
I0628 00:05:57.220728    3002 server.go:593] GET /stats/default/kube-dns-v3-aycfq/d27effd5-1b6e-11e5-9d9e-42010af09358/kube2sky:
I0628 00:05:57.273426    3002 server.go:593] GET /stats/default/kube-dns-v3-aycfq/d27effd5-1b6e-11e5-9d9e-42010af09358/skydns: (
I0628 00:05:57.376991    3002 server.go:593] POST /stats/container/: (256.758611ms) 0 [[Go 1.1 package http] 10.228.1.3:43991]

I0628 00:06:07.700662    3002 server.go:593] GET /stats/default/home-page-zjcij/70e5621e-1b71-11e5-9d9e-42010af09358/home-page: 
I0628 00:06:07.701156    3002 server.go:593] GET /stats/default/kube-dns-v3-aycfq/d27effd5-1b6e-11e5-9d9e-42010af09358/etcd: (34
I0628 00:06:07.724733    3002 server.go:593] GET /stats/default/fluentd-cloud-logging-gke-home-page-2015-06-25-b2dabe4a-node-q9v
I0628 00:06:07.864174    3002 server.go:593] GET /stats/default/kube-dns-v3-aycfq/d27effd5-1b6e-11e5-9d9e-42010af09358/kube2sky:
I0628 00:06:07.936568    3002 server.go:593] GET /stats/default/kube-dns-v3-aycfq/d27effd5-1b6e-11e5-9d9e-42010af09358/skydns: (
I0628 00:06:07.994559    3002 server.go:593] POST /stats/container/: (307.295863ms) 0 [[Go 1.1 package http] 10.228.1.3:44005]

I0628 00:06:18.159048    3002 server.go:593] GET /stats/default/home-page-zjcij/70e5621e-1b71-11e5-9d9e-42010af09358/home-page: 
I0628 00:06:18.200784    3002 server.go:593] GET /stats/default/fluentd-cloud-logging-gke-home-page-2015-06-25-b2dabe4a-node-q9v
I0628 00:06:18.217562    3002 server.go:593] GET /stats/default/kube-dns-v3-aycfq/d27effd5-1b6e-11e5-9d9e-42010af09358/etcd: (57
I0628 00:06:18.340630    3002 server.go:593] GET /stats/default/kube-dns-v3-aycfq/d27effd5-1b6e-11e5-9d9e-42010af09358/kube2sky:
I0628 00:06:18.408420    3002 server.go:593] GET /stats/default/kube-dns-v3-aycfq/d27effd5-1b6e-11e5-9d9e-42010af09358/skydns: (
I0628 00:06:18.449176    3002 server.go:593] POST /stats/container/: (279.48505ms) 0 [[Go 1.1 package http] 10.228.1.3:44018]

I restarted the kubelet on 4 other nodes and the CPU usage has been coming down. Without looking at the code, this feels like some piece of data is accumulating and getting costlier and costlier to push/process over time as the server continues to accumulate more of it.

screen shot 2015-06-27 at 8 19 08 pm
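For reference, a rough way to gauge the cost of a single stats request from the node itself, without sysdig, might be to time one of the same GETs that heapster issues. This is only a sketch: the pod path is copied from the log above, and port 10255 (the kubelet read-only port) is an assumption on my part, not something I have verified on v0.19.

# Sketch, not verified on v0.19: time one of the stats GETs seen in the
# kubelet log above. Port 10255 (the kubelet read-only port) is an assumption.
time curl -s "http://localhost:10255/stats/default/kube-dns-v3-aycfq/d27effd5-1b6e-11e5-9d9e-42010af09358/skydns" > /dev/null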

@roberthbailey
Contributor

@vmarmol @vishh as well (since this may be related to cAdvisor).

@robin-anil
Author

I still have the one misbehaving node. Let me know how long I should keep it around.

@roberthbailey
Contributor

> is there a /profilez handler exposed in kubelet binaries?

pprof is enabled on port 10250. Try /debug/pprof/profile (see
https://github.com/GoogleCloudPlatform/kubernetes/blob/master/pkg/kubelet/server.go#L153).

@robin-anil
Author

No success using the Go 1.4.2 toolchain:

# go tool pprof http://localhost:10250/debug/pprof/profile
Fetching profile from http://localhost:10250/debug/pprof/profile
Please wait... (30s)
http fetch http://localhost:10250/debug/pprof/profile: Get http://localhost:10250/debug/pprof/profile: malformed HTTP response "\x15\x03\x01\x00\x02\x02"

@robin-anil
Author

It worked on port 10248.

Here is the profile I got:

kubelet1

@roberthbailey
Contributor

Port 10250 is https, not http.

@robin-anil
Author

That gives an error which to me sounds like the server is not returning the full certificate chain...

 go tool pprof https://gke-home-page-2015-06-25-b2dabe4a-node-q9vq:10250/debug/pprof/profile
Fetching profile from https://gke-home-page-2015-06-25-b2dabe4a-node-q9vq:10250/debug/pprof/profile
Please wait... (30s)
http fetch https://gke-home-page-2015-06-25-b2dabe4a-node-q9vq:10250/debug/pprof/profile: Get https://gke-home-page-2015-06-25-b2dabe4a-node-q9vq:10250/debug/pprof/profile: x509: certificate signed by unknown authority

screen shot 2015-06-28 at 12 50 40 am

@roberthbailey
Contributor

Right, it's using a self-signed cert. I'm not sure if it's possible to convince go tool pprof to ignore certificate errors.
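One possible workaround, sketched here without having verified it against this kubelet build: fetch the profile with curl, which can skip TLS verification, and then point go tool pprof at the saved file alongside the kubelet binary (the binary path below is a guess; adjust it for your node image).

# Workaround sketch: curl ignores the self-signed cert with -k, and
# go tool pprof can read a saved profile next to the binary.
# ?seconds=30 is the standard net/http/pprof profiling duration parameter.
curl -k -o /tmp/kubelet.prof "https://localhost:10250/debug/pprof/profile?seconds=30"
go tool pprof /usr/local/bin/kubelet /tmp/kubelet.prof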

@robin-anil
Author

Does the other profile on port 10248 help?

@vmarmol
Contributor

vmarmol commented Jun 28, 2015

@r-tock can you take a profile of the kubelets with and without the issue? Especially the ones you restarted. The CPU usage you mentioned seems to be coming from processing stats requests; I'm curious whether the requests temporarily ceased when you restarted.

How many pods/containers are running on the machine, by the way? 25% usage is not out of line with our expected numbers for a more heavily used machine.

@robin-anil
Author

Here is the same node (which GCE monitoring says now uses 80% CPU).

kubelet2

Most of the other nodes have regressed and are using more CPU now.

screen shot 2015-06-28 at 7 32 33 pm

@robin-anil
Author

Also, all these nodes have been sending data at a sustained rate of 250 KB/s. Maybe it's something with these nodes?

screen shot 2015-06-28 at 8 02 08 pm

@robin-anil
Author

@vmarmol To answer your last question, we run one container on these nodes (apart from the standard GKE containers).

@roberthbailey
Contributor

@r-tock -- I was chatting with @bradfitz last night and he said that it isn't currently possible to use go tool pprof against an https endpoint that uses a self signed certificate. I've filed golang/go#11468 to try and get support added to a future release of go tools.

@robin-anil
Author

@roberthbailey ack.

The top node is currently running hot at 85% CPU usage. Let me know if there is anything I can do to help with this bug.

@dchen1107
Member

This is a known issue: heapster collects /stats too frequently. PR #9809, which was merged for the v0.20 release, should mitigate the issue.

@dchen1107
Member

cc/ @yujuhong since she was measuring kubelet and other daemons' resource usages based on the workload a while back.

@yujuhong
Contributor

> cc/ @yujuhong since she was measuring kubelet and other daemons' resource usages based on the workload a while back.

I took some measurements a while ago, but I took them with heapster disabled. Also, the codebase has changed quite a lot since then. Overall, the CPU usage of kubelet and docker in the steady state (no pod addition/deletion) roughly increases with the number of pods until it hits 30 pods on an n1-standard-4 node (after that, docker became the rate limiter). Any readiness/liveness probe would also increase the kubelet CPU usage.

I just created a cluster with HEAD using two n1-standard-1 nodes and will monitor the CPU usage overnight. So far the kubelet usage (reported by cAdvisor) for the two nodes is 3%-14% (with spikes) and 1.5%-8%, respectively. The difference in CPU usage is caused by the distribution of add-on pods. The usage is still low compared with what was reported above. @r-tock's figure showed that the minimum CPU usage for kubelet is ~15%, excluding the spikes.

@r-tock, here are some questions that I hope will help clarify the situation:

  1. What pods are you running on the node? Could you run kubectl describe node <node_name>?
  2. What machine type do you use for the nodes?
  3. On the node with ~85% cpu usage, what are docker's and kubelet's cpu usage, respectively?

By the way, in addition to the /stats requests, the kubelet also syncs all pods every 10 seconds. The syncing usually causes a spike in CPU usage for both kubelet and docker. kubelet.log does not show the syncing messages if the log level is <4.
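For reference, a sketch of how to surface those sync messages (how the kubelet is launched varies per node image, so this is illustrative only; --v is the standard glog verbosity flag):

# Sketch: run the kubelet with verbosity >= 4 so the per-sync log lines
# mentioned above show up in kubelet.log. Keep the node's existing flags;
# only --v changes here.
kubelet --v=4 <existing flags...>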

@robin-anil
Author

Answers

kubectl describe node gke-home-page-2015-06-25-b2dabe4a-node-q9vq
W0629 22:25:19.358901   55100 request.go:291] field selector: v1beta3 - events - involvedObject.kind - Node: need to check if this is versioned correctly.
W0629 22:25:19.359233   55100 request.go:291] field selector: v1beta3 - events - involvedObject.uid - gke-home-page-2015-06-25-b2dabe4a-node-q9vq: need to check if this is versioned correctly.
W0629 22:25:19.359241   55100 request.go:291] field selector: v1beta3 - events - involvedObject.name - gke-home-page-2015-06-25-b2dabe4a-node-q9vq: need to check if this is versioned correctly.
W0629 22:25:19.359247   55100 request.go:291] field selector: v1beta3 - events - involvedObject.namespace - : need to check if this is versioned correctly.
Name:           gke-home-page-2015-06-25-b2dabe4a-node-q9vq
Labels:         kubernetes.io/hostname=gke-home-page-2015-06-25-b2dabe4a-node-q9vq
CreationTimestamp:  Thu, 25 Jun 2015 14:17:24 -0500
Conditions:
  Type      Status  LastHeartbeatTime           LastTransitionTime          Reason                  Message
  Ready     True    Mon, 29 Jun 2015 22:24:31 -0500     Thu, 25 Jun 2015 14:17:34 -0500     kubelet is posting ready status     
Addresses:  10.240.47.135,107.178.217.212
Capacity:
 cpu:       1
 memory:    1745136Ki
 pods:      100
Version:
 Kernel Version:        3.16.0-0.bpo.4-amd64
 OS Image:          Debian GNU/Linux 7 (wheezy)
 Container Runtime Version: docker://1.6.2
 Kubelet Version:       v0.19.3
 Kube-Proxy Version:        v0.19.3
PodCIDR:            10.228.0.0/24
ExternalID:         3609693451404587675
Pods:               (3 in total)
  fluentd-cloud-logging-gke-home-page-2015-06-25-b2dabe4a-node-q9vq
  home-page-zjcij
  kube-dns-v3-aycfq
No events.
  2. This particular one is a g1-small. The others that had the issue were n1-standard-1.
  3. docker rarely showed up in top, so I am going to attribute most of the CPU to kubelet.

@dchen1107
Member

@r-tock I believe you are running GKE, which means you are on the v0.19 Kubernetes release. For the v0.19 release, it is a known issue that heapster collects /stats too frequently, which contributes largely to the kubelet CPU spikes.

@robin-anil
Author

/ack. What is the ETA on the 0.20.1 release?

@yujuhong
Contributor

Some updates on the kubelet's resource usage after letting the cluster run for >16 hours. As mentioned before, this is a lightly-loaded cluster with primarily add-on pods, plus an extra pause pod per node:

  • A slight increase in cpu usage (2~3%)
  • A slight increase in memory usage (from 4x MB to 5x MB)

The cloud console reports a stable 11% CPU usage for the node, with steady incoming network traffic of ~24x KB.

I don't see any dramatic increase in resource usage, but I will keep monitoring for a while.

@dchen1107
Member

@yujuhong Thanks for the measurements here. Please add more load to the kubelet and measure the CPU usage of kubelet, docker, and kube-proxy from HEAD. Thanks!

Also #10420

@yujuhong
Contributor

FYI, the CPU profile:

test

@brendandburns
Contributor

Please also see:

#10659

@robin-anil
Author

What is the ETA for pushing this to GKE? We still have all our alerts firing and on-call restarting the kubelet twice every day, and we would like to be in a calmer state.

@brendandburns
Contributor

We are in the process of pushing 0.21 to GKE today and tomorrow for new clusters. We will then begin the process of upgrading existing clusters (likely Monday).

If you would like to be one of the initial test upgrades (possibly even earlier than Monday) send me mail at bburns [at] google.com.

Also, in the meantime, a workaround would be to disable the health check on the DNS pod. Basically:

kubectl get rc kube-dns -o json

Edit the json to remove the health check

kubectl update rc kube-dns -f <json-file>
kubectl delete pods kube-dns-<hash-name>

and see if that fixes things.
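For clarity, the piece to delete from that JSON is the livenessProbe block on the DNS containers in the RC's pod template. A sketch of one way to locate it (jq is an assumption on my part and may not be installed; the field path follows the standard pod template layout):

# Sketch: list which containers in the kube-dns RC define a livenessProbe,
# so you know which block to delete before running kubectl update.
kubectl get rc kube-dns -o json | jq '.spec.template.spec.containers[] | {name: .name, livenessProbe: .livenessProbe}'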

@robin-anil
Author

Thanks @brendandburns we will take care of deploying the world (our world) tomorrow as soon as the release hits.

Will try the health check fix

@robin-anil
Author

@brendandburns updating kube-dns by removing the livenessProbe did not reduce the CPU; I had to manually restart the kubelet.

@robin-anil
Author

The CPU usage has been down since the last release, but for g1-small nodes it's hovering around 23%; I would keep the investigation open to reduce that further.

Also, the bandwidth usage from these nodes has been a steady 256 KB/s.

@goltermann removed the kind/support label on Aug 25, 2015
@PhilibertDugas
Contributor

I'm seeing a similar issue here on a bare-metal Ubuntu 14.04 installation.

Kubelet takes high memory/CPU:

image

Memory ~ 4 GB:
image

CPU seems mostly used by the kernel:
image

@thockin
Member

thockin commented Sep 24, 2015

Is this a node that is running DNS? There's a "bug" in older dockers that caused kubelet CPU usage to go up and up over time, and if you're on 0.19 you probably have the bug.


@PhilibertDugas
Contributor

Sorry for not including that info!

It's running 1.0.1 and not running the DNS pod.

The node itself is running 8 pods: 1 is a Redis pod, 1 is fluentd pushing to Elasticsearch, and the other pods are Java applications.

@yujuhong
Contributor

@PhilibertDugas, I'd like to know more about your cluster. Thanks!

  • Was the cluster in a relatively steady state (no constant pod add/delete/update)?
  • Do you have the default monitoring service enabled so that you can track cpu usage of kubelet and docker separately? (e.g. kubectl proxy and then open http://localhost:8001/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana).
  • Do you have multiple nodes or just one node (incl. the master node)?
  • Do the pods have any liveness/readiness check defined? If so, what type are they?

@PhilibertDugas
Contributor

@yujuhong

  • The cluster has no add/delete/update actions on it, only usage of the 8 pods on it
  • There is no monitoring service such as Grafana enabled; is there another way to track kubelet vs. docker usage?
  • This is a cluster containing 1 master node + 6 minion nodes. All of them running on Ubuntu 14.04
  • There is no livenessProbe or readinessProbe defined for our pods

From our monitoring system (external to the cluster): whenever we restart the kubelet, the CPU usage of the whole Ubuntu machine goes down (the huge drops in the graph are kubelet restarts), and then it gradually climbs back up. Disregard the "Pod 4" label, as it's a graph for the whole Ubuntu server :)
kubelet-leak

@yujuhong
Contributor

@PhilibertDugas, thanks for the information and the graph. It'd be great if you could get a CPU profile on a node where the kubelet has high CPU usage.

You can get a 30-second CPU profile by running kubectl proxy and then:
go tool pprof http://localhost:8001/api/v1/proxy/nodes/<node_name>/debug/pprof/profile

@PhilibertDugas
Contributor

@yujuhong

Here is the CPU profile:

First part
kubelet-leak-1

Second Part
kubelet-leak-5

I tried it on another node and it looked completely different! Something looks way off with this one.

@vishh
Contributor

vishh commented Sep 25, 2015

IIUC, it looks like cadvisor's du calls are proving to be expensive.


@yujuhong
Contributor

> I tried it on another node and it looked completely different! Something looks way off with this one.

@PhilibertDugas, thanks a lot for the profile! Do the other nodes exhibit high CPU usage as well?

> IIUC, it looks like cadvisor's du calls are proving to be expensive.

It certainly looks like that's the case here. If cAdvisor hasn't changed much since Kubernetes v1.0.1, this call is checking the aufs directory for the container. @PhilibertDugas, perhaps you can manually check whether du -s is slow in /var/lib/docker/aufs? (e.g., http://serverfault.com/questions/563857/why-is-du-causing-high-cpu-usage)

@PhilibertDugas
Contributor

@yujuhong

The other nodes with different graphs do not exhibit any high CPU usage.

I ran the du -s command as asked:

/var/lib/docker/aufs# time du -s
11992780        .

real    1m2.628s
user    0m1.007s
sys     0m7.561s

It took around 1 minute.

If I run it again,

/var/lib/docker/aufs# time du -s
11992780        .

real    0m2.277s
user    0m0.350s
sys     0m1.869s

It seems much quicker, but that might be due to caching.

@yujuhong
Contributor

@PhilibertDugas, yes, the second attempt is probably faster due to disk caching. cAdvisor runs du -s pretty frequently (every 1s?) for each container though, so if it's taking that long, that would definitely be a problem.

Is there anything I/O-heavy running on the node, or is there any difference in the machine spec for this node?
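If it does turn out to be the du scans, one way to narrow it down might be to time du against each container's aufs diff directory separately. This is just a sketch, with the /var/lib/docker/aufs/diff layout assumed from the default aufs storage driver:

# Sketch: time du on each aufs diff directory to spot a container whose
# filesystem makes the scan slow. Directory layout assumed for docker + aufs.
for d in /var/lib/docker/aufs/diff/*/; do
  start=$(date +%s%N)
  du -s "$d" > /dev/null 2>&1
  echo "$(( ( $(date +%s%N) - start ) / 1000000 )) ms  $d"
done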

@aanm
Contributor

aanm commented Sep 28, 2015

I'm also seeing kubelet with high CPU usage. I'm using the Vagrant VMs, and after a while the VM kills/freezes the SSH connection to the point that I'm unable to reach the VM again.

@PhilibertDugas
Contributor

@yujuhong, from what I see, there is nothing that seems abnormal.

The machine has 8 cores.

I/O is ~15 transfers/s and ~15 writes/s.

I compared the stats with another machine that is running the same configuration (Kubernetes 1.0.1, etc.) which doesn't have high CPU issues, and only 1 metric seemed different.

The context switching for the high-CPU node was ~15K context switches/s, while the other machine with lower CPU usage is hovering around 6-7K context switches/s.

I will try and see if there's anything differing between those two nodes!

@yujuhong
Contributor

@PhilibertDugas, your directory could just be deeper and take longer to scan with du. We discussed the issue today, and @vishh suggested that we increase cAdvisor's checking period, which should alleviate the problem. I've filed a cAdvisor issue for this: google/cadvisor#898
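For anyone hitting this with a standalone cAdvisor, the knobs in question look something like the following (flag names are taken from the standalone cadvisor binary and may not be exposed through the kubelet, so treat this as a sketch rather than a recommendation):

# Sketch: stretch cAdvisor's housekeeping intervals so per-container stat
# collection and du scans run less often. Standalone cadvisor flags only.
cadvisor --housekeeping_interval=10s --global_housekeeping_interval=2m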

@PhilibertDugas
Contributor

@yujuhong Thanks for this!
