Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Identify, document and maybe remove kubelet dependencies #26093

Closed
luxas opened this issue May 23, 2016 · 75 comments
Closed

Identify, document and maybe remove kubelet dependencies #26093

luxas opened this issue May 23, 2016 · 75 comments
Assignees
Labels
area/kubelet area/system-requirement help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/documentation Categorizes issue or PR as related to documentation. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@luxas
Copy link
Member

luxas commented May 23, 2016

Kubelet has a few dependencies, and we should be aware of them, document them so users know what's the minimum amount of other programs kubelet needs around it to execute successfully.

I'm testing this by compiling kubelet statically and running it in a busybox container with absolutely nothing. I've been surprised how well it runs after all.

I'm listing some, both hard and soft dependencies:

  • glibc: hard dependency on the right libraries at runtime. We should get rid of this, as it's unnecessary: Make all Kubernetes binaries statically linked #26028
  • root access: hard dependency and unavoidable. The kubelet has to have full control of the machine (CAP_SYS_ADMIN is one of the privs that are needed, at least when running in a container)
  • cgroups
    • cpu and memory: required and used in the meantime
    • cpuacct and cpuset: required but not used (but @vishh said they will be used in the future)
  • journalctl: bug in cAdvisor, should not be required
  • iptables: Is required for the kubelet it seems: Do not call NewFlannelServer() unless flannel overlay is enabled #26264, although now flannel isn't initialized when not used.
  • ethtool: Required for hairpin code. I do not really understand what hairpin is required/used for... @thockin @bprashanth
  • nsenter: Used as util/io/writer when --containerized. Used by --docker-exec-handler, but not default. Used with socat for port forwarding within a pod.
  • mount, umount and findmnt: Used in NsenterWriter in util/io/writer
  • socat: Used for port forwarding within a pod. code
  • touch: Used by rkt code
  • docker or rkt obviously
  • systemd? Does rkt require a systemd host? @yifan-gu
  • Paths:
    • /:ro it must be able to read from the whole host
    • /var/run/:rw required for communicating with docker, setting up certs etc.
    • /sys/:rw for various cgroup things
    • /var/lib/kubelet:rw and /var/lib/docker:rw it obviously must be able to write to it's own and docker's dir

for kube-proxy:

  • iptables: is of course a hard requirement. But the problem is that (as it seems) iptables has to be linked dynamically, which sets a glibc dep on the kube-proxy image.
  • conntrack: used for limiting requests (I think)

inside client containers:

  • env: Seems like it's required inside a container that Kubernetes is running, but only when using --docker-exec-handler=native code

volume plugins:

  • dmidecode: required for the vsphere plugin
  • glusterfs-server, glusterfs-client are required for the glusterfs plugin
  • git is required for the git plugin

Please correct me if I'm wrong, and comment if there is anything I've missed.
Would appreciate discussion about this (and we should have multi-platform Kubernetes in mind)

cc @vishh @smarterclayton @mikedanese @pmorie

@vishh
Copy link
Contributor

vishh commented May 23, 2016

@pwittrock What is the best way to document these requirements moving forward?

@vishh vishh added team/ux kind/documentation Categorizes issue or PR as related to documentation. labels May 23, 2016
@pmorie
Copy link
Member

pmorie commented May 23, 2016

@luxas Classy digging, and this is super-useful to have. +100

We should decide at what level we want to document transitive deps. For example, cadvisor is going to begin calling thin_ls to monitor containers backed by devicemapper thin pools. Since disk eviction is going to use this, I would want to know about that specific dep, though it is not strictly a dependency of the kubelet itself.

@pwittrock
Copy link
Member

@vishh I think adding a page to the "Administering Clusters" at http://kubernetes.io/docs/ would be the best place

@pwittrock pwittrock added the priority/backlog Higher priority than priority/awaiting-more-evidence. label May 24, 2016
@luxas
Copy link
Member Author

luxas commented May 24, 2016

@pmorie Thanks :) It took some time to dig out.

When we document it, I think we should present them in groups, e.g. always required, required with option --xxx, required when using yyy as the storage driver, and nice-to-haves.

But we should also get rid of some:

k8s-github-robot pushed a commit that referenced this issue May 29, 2016
Automatic merge from submit-queue

Do not call NewFlannelServer() unless flannel overlay is enabled

Ref: #26093 

This makes so kubelet does not warn the user that iptables isn't in PATH, although the user didn't enable the flannel overlay.

@vishh @freehan @bprashanth
@luxas
Copy link
Member Author

luxas commented Jun 3, 2016

Some things I've learned so far (will update the list)

@pmorie
Copy link
Member

pmorie commented Jun 3, 2016

There's a dependency on dmidecode in the vSphere volume plugin.

@luxas
Copy link
Member Author

luxas commented Jun 3, 2016

@pmorie Added. Do you remember any others right now?

@pmorie
Copy link
Member

pmorie commented Jun 3, 2016

@luxas Nope, I just saw that one during a code review.

@timstclair
Copy link

In addition to documenting these, I think it would be useful to check for all of them on kubelet startup. Hard dependencies should log fatal errors, and soft dependencies could log warnings.

@detiber
Copy link
Member

detiber commented Jul 21, 2016

nfs-utils (at least on Fedora/Red Hat/CentOS based systems) is needed by the nfs volume plugin.

@detiber
Copy link
Member

detiber commented Jul 21, 2016

ceph-common (at least on Fedora/Red Hat/CentOS based systems) is needed for the ceph volume plugin

@detiber
Copy link
Member

detiber commented Jul 21, 2016

glusterfs-fuse should be all that is needed for using the gluster storage plugin (at least on Fedora/Red Hat/CentOS based systems).

@detiber
Copy link
Member

detiber commented Jul 21, 2016

iscsi-initiator-utils (Fedora/Red Hat/CentOS based systems) is needed by the iscsi volume plugin

@detiber
Copy link
Member

detiber commented Jul 21, 2016

@timstclair I'm not sure you would want a fatal error if a storage plugin dependency isn't available, I would expect them to just not be loaded (and an obvious event and log message generated) if that is the case.

@luxas
Copy link
Member Author

luxas commented Jul 22, 2016

Thanks @detiber! (And for anyone out there that knows even more deps of the kubelet, comment!)

I think @timstclair meant that all storage drivers are soft deps, so kubelet would log a warning.

@philips
Copy link
Contributor

philips commented Jul 26, 2016

Is this part of NodeSpec? I am still unclear what kubernetes/enhancements#56 is trying to describe.

cc @vishh @dchen1107 @yujuhong @aronchick @aaronlevy

@luxas
Copy link
Member Author

luxas commented Aug 2, 2016

Just noticed that pkg/kubelet/container_bridge.go has a lot of dependencies on binaries in $PATH.

  • brctl
  • ip
  • pkill
  • (service|systemd|docker-babysitter)

@luxas
Copy link
Member Author

luxas commented Aug 3, 2016

Summarizing the PATH requirements for kubelet (oh my!):

  • pidof in pkg/kubelet/cm/container_manager_linux.go
  • uname -n in pkg/util/node/node.go
  • mount and umount in pkg/util/mount/mount_linux.go
  • lsblk, fsck, mkfs in pkg/util/mount/mount_linux.go
  • nsenter in a lot of places!
  • ethtool in pkg/kubelet/network/hairpin/hairpin.go
  • socat in pkg/kubelet/dockertools/docker_manager.go and pkg/kubelet/rkt/rkt.go
  • touch in pkg/kubelet/rkt/rkt.go
  • env (inside the container) in pkg/kubelet/dockertools/exec.go
  • iptables, iptables-restore, iptables-save in pkg/util/iptables/iptables.go and many other places
  • brctl in pkg/kubelet/container_bridge.go
  • pkill in pkg/kubelet/container_bridge.go
  • ip in pkg/kubelet/container_bridge.go, pkg/kubelet/network/plugins.go, pkg/kubelet/rkt/rkt.go
  • modprobe in pkg/kubelet/network/plugins.go
  • tc in pkg/util/bandwidth/linux.go
  • journalctl in vendor/github.com/google/cadvisor/utils/oomparser/oomparser.go, but it's not a hard requirement
  • thin_ls in vendor/github.com/google/cadvisor/devicemapper/thin_ls_client.go for the devicemapper storage driver
  • nice, du, dmsetup in vendor/github.com/google/cadvisor/fs/fs.go, are these important?

Hard requirements (kubelet should fail without these):

  • root access (is there an easy way to check for CAP_SYS_ADMIN?)
  • cpu, cpuset, cpuacct, memory cgroups mounted
  • iptables, ip must be present for kubelet networking
  • nsenter must be present for kubelet container management
  • mount and umount is required for nearly all types of volumes
  • syscall.Mount() is a wrapper for mount(2), but if we strictly need mount(8), that's a hard dependency @kubernetes/sig-storage
  • the br-netfilter module. kubelet should try to enable it on linux >= 3.18 with modprobe, and if it isn't enabled earlier or if modprobe doesn't exist, kubelet should fail (or should it?)
  • write access to /sys, /var/lib/docker and /var/lib/kubelet

I'm not sure if these should be hard or soft deps:

  • socat might be replaced by the gRPC protocol when Exec-ing, but still it's used for port-forwarding

How to remove some dependencies (before v1.4):

  • uname -n should be replaced with os.Hostname
  • brctl should be replaced by calls to the netlink go package
  • pkill with syscall.Kill
  • pidof, if we think it's worth it, we can do this
  • touch can be replaced with os.Chtimes link
  • ethtool used for hairpin, but ip link might be used instead

A dedicated pkg/util/nsenter package should be made, now it's called in ~10 different places in 10 different ways.

I can start to implement a pkg/kubelet/validator package that checks for these dependencies.

WDYT?
@kubernetes/sig-cluster-lifecycle @kubernetes/sig-node @kubernetes/sig-network

@jsafrane
Copy link
Member

jsafrane commented Aug 3, 2016

  • mount and umount is required for nearly all types of volumes
  • syscall.Mount() is a wrapper for mount(2), but if we strictly need mount(8), that's a hard dependency @kubernetes/sig-storage

Not only we need /bin/mount, we need all mount utilities like mount.nfs, mount.ceph and mount.glusterfs. Simple mount(2) syscall is not enough.

@tallclair
Copy link
Member

tallclair commented Jul 9, 2018

I was thinking about this a bit over the weekend, and have an idea I'd like to float before writing a more formal proposal. It's not a complete solution to this problem, but should help:

  1. For every command that is exec'd, introduce a new package for that command (e.g. k8s.io/command/git)
    i. In that package, there is a dedicated object for building out the command arguments, and then exec'ing the command.
    - Commands like git can have sub-commands (e.g. GitClone and GitCheckout)
    - Commands like nsenter can accept other commands to wrap (there would be a standard interface for commands)
    - Commands like mount can have sub-packages (e.g. mount/nfs)
    ii. The command has a setter for each argument, the setter MUST strongly validate the input. This is to address security issues where we're not sure whether the input has already been validated elsewhere in the stack. E.g. func SetPath(path string) error { ... }
  2. Forbid raw usage of the os/exec library in Kubernetes code (once everything is migrated to the command package, or forbid new usage).
    i. Introduce a whitelist mechanism for vendored dependencies that use commands, requiring an audit of the command usage, but generally strongly discouraging dependencies that issue raw commands.

What does this get us?

  • By making the dependencies part of the import path, it becomes much easier to audit their usage.*
  • By centralizing the actual usage of os/exec, it means we can centrally require approval for new dependencies, and require security review as necessary.
  • By centralizing the command code, we can ensure that every usage has fully validated arguments, for security

WDYT? Is it worth pursuing this idea further?

* It even enables things like a 'registration' mechanism, so binaries can report their dependencies and check them at startup (differentiating required & optional deps).

@rphillips
Copy link
Member

@tallclair This proposal is extremely valuable. Being able to audit what tools are shelled out to helps with security, packaging, and maintenance. We could ship an image or package of standard 'shell utilities', be able to upgrade them and audit them for security vulnerabilities when needed.

Couple thoughts:

  1. There should be a mechanism of extracting the utility's version... Semver might be needed if a utility is upgraded past a certain point.
  2. Depending on the calling process logger's verbosity, the stdout and stderr from the utility process should be propagated back to the calling process's logger for easier debugging.

@PatrickLang
Copy link
Contributor

+1 here. This will make it easier to make sure all OS or distro-dependent exec calls are audited and sanitized.

@luxas
Copy link
Member Author

luxas commented Aug 4, 2018

/assign @tallclair
😄

@dims
Copy link
Member

dims commented Feb 16, 2019

cc @andrewrynhard

@invidian
Copy link
Member

I'm not sure if that's the right place to drop that (perhaps new issue should be created?), but for people running kubelet as a container while deploying Kubernetes, it would be super useful to have a list of capabilities, host directories etc where kubelet needs to have access on host in order to function properly.

For my project, I run kubelet as a Docker container. I've tried looking for some documentation on how to do that, but I couldn't find any, so I had to do that simply with trials and errors, starting with simply running kubelet from hyperkube image via Docker, looking into error messages and googling each one for a solution.

I documented it here, even though it's probably incomplete, as I didn't test everything yet.

@neolit123
Copy link
Member

neolit123 commented Jan 22, 2020

I documented it here, even though it's probably incomplete, as I didn't test everything yet.

this type of documentation improvement is nice to have.
you can get feedback on this in the #sig-node channel on k8s slack or if you join their meeting:
https://github.com/kubernetes/community/tree/master/sig-node

@anarkiwi
Copy link

anarkiwi commented Oct 9, 2020

Just a note here - I had my own version of nsenter installed (via pip3 install) on ubuntu20. This caused a continual kubelet crash which took a while to diagnose - kubelet of course wanted nsenter from util-linux.

It sure would be nice if the preflight called out the dependency on the distro version of nsenter.

Oct 09 18:45:59 faucet kubelet[296741]: E1009 18:45:59.180219 296741 pod_workers.go:191] Error syncing pod db24ffc3-ef58-45b8-87f9-6da3dd701ce7 ("coredns-f9fd979d6-7d488_kube-system(db24ffc3-ef58-45b8-87f9-6da3dd701ce7)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "back-off 40s restarting failed container=coredns pod=coredns-f9fd979d6-7d488_kube-system(db24ffc3-ef58-45b8-87f9-6da3dd701ce7)"
Oct 09 18:45:59 faucet kubelet[296741]: W1009 18:45:59.259650 296741 docker_sandbox.go:402] failed to read pod IP from plugin/docker: networkPlugin cni failed on the status hook for pod "coredns-f9fd979d6-mm6kj_kube-system": unexpected command output usage: nsenter [-h] --target PID [--proc PROCFS] [--user] [--net] [--ipc]
Oct 09 18:45:59 faucet kubelet[296741]: [--uts] [--pid] [--mnt] [--all]
Oct 09 18:45:59 faucet kubelet[296741]: [command [command ...]]
Oct 09 18:45:59 faucet kubelet[296741]: nsenter: error: argument --net: ignored explicit argument '/proc/327613/ns/net'
Oct 09 18:45:59 faucet kubelet[296741]: with error: exit status 2
Oct 09 18:45:59 faucet kubelet[296741]: E1009 18:45:59.260055 296741 pod_workers.go:191] Error syncing pod 5b915605-4d4a-4e22-967c-aaeee184f0cd ("coredns-f9fd979d6-mm6kj_kube-system(5b915605-4d4a-4e22-967c-aaeee184f0cd)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "back-off 40s restarting failed container=coredns pod=coredns-f9fd979d6-mm6kj_kube-system(5b915605-4d4a-4e22-967c-aaeee184f0cd)"

@XuanYang-cn
Copy link

/joke

@k8s-ci-robot
Copy link
Contributor

@XuanYang-cn: A termite walks into a bar and asks “Is the bar tender here?”

In response to this:

/joke

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ffromani
Copy link
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Jun 24, 2021
@endocrimes
Copy link
Member

/priority important-longterm

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Jun 24, 2021
@ffromani
Copy link
Contributor

/remove-priority backlog

@k8s-triage-robot
Copy link

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot k8s-ci-robot removed the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Feb 8, 2023
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Feb 8, 2023
@thockin thockin closed this as not planned Won't fix, can't repro, duplicate, stale Feb 9, 2023
@camilamacedo86
Copy link

camilamacedo86 commented Feb 14, 2023

HI @thockin,

Could you please share why did you close this one?
Was it not accepted in the triage? If so, what is the reason?

@thockin
Copy link
Member

thockin commented Mar 9, 2023

Mostly backlog grooming. This is 7 years old and the last meaningful update was 3 years ago. Does it serve a purpose to leave it open? Is someone going to do it? It's a wonderful idea....if someone will act on it, let's reopen?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/kubelet area/system-requirement help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/documentation Categorizes issue or PR as related to documentation. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/node Categorizes an issue or PR as relevant to SIG Node.
Projects
None yet
Development

No branches or pull requests