
Add a client/server implementation of the container runtime #13768

Closed
brendandburns opened this issue Sep 9, 2015 · 31 comments
Labels
area/extensibility priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@brendandburns
Contributor

Currently, any container runtime has to be linked into the kubelet. This makes experimentation difficult, and prevents users from landing an alternate container runtime without landing code in core kubernetes.

To facilitate experimentation and to enable user choice, we should add a client/server implementation of the container runtime interface.

This implementation will simply encode the requests, send them to a server where they will be decoded and sent into an instance of the container runtime interface.

More importantly, this enables container runtime implementations to be built and maintained outside of the core kubernetes tree.
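A minimal sketch of that shape in Go, using JSON over HTTP for concreteness; the RunPodRequest type, the /runPod endpoint, and the method set are illustrative assumptions, not a proposed API:

```go
package remote

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// RunPodRequest is a hypothetical wire type; a real shim would mirror the
// kubelet's Runtime methods one-for-one.
type RunPodRequest struct {
	PodName   string `json:"podName"`
	Namespace string `json:"namespace"`
}

// remoteRuntime is the client half: it encodes each request and sends it to
// the runtime server instead of calling an in-process implementation.
type remoteRuntime struct {
	endpoint string
}

func (r *remoteRuntime) RunPod(req RunPodRequest) error {
	body, err := json.Marshal(req)
	if err != nil {
		return err
	}
	resp, err := http.Post(r.endpoint+"/runPod", "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("runtime server: %s", resp.Status)
	}
	return nil
}

// runtimeServer is the server half: it decodes each request and dispatches
// it to a real runtime implementation maintained outside the core tree.
func runtimeServer(rt interface{ RunPod(RunPodRequest) error }) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/runPod", func(w http.ResponseWriter, r *http.Request) {
		var req RunPodRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		if err := rt.RunPod(req); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
	})
	return mux
}
```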

@dchen1107 @smarterclayton
@kubernetes/goog-node

@smarterclayton
Contributor

@kubernetes/rh-cluster-infra

@brendandburns
Contributor Author

@feiskyer

@dchen1107 dchen1107 added sig/node Categorizes an issue or PR as relevant to SIG Node. help-wanted priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Sep 9, 2015
@bgrant0607
Member

cc @thockin re. consistency and interaction with network and storage plugins

@bgrant0607 bgrant0607 added this to the v1.2-candidate milestone Sep 9, 2015
@vishh
Contributor

vishh commented Sep 9, 2015

We need two kinds of runtime abstractions in kubelet as of now - one for docker which operates at the container level, and another for rocket which operates at the pod level. Are we planning to expose both pod and container runtime plugins?

@yujuhong
Contributor

yujuhong commented Sep 9, 2015

We need two kinds of runtime abstractions in kubelet as of now - one for docker which operates at the container level, and another for rocket which operates at the pod level. Are we planning to expose both pod and container runtime plugins?

I think we need to redefine the abstraction (container runtime interface) if we are going to expose it. We may need to ask rkt to expose a container-level API to work with the kubelet, since most container runtimes operate at that level. That said, the kubelet still relies on the container runtime (docker or rkt) to save pod-level information (e.g., what pod a container belongs to), so a container runtime would still need to be aware of the existence of pods.

Also, I am working on a pod event generator (#12802) to improve the kubelet's performance, so I'd appreciate being kept in the loop on related changes :)
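For context on the pod-awareness mentioned above: the kubelet's docker integration encodes pod identity into container names so that pods can be reconstructed from a flat container list. A simplified sketch; the exact delimiter and field order are illustrative, and it assumes names contain no underscores:

```go
package dockertools

import (
	"fmt"
	"strings"
)

// containerNamePrefix marks containers managed by the kubelet.
const containerNamePrefix = "k8s"

// buildContainerName records pod-level identity on a docker container.
func buildContainerName(podName, podNamespace, podUID, containerName string, attempt int) string {
	return strings.Join([]string{
		containerNamePrefix, containerName, podName, podNamespace, podUID,
		fmt.Sprintf("%d", attempt),
	}, "_")
}

// parseContainerName recovers the pod identity from a container name.
func parseContainerName(name string) (podName, podNamespace, podUID, containerName string, err error) {
	parts := strings.Split(name, "_")
	if len(parts) != 6 || parts[0] != containerNamePrefix {
		return "", "", "", "", fmt.Errorf("not a kubelet-managed container: %q", name)
	}
	return parts[2], parts[3], parts[4], parts[1], nil
}
```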

@feiskyer
Member

I think we need to redefine the abstraction (container runtime interface) if we are going to expose it. We may need to ask rkt to expose a container-level API to work with the kubelet, since most container runtimes operate at that level. That said, the kubelet still relies on the container runtime (docker or rkt) to save pod-level information (e.g., what pod a container belongs to), so a container runtime would still need to be aware of the existence of pods.

I think we should provide both pod-level and container-level APIs to work with the kubelet. As you say, the container runtime must be aware of the existence of pods, so APIs like creating and deleting pods should be pod-level. But other APIs like exec/logs/attach should be container-level, since they operate directly on containers.
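A sketch of that split, with illustrative placeholder types: lifecycle operations stay at the pod level, while introspection and streaming operate on containers.

```go
package runtime

import "io"

// PodRuntime covers pod lifecycle: runtimes like rkt or hyper map to this
// directly; docker-style runtimes implement it on top of containers.
type PodRuntime interface {
	RunPod(pod *PodSpec) (PodID, error)
	KillPod(id PodID) error
	ListPods() ([]PodStatus, error)
}

// ContainerRuntime covers per-container operations such as exec/logs/attach.
type ContainerRuntime interface {
	ExecInContainer(id ContainerID, cmd []string, stdin io.Reader, stdout, stderr io.Writer) error
	GetContainerLogs(id ContainerID, tail int, out io.Writer) error
	AttachContainer(id ContainerID, stdin io.Reader, stdout, stderr io.Writer) error
}

type (
	PodID       string
	ContainerID string
	PodSpec     struct{ Name, Namespace string }
	PodStatus   struct {
		ID         PodID
		Containers []ContainerID
	}
)
```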

@yujuhong
Contributor

The upside of keeping a container-level API is that we can keep the majority of the logic in the kubelet, which would reduce code duplication and maintenance overhead.
I think we can maybe get away with passing the pod UID & name as arguments when querying the runtime, if necessary. I need to think more about it.

/cc @yifan-gu for rkt.

@feiskyer
Member

@yujuhong Rkt and hyper (at #13079) both operate at the pod level; their concepts are consistent with kubernetes' pods. I think it's a good idea to reuse their pod APIs in kubernetes rather than splitting them into container-level ones.

@brendandburns
Contributor Author

We currently have an existing interface, Runtime (https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/container/runtime.go#L53),

that both Docker and RKT implement.

I'm suggesting a network client/server version of that interface.

We can refactor/rewrite to our hearts' content; those refactors will simply carry over to the client/server interface as well.

--brendan
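The point generalizes: because the kubelet depends only on the interface, an in-process runtime and a networked client are interchangeable. A heavily abbreviated, illustrative sketch (not the real interface):

```go
package kubeletiface

type Pod struct{ Name string }

// Runtime stands in for the real interface in pkg/kubelet/container/runtime.go.
type Runtime interface {
	SyncPod(pod *Pod) error
}

type dockerRuntime struct{} // linked into the kubelet

func (d *dockerRuntime) SyncPod(pod *Pod) error { return nil } // elided

type remoteRuntime struct{ endpoint string } // talks to an external server

func (r *remoteRuntime) SyncPod(pod *Pod) error { return nil } // elided

// Compile-time checks: the kubelet's callers cannot tell the two apart, so
// interface refactors carry over to the client/server shim mechanically.
var (
	_ Runtime = (*dockerRuntime)(nil)
	_ Runtime = (*remoteRuntime)(nil)
)
```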

@thockin
Member

thockin commented Sep 10, 2015

+1 to the idea

I don't see why we would keep any container-centric API. The abstraction is pods. If a given runtime has to jump through hoops to implement pods (I'm looking at you, docker), that's their own problem.


@brendandburns
Contributor Author

100% agree; the Runtime interface is pod-based, and that's what we should have people implement.

@yifan-gu
Contributor

+1 for the direction.

The core of the runtime interface is SyncPod(), inside which the container runtime does the reconciliation. This is a bad abstraction, as it hands too much work to the container runtime and results in a lot of code duplication.

Ideally the container runtime should just implement RunPod() and KillPod(), and the kubelet should tell it which action to take during reconciliation (see the sketch below). The trade-off is that we will not be able to update a single container in a pod; that is, if a pod has two containers and one container's spec changes, the other one will be killed as well.
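A sketch of that trade-off, assuming illustrative Pod and RunningPod types: the kubelet owns reconciliation and only asks the runtime to run or kill whole pods, so any spec change recreates the entire pod.

```go
package podsync

type Pod struct {
	UID      string
	SpecHash string // stands in for the full container specs
}

type RunningPod struct {
	UID      string
	SpecHash string
}

// PodRuntime is the narrow, pod-level contract a runtime would implement.
type PodRuntime interface {
	RunPod(desired *Pod) error
	KillPod(uid string) error
	ListPods() ([]RunningPod, error)
}

// reconcile drives the runtime toward the desired state. Note the coarse
// granularity: a change to one container's spec kills the whole pod.
func reconcile(rt PodRuntime, desired map[string]*Pod) error {
	running, err := rt.ListPods()
	if err != nil {
		return err
	}
	seen := map[string]bool{}
	for _, rp := range running {
		seen[rp.UID] = true
		want, ok := desired[rp.UID]
		if ok && want.SpecHash == rp.SpecHash {
			continue // pod is up to date
		}
		if err := rt.KillPod(rp.UID); err != nil {
			return err
		}
		if ok {
			if err := rt.RunPod(want); err != nil {
				return err
			}
		}
	}
	for uid, want := range desired {
		if !seen[uid] {
			if err := rt.RunPod(want); err != nil {
				return err
			}
		}
	}
	return nil
}
```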

@yifan-gu
Contributor

Besides, I would suggest using gRPC (sketched below), because:

  • No encoding/decoding code needs to be written.
  • It is supposed to be faster than JSON (though currently it's not :( ).
  • We get client bindings for free.
  • Server interfaces are generated as well; each runtime only needs to implement them.
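For a sense of what that buys, here is a rough Go sketch of the artifacts protoc would generate from a hypothetical runtime.proto; the service and message names are illustrative, not a real API:

```go
package runtimepb

import (
	"context"

	"google.golang.org/grpc"
)

// Generated message types (fields elided).
type RunPodRequest struct{}
type RunPodResponse struct{}

// Generated client binding: callers get typed RPCs for free, with all
// marshalling handled by the framework.
type RuntimeServiceClient interface {
	RunPod(ctx context.Context, in *RunPodRequest, opts ...grpc.CallOption) (*RunPodResponse, error)
}

// Generated server interface: a runtime only has to implement this and
// register it with a grpc.Server.
type RuntimeServiceServer interface {
	RunPod(ctx context.Context, in *RunPodRequest) (*RunPodResponse, error)
}
```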

@yujuhong
Contributor

We currently have an existing interface, Runtime (https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/container/runtime.go#L53)

That's the interface I was talking about; it was defined in a rush and needs more polish (iteration). I have filed issues about this before, and as @yifan-gu mentioned, even at the pod-level abstraction there are many obvious imperfections.
I agree that a client-server architecture is good in the long run. It's also important that we work towards a good API before committing to it and exposing it for public consumption.

@yujuhong
Contributor

100% agree; the Runtime interface is pod-based, and that's what we should have people implement.

Not that I am against the pod-level abstraction, but this alone is not the best argument IMO. We still have code (container garbage collection, image management, etc.) bypassing the interface to talk to docker directly for container information. We should work out the requirements and fix the interface.

@pmorie
Member

pmorie commented Sep 11, 2015

+1 to this direction

@feiskyer
Member

I notice there are three places that talk directly to the docker client outside of the runtime:

  • container garbage collection
  • image management
  • securitycontext

I suggest moving securitycontext into docker itself, as it is docker-specific.

For container garbage collection and image management, I suggest refactoring them to talk to the Runtime interface instead (see the sketch below).
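A sketch of that refactor with illustrative method shapes: image management and container garbage collection become part of the Runtime interface, so no component reaches for the docker client directly. The GCPolicy fields mirror the kind of knobs the kubelet's container GC uses, but the names here are assumptions.

```go
package runtimeapi

import "time"

type Image struct {
	ID   string
	Tags []string
	Size int64
}

// GCPolicy is an illustrative stand-in for the kubelet's GC knobs.
type GCPolicy struct {
	MinAge             time.Duration // how long a container must be dead before removal
	MaxPerPodContainer int           // dead containers to keep per pod (for logs/debugging)
	MaxContainers      int           // global cap on dead containers
}

type Runtime interface {
	// ... existing pod lifecycle methods elided ...

	// Image management goes through the runtime rather than the docker client.
	PullImage(image string) error
	ListImages() ([]Image, error)
	RemoveImage(image string) error

	// The runtime owns cleanup of its own dead containers.
	GarbageCollect(policy GCPolicy) error
}
```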

@thockin
Member

thockin commented Sep 15, 2015

Yu-Ju,

There are certainly leaks in the abstraction, but if you consider something like rkt or hyper, a container interface doesn't make much sense. Everything that is more granular than a pod should be encapsulated behind the pod interface.

The core of the kubelet should only know about pods and abstract containers, and nothing about the details of the runtime. Garbage collection is the runtime's problem.

I understand that this may not be the case right now :)


@yifan-gu
Contributor

@yujuhong I think this is also something we need to consider when implementing the event generator in #13571

@yujuhong
Contributor

Yu-Ju,

There are certainly leaks in the abstraction, but if you consider something like rkt or hyper, a container interface doesn't make much sense. Everything that is more granular than a pod should be encapsulated behind the pod interface.

Hmm...I wasn't actually advocating the container-level abstraction. I brought it up because @dchen1107 mentioned it the other day, and I thought it was worth throwing into the discussion :)

The core of the kubelet should only know about pods and abstract containers, and nothing about the details of the runtime. Garbage collection is the runtime's problem.

I understand that this may not be the case right now :)

Agreed, but at the same time it seems like we want to have tighter control over disk space allocation and garbage collection, so the interface may not be as high-level as we'd like. (@vishh would probably know more about the requirements.)
These are probably indications that we need to put more thought into this, or wait until we have a more mature interface for the client-server architecture.

@yujuhong
Contributor

@yujuhong I think this is also something we need to consider when implementing the event generator in #13571

Agreed. For now, the generic event generator in #13571 uses the runtime interface to support container runtimes that don't provide container/pod event streams. It shouldn't be too hard to adapt it to the new interface, or to the client-server architecture. Container runtimes that do support container event streams can implement their own generator.
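A minimal sketch of such a relist-based generator, with illustrative types: poll the runtime through the interface, diff against the previous snapshot, and emit pod-level events.

```go
package pleg

import "time"

// PodLifecycleEvent is a pod-level event that may carry container data.
type PodLifecycleEvent struct {
	PodUID string
	Type   string // e.g. "ContainerStarted", "ContainerDied"
	Data   any    // e.g. the container ID involved
}

// lister is the slice of the runtime interface the generator needs.
type lister interface {
	ListPodContainers() (map[string][]string, error) // pod UID -> running container IDs
}

// relistLoop polls the runtime and emits events for containers that
// appeared or disappeared since the last relist.
func relistLoop(rt lister, period time.Duration, events chan<- PodLifecycleEvent) {
	old := map[string][]string{}
	for range time.Tick(period) {
		cur, err := rt.ListPodContainers()
		if err != nil {
			continue // retry on the next relist
		}
		for uid, containers := range cur {
			for _, c := range diff(containers, old[uid]) {
				events <- PodLifecycleEvent{PodUID: uid, Type: "ContainerStarted", Data: c}
			}
		}
		for uid, containers := range old {
			for _, c := range diff(containers, cur[uid]) {
				events <- PodLifecycleEvent{PodUID: uid, Type: "ContainerDied", Data: c}
			}
		}
		old = cur
	}
}

// diff returns the elements of a that are not in b.
func diff(a, b []string) []string {
	inB := map[string]bool{}
	for _, x := range b {
		inB[x] = true
	}
	var out []string
	for _, x := range a {
		if !inB[x] {
			out = append(out, x)
		}
	}
	return out
}
```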

@yifan-gu
Contributor

@yujuhong What I meant is that we might need to add pod-level events, and most of the time the kubelet should only manage pod-level events. Container events should be handled by the runtime according to the restart policy. However, I tend to agree that container-level events/interfaces are also necessary if we want to provide some fine-grained introspection (e.g., getting a container's logs, status, etc.). Though I am trying to think about whether we can just include these in the pod-level events/interfaces.

@yujuhong
Contributor

@yifan-gu, the current container events are packaged as pod-level events, even though they include some container information (ContainerStarted, etc.). Having individual runtimes handle all container-level events is a separate question, though. If the kubelet can fire and forget a pod without babysitting it, we may not even need pod-level events (other than maybe sending the pod status to the apiserver) :)

@feiskyer
Member

@brendandburns The image manager has been refactored for the client/server implementation of the container runtime.

@jdef
Contributor

jdef commented Sep 24, 2015

+1 for this. The mesos team is thinking of implementing a runtime for better integration with the mesos-native containerizer.

@smarterclayton
Contributor

I see some significant downsides to client/server, including more hops into the container for exec/attach/portforward scenarios. I do think that we should support a client/server containerizer, but for the foreseeable future I would want my containerizer compiled in if I were shipping a deployment.

Conditionally compiled kubelets provided with a particular host type (mesos, rocket, systemd+docker) have the same benefits without adding an extra hop.
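One way to realize such conditional compilation is Go build constraints; a self-contained sketch where the tag names and types are illustrative, not how the kubelet is actually built:

```go
//go:build runtime_rkt

// runtime_rkt.go, selected with `go build -tags runtime_rkt`; a sibling
// runtime_docker.go would carry the opposite constraint, so each shipped
// kubelet binary contains exactly one compiled-in runtime and no extra hop.
package kubelet

// Runtime and rktRuntime stand in for the real types.
type Runtime interface{ Name() string }

type rktRuntime struct{}

func (r *rktRuntime) Name() string { return "rkt" }

// newRuntime is the single seam the rest of the kubelet calls.
func newRuntime() Runtime { return &rktRuntime{} }
```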

@dchen1107
Member

@smarterclayton The inefficiency caused by a client/server runtime is known. But please note that we don't plan to make all container runtimes use this model, and we do plan to continue supporting docker and rocket as container runtimes as we do today, just with a better API.

The only candidate we have today for this client / server model is hyper:
#13079 (comment)

@resouer
Contributor

resouer commented Oct 13, 2015

@brendandburns I'm interested in whether runc is suitable for this model.

@vishh
Contributor

vishh commented Oct 13, 2015

runc is essentially a thin wrapper around the Linux APIs as of now, so I don't think it will be a suitable candidate for being a container runtime, at least in its initial versions.


@resouer
Contributor

resouer commented Oct 16, 2015

@vishh agreed. Actually, I am keeping a close eye on docker 1.9, which has the same intention.

[update] moby/moby#14212

k8s-github-robot pushed a commit that referenced this issue Jul 12, 2016
Automatic merge from submit-queue

Proposal: client/server container runtime

Ref #25899  #13768 

Proposal for client/server container runtime

CC @brendandburns @dchen1107 @kubernetes/goog-node @kubernetes/sig-node
xingzhou pushed a commit to xingzhou/kubernetes that referenced this issue Dec 15, 2016
@yujuhong
Contributor

yujuhong commented Jun 2, 2017

We have already implemented CRI, and it is used by default in 1.6. Closing.

@yujuhong yujuhong closed this as completed Jun 2, 2017