Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shared PID and UTS namespaces #1615

Closed
bgrant0607 opened this issue Oct 7, 2014 · 108 comments
Closed

Shared PID and UTS namespaces #1615

bgrant0607 opened this issue Oct 7, 2014 · 108 comments
Assignees
Labels
area/isolation area/kubelet priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@bgrant0607
Copy link
Member

Assuming moby/moby#8211 goes in, we should decide whether we want containers within a pod to share IPC namespaces by default (which fits the "logical host" model for pods and keeps the API simple at the cost of slightly reduced isolation), or provide an option (in which case we could wait until users ask for it).

/cc @thockin

@thockin
Copy link
Member

thockin commented Oct 7, 2014

I instinctively feel like sharing IPC might be OK, but what is the
principle behind it? Why not share PID and User namespaces, too?

On Mon, Oct 6, 2014 at 9:11 PM, bgrant0607 notifications@github.com wrote:

Assuming moby/moby#8211 moby/moby#8211
goes in, we should decide whether we want containers within a pod to share
IPC namespaces by default (which fits the "logical host" model for pods and
keeps the API simple at the cost of slightly reduced isolation), or provide
an option (in which case we could wait until users ask for it).

/cc @thockin https://github.com/thockin

Reply to this email directly or view it on GitHub
#1615.

@bgrant0607
Copy link
Member Author

@thockin I agree we should likely treat these cases similarly to the extent possible. The principle is the same one behind shared network namespaces and volumes within a pod -- facilitating communication / interaction within a pod.

@thockin
Copy link
Member

thockin commented Oct 7, 2014

Devil's advocate: Signals have to be within a PID namespace. Aren't
signals a form of communication?

On Mon, Oct 6, 2014 at 10:39 PM, bgrant0607 notifications@github.com
wrote:

@thockin https://github.com/thockin I agree we should likely treat
these cases similarly to the extent possible. The principle is the same one
behind shared network namespaces and volumes within a pod -- facilitating
communication / interaction within a pod.

Reply to this email directly or view it on GitHub
#1615 (comment)
.

@bgrant0607
Copy link
Member Author

@thockin: How is that playing devil's advocate? It sounds like you're agreeing with my previous statement.

@erictune
Copy link
Member

IPC namespaces can contain large objects (e.g. shm segments).
PID, UTS, NS, USER, and probably NET do not.

IPC objects (shms, sems, msqs) can be created by user and can have significant memory usage in typical use cases.
PID, UTS, NS, USER do not for typical use cases.
NET should not be large in "typical" use cases.

IPC objects live beyond the processes in a container, and so would need to be cleaned up in concert with the container or pod lifetime. PID automatically cleans up when process dies.
UTS, USER, and NS don't have user created objects.
NET, I'm not sure if we would give users permission to create objects with lifetime issues.

So, IPC is quite different from the others.

@thockin
Copy link
Member

thockin commented Oct 10, 2014

My point was that if the principle for sharing is "communication", then we
should also be arguing for shared PID namespace in a pod, but we're not.

On Tue, Oct 7, 2014 at 12:06 AM, bgrant0607 notifications@github.com
wrote:

@thockin https://github.com/thockin: How is that playing devil's
advocate? It sounds like you're agreeing with my previous statement.

Reply to this email directly or view it on GitHub
#1615 (comment)
.

@erictune
Copy link
Member

Comparison

IPC ns per pod:

  • charge memory of all IPC objects to pod.
  • delete IPC object only at pod termination, not at single-container restart.
  • user responsible for clearing out ipc object on container restart.
  • shm segments are basically the same as multiple mmaps of a /tmpfs file. if we think /tmpfs as a pod volume makes sense, I guess so does pod-scope IPC ns.

IPC ns per container:

  • charge memory of all IPC objects to container
  • delete IPC objects at pod termination
  • delete IPC objects at container restart? Could be useful. Not clear if this is what users want, but could work. What will docker do?

@thockin
Copy link
Member

thockin commented Oct 10, 2014

Charging doesn't enter the picture. First, there is no concept of
pod-scope resources yet, and second, memory chargin is first-touch always.

It's entirely about visibility and lifetime. I can imagine use-cases for
both modes, but I want to resist the urge to make it an API feature.

On Fri, Oct 10, 2014 at 10:08 AM, erictune notifications@github.com wrote:

Comparison

IPC ns per pod:

  • charge memory of all IPC objects to pod.
  • delete IPC object only at pod termination, not at single-container
    restart.
  • user responsible for clearing out ipc object on container restart.
  • shm segments are basically the same as multiple mmaps of a /tmpfs
    file. if we think /tmpfs as a pod volume makes sense, I guess so does
    pod-scope IPC ns.

IPC ns per container:

  • charge memory of all IPC objects to container
  • delete IPC objects at pod termination
  • delete IPC objects at container restart? Could be useful. Not clear
    if this is what users want, but could work. What will docker do?

Reply to this email directly or view it on GitHub
#1615 (comment)
.

@bgrant0607 bgrant0607 changed the title Shared IPC namespaces Shared namespaces Oct 10, 2014
@bgrant0607
Copy link
Member Author

In case it wasn't clear, IMO, we should opt for shared namespaces within a pod. I agree we should not surface an option in the API.

@bgrant0607
Copy link
Member Author

moby/moby#8211 has been replaced by moby/moby#8835

@bgrant0607
Copy link
Member Author

Shared IPC namespaces went in: moby/moby#9074

@bgrant0607
Copy link
Member Author

PID namespaces haven't gone anywhere AFAICT: moby/moby#8568

@thockin
Copy link
Member

thockin commented Dec 9, 2014

We should do IPC namespaces as soon as there's a new docker build

On Fri, Nov 21, 2014 at 9:09 AM, bgrant0607 notifications@github.com
wrote:

Shared IPC namespaces went in: moby/moby#9074
moby/moby#9074

Reply to this email directly or view it on GitHub
#1615 (comment)
.

@goltermann goltermann added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Dec 17, 2014
@mrunalp
Copy link
Contributor

mrunalp commented Jan 14, 2015

I would be interested in helping drive this forward by submitting PRs. Has the docker build been updated to include the --ipc feature?

@thockin
Copy link
Member

thockin commented Jan 15, 2015

I think nobody is working on this, so have at it. At least GCE uses docker
1.4.1.

I think we have a bug in the docker names we build - if a user specifies a
container named "net" there will be something of a conflict. Once we add
IPC it's not longer "net" any more but "pod". Maybe we need a new field in
the docker name [yay :( longer] to disambiguate

On Wed, Jan 14, 2015 at 10:35 AM, Mrunal Patel notifications@github.com
wrote:

I would be interested in helping drive this forward by submitting PRs. Has
the docker build been updated to include the --ipc feature?

Reply to this email directly or view it on GitHub
#1615 (comment)
.

@mrunalp
Copy link
Contributor

mrunalp commented Jan 20, 2015

@thockin Just starting to work on this. Any suggestions for docker name?

@vishh vishh removed their assignment Jan 20, 2015
@vishh
Copy link
Contributor

vishh commented Jan 20, 2015

@mrunalp: Unassigned myself since you are working on this. Feel free to reach out if I can be of any help!

@thockin
Copy link
Member

thockin commented Jan 20, 2015

Let's not add a new field in the name. Maybe change:

        networkContainerName  = "net"
        NetworkContainerImage = "kubernetes/pause:latest"

to

       PodInfraContainerName  = "POD"  // caps are not allowed by users, but are OK for docker.
       PodInfraContainerImage = "kubernetes/pause:latest"

@mrunalp
Copy link
Contributor

mrunalp commented Jan 20, 2015

@thockin That sounds good to me.
BTW, go-dockerclient doesn't have support for specifying IpcMode, so I will take a detour to go fix that first.

@vishh Thanks! I will ping you if I need any help.

@mrunalp
Copy link
Contributor

mrunalp commented Jan 20, 2015

@mrunalp
Copy link
Contributor

mrunalp commented Jan 21, 2015

I have started making the changes cherry picking the go-dockerclient change while we are waiting for that to get merged. Hopefully will have some basic testing done by end of day today.

@mrunalp
Copy link
Contributor

mrunalp commented Jan 22, 2015

Created a PR for updating go-dockerclient #3726. Next thing that's blocking me is Fedora 20 not having a recent enough docker package, so I can't test my changes on vagrant setup. I am working with Fedora packaging folks to get that fixed.

k8s-github-robot pushed a commit that referenced this issue Feb 21, 2018
Automatic merge from submit-queue (batch tested with PRs 58716, 59977, 59316, 59884, 60117). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add ShareProcessNamespace to PodSpec

**What this PR does / why we need it**: This adds an option to the Kubernetes API to configure sharing a process (PID) namespace between the containers in a pod, as described in the [shared pid proposal](https://github.com/verb/community/blob/master/contributors/design-proposals/node/pod-pid-namespace.md).

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
WIP #1615

**Special notes for your reviewer**: 
Questions for API reviewer:
- [ ] Is the documentation sufficient to describe the new option?
- [x] Does the new field better belong in the security context?
- [x] is prepending "alpha" to json/proto fields the right thing to do?

**Release note**:

```release-note
v1.Pod now has a field to configure whether a single process namespace should be shared between all containers in a pod. This feature is in alpha preview.
```
k8s-github-robot pushed a commit that referenced this issue Feb 22, 2018
Automatic merge from submit-queue (batch tested with PRs 60148, 60022, 59125, 60068, 60154). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Adding support for per-pod process namespace sharing in kubelet

**What this PR does / why we need it**: This enables process namespace sharing between containers in a pod as described in the [Shared PID Namespace](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/pod-pid-namespace.md#container-runtime-interface-changes) proposal but leaves it disconnected pending merge of #58716.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
WIP #1615

**Special notes for your reviewer**: 

**Release note**:

```release-note
NONE
```
k8s-github-robot pushed a commit that referenced this issue Feb 23, 2018
Automatic merge from submit-queue (batch tested with PRs 59463, 59719, 60181, 58283, 59966). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Set shared PID namespace mode based on PodSpec

**What this PR does / why we need it**: This PR enables pod process namespace sharing as an alpha feature, as described in [Shared PID Namespace Proposal](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/pod-pid-namespace.md).

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
WIP #1615

**Special notes for your reviewer**:
/assign @dchen1107 

**Release note**:

```release-note
When the `PodShareProcessNamespace` alpha feature is enabled, setting `pod.Spec.ShareProcessNamespace` to `true` will cause a single process namespace to be shared between all containers in a pod.
```
@verb
Copy link
Contributor

verb commented May 7, 2018

PID namespace sharing was included as an alpha feature in 1.10. It would be nice to coordinate its beta with removing the option to configure it in the kubelet (#41938), which we're planning to do in 1.12. This would leave 2 releases to gather feedback on the alpha.

Does anyone have any concerns they'd like to see addressed before promoting this feature to beta? In particular this is a question for @smarterclayton and @liggitt since it means committing to the API change.

@liggitt
Copy link
Member

liggitt commented May 23, 2018

Does anyone have any concerns they'd like to see addressed before promoting this feature to beta?

I don't have any issue with promotion... I primarily cared about maintaining compatible behavior, which the pod spec field did/does.

@yyb196
Copy link

yyb196 commented Jun 7, 2018

@verb you said that "Docker 1.12.x doesn't support shared pid and it cannot be enabled" but i found that docker-1.12.0 has supported share pid name between containers. refer to moby/moby#22481 . Is there a bug which stop we using it at veraion 1.12.x?
Any feedback will be appreciate.

@verb
Copy link
Contributor

verb commented Jun 7, 2018

@yyb196 yes, have a look at kubernetes/community#207 (comment)

k8s-github-robot pushed a commit that referenced this issue Aug 9, 2018
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Promote ShareProcessNamespace to beta

**What this PR does / why we need it**: The ability to configure PID namespace sharing per-pod was added as an alpha feature in 1.10. This promotes the feature to beta and makes the feature available by default.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
WIP #1615

**Special notes for your reviewer**:
/assign @yujuhong 

**Release note**:

```release-note
The PodShareProcessNamespace feature to configure PID namespace sharing within a pod has been promoted to beta.
```
@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 5, 2018
@verb
Copy link
Contributor

verb commented Sep 6, 2018

Update: ShareProcessNamespace is graduating to beta in 1.12. I'll leave this issue open until GA.

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 6, 2018
@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 5, 2018
@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 4, 2019
@verb
Copy link
Contributor

verb commented Jan 7, 2019

Shared PID namespaces was promoted to beta in the last release. afaik no one is working on configurable UTS namespaces. I guess these issues aren't intended to be long-lived, and I'm tired of fighting the bot, so I'll close this as fixed. Anyone wanting to track the feature further can follow kubernetes/enhancements#495.

/remove-lifecycle rotten
/close

@k8s-ci-robot
Copy link
Contributor

@verb: Closing this issue.

In response to this:

Shared PID namespaces was promoted to beta in the last release. afaik no one is working on configurable UTS namespaces. I guess these issues aren't intended to be long-lived, and I'm tired of fighting the bot, so I'll close this as fixed. Anyone wanting to track the feature further can follow kubernetes/enhancements#495.

/remove-lifecycle rotten
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 7, 2019
stlaz pushed a commit to stlaz/kubernetes that referenced this issue Jun 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/isolation area/kubelet priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/node Categorizes an issue or PR as relevant to SIG Node.
Projects
None yet
Development

No branches or pull requests