Support removing ephemeral container from pod #84764
Comments
/assign @verb |
/sig node
Thanks @shuiqing05 for opening this issue! Is your concern mainly about reclaiming resources once the ephemeral container has exited? That should happen automatically, regardless of whether the container has been removed from the PodSpec. It shouldn't require a manual step. (To be clear, I'm not saying this is current behavior.) Deleting the container would be required if, for example, you wanted to reclaim the resources without the container having exited. |
My use cases for removal closely match @shuiqing05's:
|
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
@dguendisch @shuiqing05 |
For my usecases I wouldn't need duration based policies. My main point would be to be able to manually delete/remove the eph container (after I'm done debugging the problem). Nice to have would be the option/policy of having it automatically removed after the container exited. |
Hi - what's the status of this? I'm really interested in using ephemeral containers for some of my usecases but, like others, I'd like to be able to remove them and reclaim all resources as the pod itself might live for a long time but these e-containers would come and go quite frequently. |
@dguendisch @tedyu Rather than functionality in the kubelet to automatically remove a container based on a policy enforced by the kubelet, I prefer marking a pod as tainted when an ephemeral container has been added (#84353) and leaving it up to the administrator how to handle this situation. I could imagine a controller that implements an automatic removal policy. @duglin It hasn't been designed yet. Right now I'm focused on a debugging MVP. If anyone wants to pick this up, the next step would be to gather requirements and propose an update to the KEP. |
I'm considering how to do this for 1.19 /remove-lifecycle stale |
A lot of use cases have been discussed here, so I thought it would be useful to enumerate them:
Given these, I'm planning on proposing simple deletes without in-place restarts or other updates. |
One concern about allowing ephemeral containers to be removed is that it removes information that could have been used for policy decisions. For this reason it will need to be configurable. |
/remove-lifecycle stale |
/remove-lifecycle stale |
Is anyone working on this issue now? Is there a PR for it? |
@zhhray I have a goal to land this in 1.21. It needs KEP modification. PR is kubernetes/enhancements#1690 |
/priority important-longterm |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /close |
@k8s-triage-robot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/reopen |
@m-yosefpor: You can't reopen an issue/PR unless you authored it or you are a collaborator. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@shuiqing05 can you please reopen this issue? |
The KEP was updated in the v1.23 release cycle kubernetes/enhancements#2892. There are some discussions in #103354 (comment). |
Is it correct to say that, with the current release, I can't use ephemeral containers to run a periodic check in a long-lived pod? |
No, you can run the periodic check. However, each run adds one more entry to the pod's ephemeral container list, so the list would grow very long over time. A Pod can have many ephemeral containers with different names. This issue is about removing exited containers from the pod's ephemeral container list. Once an ephemeral container has been added to a Pod, it cannot be removed. |
Right. My concern is that if ephemeral containers are run many times over a long period, the list will grow long enough for the pod manifest to reach the 1.5MiB etcd limitation. |
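To put rough numbers on the concern above, here is a back-of-the-envelope sketch. Only the 1.5MiB figure comes from this thread; the base pod size, the per-entry size, and the function name `max_ephemeral_containers` are assumptions made up for illustration, and real sizes depend on the container spec and status that get serialized.

```python
# Rough estimate of how many ephemeral container entries fit in a Pod object.
# Only the 1.5 MiB limit is taken from the thread; all other numbers are
# illustrative assumptions, not measurements.
ETCD_LIMIT_BYTES = int(1.5 * 1024 * 1024)


def max_ephemeral_containers(base_pod_bytes: int, bytes_per_entry: int) -> int:
    """Return how many entries fit before the object reaches the limit."""
    if bytes_per_entry <= 0:
        raise ValueError("bytes_per_entry must be positive")
    remaining = ETCD_LIMIT_BYTES - base_pod_bytes
    return max(remaining, 0) // bytes_per_entry


if __name__ == "__main__":
    # Assumed: an 8 KiB pod without ephemeral containers, 1 KiB per entry
    # (spec plus status), and one periodic check per hour.
    entries = max_ephemeral_containers(base_pod_bytes=8 * 1024, bytes_per_entry=1024)
    print(f"entries before limit: {entries}")
    print(f"days at 24 runs/day: {entries / 24:.1f}")
```

Under these assumed sizes an hourly check would hit the limit in roughly two months, which is short compared to the lifetime of a long-lived pod. |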
Using ephemeral containers to run a planned periodic check indefinitely for a long-running pod will monotonically increase the length of the ephemeral containers list, and is not a good use case for this feature. You could include what you need to run the periodic check in one of the normal containers and use something like |
Thanks @liggitt. The problem here is: |
What you're describing sounds more like a health or liveness check, not one-off debugging... including what you need to run that in the normal containers for the app is required. |
Could you please explain why the ability to remove ephemeral containers won't be added? Are ephemeral containers still designed only for one-off debugging? |
The reason remove was not supported was for auditability (losing API-visible information about what has been added to the pod and run) and to avoid race conditions where an ephemeral container with the same name was removed and re-added and appeared to the kubelet to change in-place. |
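The race condition mentioned above can be illustrated with a toy model. This is not kubelet code: the `EphemeralContainer` type and `diff_by_name` function are hypothetical, and the sketch only assumes that an observer compares two snapshots of the list keyed by container name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EphemeralContainer:
    """Hypothetical, simplified stand-in for an ephemeral container entry."""
    name: str
    image: str


def diff_by_name(old, new):
    """Compare two snapshots of the list, identifying containers by name only."""
    old_by_name = {c.name: c for c in old}
    new_by_name = {c.name: c for c in new}
    shared = old_by_name.keys() & new_by_name.keys()
    return {
        "added": sorted(new_by_name.keys() - old_by_name.keys()),
        "removed": sorted(old_by_name.keys() - new_by_name.keys()),
        "changed": sorted(n for n in shared if old_by_name[n] != new_by_name[n]),
    }


if __name__ == "__main__":
    before = [EphemeralContainer("debugger", "busybox:1.36")]
    # Between the two observations, "debugger" was removed and then re-added
    # with a different image. The observer never saw the intermediate state.
    after = [EphemeralContainer("debugger", "alpine:3.19")]
    print(diff_by_name(before, after))
```

With a name-keyed comparison, the remove followed by a re-add is indistinguishable from an in-place change to `debugger`. With an append-only list, a name that has been seen always refers to the same entry. |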
@liggitt Thanks for your info. |
@tonyaw That's correct, we won't be implementing this in the context of kubernetes/enhancements#277. It's possible we'd revisit this in the context of a new KEP, since it's popular, but I wouldn't expect it in the short term. I agree that your use case sounds like a bad fit for ephemeral containers, which are best effort and shouldn't be part of normal pod operation. You could implement this as a health check that doesn't use ICMP (and so doesn't require NET_RAW) but I'd be worried about all of the pods exiting at once if the destination host became unavailable for some reason. |
If this is needed for audit reasons, then why bloat the pod spec and status? Instead, create a companion resource that holds the audit information (created when the first debug/attach is issued). This way you do not have to change the pod spec, and you can control how much audit information is kept, which could also be made configurable. Operators/consumers could also switch auditing on or off for selected resources that they control. Spec is used to describe the desired state and should not be polluted with audit trail information that can keep growing. Auditing is a cross-cutting concern. A desired state should not have ever-growing audit information as part of it. IMHO it simply does not belong there. |
Is there any update on this? Any plans to allow deleting ephemeral containers? |
#124271 |
What would you like to be added:
Ephemeral containers can be removed from a pod gracefully.
Why is this needed:
In our K8s cluster, there are many deployed pods. We aim to use ephemeral containers for troubleshooting, so many ephemeral containers will potentially be created.
Once an ephemeral container is no longer needed, it should be removed gracefully. The container's resources should be released. For security, the container should no longer be accessible. The container should also be removed from the pod.
Currently, there is no way to remove an ephemeral container completely other than destroying the pod that contains it.
However, this approach is really "intrusive" (for example, traffic may be impacted). We need a way to remove an ephemeral container as gracefully as we create it.