Image volumes and container volumes #831
I guess so? But why not just use a git repo?
More plugins more better? I wanted to put it out there, since we do sort …
More potential uses of this: …

A git repo could be used for some of these cases, but for others it would be less than ideal.
If the base image of a docker file (e.g. …) … On the other hand, if it is created using …
How about the VOLUME directives of a container? Any container can declare … On the other hand, people can create data containers "FROM scratch", and …
@thockin I like the initial idea of having a new volume type to support docker's data volume containers, which matches the common-package and shared-package concepts we use internally. I can also see the potential use cases listed by @bgrant0607. But please don't go down the road Docker took: declaring a container as a data volume introduces another level of dependency complexity between containers within a pod, or even between pods if a pod holds only a data volume container. I think your initial idea of a volume type that refers to a docker volume container, or some other read-only volume, is the better approach in the long run.
The interesting thing about docker volumes is that a container does not … I don't think we know what people really want in this space yet, though.
It appears the same net effect as this issue can be achieved without introducing a new volume type, using two containers in a pod and some shell wrapped around the underlying container command lines (see #1589 (comment)). How do we decide on container-as-a-volume vs. command-line-based sequencing?
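For reference, a minimal sketch of that command-line-based sequencing (image names and paths here are hypothetical): the first container copies its baked-in data into a shared emptyDir and drops a marker file, and the second waits for the marker before starting.

```yaml
# Sketch only: my-data-image/my-app-image and their paths are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: shared-data
spec:
  volumes:
  - name: shared
    emptyDir: {}
  containers:
  - name: data
    image: my-data-image:v1   # assumed: data baked in under /data
    command: ["sh", "-c", "cp -r /data/. /shared/ && touch /shared/.ready && sleep infinity"]
    volumeMounts:
    - name: shared
      mountPath: /shared
  - name: app
    image: my-app-image:v1    # assumed consumer with entrypoint /app
    command: ["sh", "-c", "while [ ! -f /shared/.ready ]; do sleep 1; done; exec /app"]
    volumeMounts:
    - name: shared
      mountPath: /shared
      readOnly: true
```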
There are a lot of ways to go about it, and a new volume type is, in my opinion, not needed. I tried to standardize the way we structure data and make different volume providers possible. These range from host volumes, data volume containers, and side containers with additional logic, to Volume as a Service, which is where k8s could integrate greatly. A start is already available via git-as-volume. I think the native volumes in Docker are enough; they just lack a standard. More detailed ideas are available at moby/moby#9277.
I think the question is whether the data volume container should be represented as a container or as a volume. I prefer to think of them as volumes, and find passive containers to be non-intuitive for users, problematic for management systems to deal with, and the source of weird corner-case behaviors.
@bgrant0607 Still, they are supported in docker, and therefore we should acknowledge they exist. I would love to see more integrated methods in k8s itself, which just expose a specific type of volume. I was hinting at that in my proposal via the VaaS approach. But I would dislike it if this approach reduced compatibility.
+1 for supporting a container as a volume. I have a scenario where I have a container with a bunch of data baked into it for use by other containers; it helps keep the data "local" to the work being done.
Whatever you decide, I hope you will make it clear in the documentation, to save people the time of searching around to find this information. Currently, the documentation for both Compute Engine and Container Engine: …
It's important to note that using a git repo isn't the same. It requires the git repo to be securely accessible from Google Cloud (or wherever Kubernetes is being used). Further, it's unclear how non-public repositories would be accessible, unless a username and password is hard-coded into the Kubernetes GitRepo#repository JSON/YAML string. Also, it requires that the desired artifact(s) be checked in to source control. And it decouples the Docker image from the artifact (which may or may not be desirable). I will be working around this issue by moving the data that's in my container volume into a Dockerfile that layers on top of the container that wanted to use the volume, with ADD. The problem you're running into is that the community at large is encouraging the "container as volume" approach in websites and blog posts, and as a result people will continue to have difficulty. For example, the docker website itself says, "If you have some persistent data that you want to share between containers, or want to use from non-persistent containers, it's best to create a named Data Volume Container, and then to mount the data from it." (emphasis mine). Also, @erictune, a container-only volume can be (and probably should be) written as "FROM scratch". I'd argue that if the user doesn't do it that way, that's their choice.
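The workaround described above might look like this (image name and data path are hypothetical): instead of mounting a data-only container, bake the data into a new image layered on top of the consumer.

```dockerfile
# Sketch, with assumed names: layer the former volume-container's data
# onto the image that wanted to mount it.
FROM my-app:latest
ADD data/ /data/
```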
+1 @rehevkor5. I am disappointed to hear that k8s doesn't support data volume containers. I am not sure how I am supposed to abstract r/w data away from the host now. I was under the impression that k8s was about abstracting away the host, but a host volume introduces all sorts of host-specifics, like having to share the same username/group for access to the data. @rehevkor5 I thought the same thing about data volume containers at first (that they should be written as FROM scratch) until I read this (which may or may not be correct): http://container42.com/2014/11/18/data-only-container-madness/ Your workaround seems to do just this?
There are a few things going on here, most importantly (I think) some …

Kubernetes supports the notion of a writable "empty volume" that is shared …

Now the question comes down to "but I don't want an empty directory, I …"

We could support myriad such "fetch once" volume kinds: git, cvs, svn, …

IMO: all of these things can be run as a side-car container in your …

Now, let's think about the case of docker data volume containers. What is …

These are all subtly different semantically, especially in the face of a …

ACTION ITEM: I'd very much like for people who use docker data containers …

Back to side-car containers as volumes. I could imagine something like: Pod { … }

Sadly, there's not much of the container-ness that you can really hide, so …

Alternately, I could imagine just making them extra containers: Pod { … }

It's still pretty verbose. Can we do better? In this sort of model, almost any sort of logic can be bundled up as a …

Something like this is, I think, the way to go. Details to emerge, input …
To follow up to myself - all of this assumes that any data mutations you … Durable data is a MUCH larger topic :)
I'll try to describe a use case for a data container. I'll have a pod with …

The ingest container is responsible for getting messages in some fashion …

The process container does work, but it requires access to data provided by …

The data container has a process responsible for pulling the data that will …

Hopefully that makes some sense. Thanks.
To me, this does not make a strong argument. Everything you describe is …

Now, there IS a gotcha with the initial load of data, but that has to be …

The only other argument is to be exactly docker compatible semantically, …
Using an emptyDir that is available between containers sounds sufficient. After understanding "emptyDir" better, I agree, the use case I provided … Thanks.
I really like the idea of a docker container as a volume from the standpoint that it lets you reuse all the architecture for distributing containers: scalable docker registries, image caching, etc. It also lets you separate concerns. You could do a scalable static website by having a volume container with your web content in it and a main container of nginx. Each pod would then have two containers as a scalable unit, each updatable separately.
@renewooller did you try the flex volume? https://github.com/dims/docker-flexvol |
No, I missed that. Thanks for the pointer, it's very interesting. :)
Hi, I'm adding my use case as an example. docker-flexvol seems to be what I need. Using emptyDir with copy should also work, but it's nasty. We have a few single-page front-end applications: compiled JS+CSS and other resources. The idea is to create a pod with a single nginx container and mount the files from the docker containers as volumes. I imagine that I am able to deploy a new version of one of my applications by changing the volume somehow and then triggering a pod restart or something. So if I have an image called …, and update the pod to contain: …, I expect Kubernetes will do a rolling update to the new version of my app with my-container-image:v2. @dims: Is this use case supported by your plugin? How safe is the plugin, and could we see it upstream to be installed via kubeadm? I think these are questions that should be on the project issues list.
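For reference, a FlexVolume is referenced from the pod spec roughly like this. The `flexVolume` structure is standard Kubernetes API, but the driver name and option key below are assumptions for illustration; they would need to match whatever docker-flexvol actually registers and documents.

```yaml
# Sketch only: "dims/docker-flexvol" and the "image" option key are
# assumptions, not the plugin's confirmed API.
volumes:
- name: app-content
  flexVolume:
    driver: "dims/docker-flexvol"      # assumed driver registration name
    options:
      image: "my-container-image:v2"   # assumed option for the source image
```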
@ieugen i believe yes, you should be able to do that. It's a pretty simple shell script so feel free to try it and let me know if you see issues. As for upstream kubeadm etc. if someone wants to take the initiative, i can help. Thanks, |
Would be interested in getting this into a helm chart somehow... I saw containerized mount utils merged, but they say it doesn't work with FlexVolumes. Maybe something like … with shared mount namespaces?
Alternately, if we could get a statically linked jq, we might be able to just slide the two files directly onto the host...
Or, I guess we could split the difference and just run jq in a container: `docker run -i --rm jq ...`
It's been a month since I started studying how to use kubernetes (properly). It didn't take me long to find pretty much all the use cases mentioned in this issue on my own. In all cases it comes down to wanting to expose static files to more than one process/container. For us that means: …
As I understand, the facilities that would allow this functionality are currently only properly supported by containerd(?) in the form of volumes, but it would be very helpful to have... Some cases could probably be solved by obscuring the copy action needed and ensuring it is successful. Any other (simple) solution that would allow packaging static files as a single artifact and then using it inside of a pod/container, without copying it (with postStart commands) each time, would probably also work for most use cases mentioned in this issue. The thing is, with pipelines to build containers and registries to hold/version them all pretty much figured out, they are an extremely handy vehicle for this (and in my opinion not outside their scope; it helps with the single responsibility principle). Anyway, just my 2 cents.
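The "obscured copy action" can at least be made explicit and checked today with an init container (the data image name and paths are hypothetical): the copy runs to completion before the main container starts, so a failed copy fails the pod visibly.

```yaml
# Sketch: copy static files from a (hypothetical) data image into an
# emptyDir before nginx starts serving them.
spec:
  volumes:
  - name: static
    emptyDir: {}
  initContainers:
  - name: copy-static
    image: my-static-files:v1   # assumed image holding the artifact under /static
    command: ["sh", "-c", "cp -r /static/. /work/"]
    volumeMounts:
    - name: static
      mountPath: /work
  containers:
  - name: web
    image: nginx:stable
    volumeMounts:
    - name: static
      mountPath: /usr/share/nginx/html
      readOnly: true
```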
I have found that recent Docker multi-stage builds pretty much allow you to do this at the docker build level.
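For example (toolchain and paths assumed), a multi-stage Dockerfile that builds a static site in one stage and copies only the output into an nginx image:

```dockerfile
# Stage 1: build environment, discarded after the build.
# node:18 and the npm script names are assumptions for illustration.
FROM node:18 AS build
WORKDIR /src
COPY . .
RUN npm ci && npm run build

# Stage 2: only the built artifacts land in the final image.
FROM nginx:stable
COPY --from=build /src/dist /usr/share/nginx/html
```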
Multistage builds are not the same thing. Multistage builds let you do stuff like build, throw away the build environment, and copy the built artefacts to the final container. In the end, you are left with one container, which in this situation has, for example, nginx and your static files. In the spirit of k8s composability, though, the desire is to have one container for nginx that can be independently updated from a second container storing your static files; they are combined together at runtime via k8s pod semantics. This is what the issue is about.
Yes, I agree docker images as volumes are nicer. I'm just leaving a clue, for whoever is reading this bug, on how you can work around the missing feature in the meantime.
The recent ephemeral CSI volume support, along with https://github.com/kubernetes-csi/csi-driver-image-populator, should make this possible. :)
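An inline ephemeral CSI volume would then be declared in the pod spec roughly like this. The `csi` volume structure is standard Kubernetes API; the driver name and attribute key are what I believe csi-driver-image-populator uses, but treat them as assumptions to verify against its README.

```yaml
# Sketch: driver name and "image" attribute key assumed from the
# csi-driver-image-populator project, not verified here.
volumes:
- name: data
  csi:
    driver: image.csi.k8s.io        # assumed driver name
    volumeAttributes:
      image: my-data-image:v1       # assumed key for the source image
```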
@kfox1111 Cool, thanks! BTW, Google's internal composable "package" mechanism is described in this talk: …
+1
I have a similar requirement in my project - I'm building a platform that exposes a number of REST services for language processing. This includes tools like speech recognition, which are structured as an "engine" (a few MB in size, the code that implements the service) and a separate "model" (1-5GB in size, for each specific language/domain/etc.). The two can evolve independently, and we might have three engine versions and 20 different models active at any given time. Currently we have 60 different images, and when a new engine is released we have to re-build 20 images of many GB each to embed all the different models. Ideally we'd be able to build three images for the engines and 20 images for the models, and mount the model into the engine container via the pod spec. This would make updating the engine a much lighter-weight process, as the nodes only have to pull one new engine image (a few MB) rather than 20 new engine+model images (potentially up to 100GB).
I want to mount docker registry auth and certs into the official registry image. I'd like to put the auth and certs in another image, so I don't need to prepare them on a local host path.
Credentials in general are better suited to being put into Kubernetes Secrets than into images, IMO. Easier to encrypt/audit. You can treat them like volumes. Could you try that and see if it will do what you want?
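For the registry-auth case above, that might look roughly like this (the Secret name and mount path are hypothetical; the `secret` volume type itself is standard Kubernetes API):

```yaml
# Sketch: mount a Secret holding htpasswd/certs into the registry container.
volumes:
- name: registry-auth
  secret:
    secretName: registry-credentials  # hypothetical Secret
containers:
- name: registry
  image: registry:2
  volumeMounts:
  - name: registry-auth
    mountPath: /auth
    readOnly: true
```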
I already commented here: #6120. But to summarize, I managed to share the container filesystem using shareProcessNamespace like this (note: nginx:stable listens on port 80, so that's the containerPort declared below):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: datasharing
spec:
  selector:
    matchLabels:
      app: datasharing
  template:
    metadata:
      labels:
        app: datasharing
    spec:
      shareProcessNamespace: true
      containers:
      - name: data
        image: ubuntu
        imagePullPolicy: IfNotPresent
        command: ["/bin/bash", "-c"]
        args: ["trap 'echo signal;exit 0' SIGINT; sleep infinity"]
        resources:
          limits:
            memory: "128Mi"
            cpu: "500m"
        lifecycle:
          postStart:
            exec:
              command:
              - /bin/sh
              - -c
              - mkdir -p /DATA && touch /DATA/empty_test_file
      - name: nginx
        image: nginx:stable
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: "128Mi"
            cpu: "500m"
        ports:
        - containerPort: 80
        env:
        - name: GET_DATA_PROC_DIR
          value: 'find /proc -maxdepth 1 -type d -regex "/proc/[0-9]*" | head -2 | tail -1'
        lifecycle:
          postStart:
            exec:
              command:
              - /bin/sh
              - -c
              - ln -s $(eval $GET_DATA_PROC_DIR)/root/DATA /mnt/DATA
```
If you can put a native executable into the data container, then it's less tricky to just symlink from there via:

```yaml
shareProcessNamespace: true
securityContext:
  runAsUser: 0  # /proc/<pid>/root resolution fails if not root
volumes:
- name: share
  emptyDir: {}
containers:
- name: data
  image: data:1  # FROM busybox
  volumeMounts:
  - name: share
    mountPath: /share
  command:
  - "/bin/sh"
  - "-c"
  # $$$$ in the manifest escapes to $$, which the shell expands to its PID
  - "ln -s /proc/$$$$/root/model /share/; sleep infinity"
- name: service
  image: service:1
  volumeMounts:
  - name: share
    mountPath: /share
    readOnly: true
  command:
  - "/bin/sh"
  - "-c"
  - "while test ! -L /share/model; do sleep 1; done; exec /bin/service --data=/share/model"
```
This would map closely to Docker's native volumes support, and allow people to build and version pre-baked data as containers. Maybe read-only? Haven't thought that far...