Simplify caching configuration #141
Comments
We have the same issue and attempted to work around it by using an init container to create the cache directory on the node, like in the following example: (I didn't provide the PV config in this example, but it was configured to cache dir on
This example DOES NOT work, as k8s attempts to mount the S3 volume even before the init container runs.
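A minimal sketch of the init-container pattern described above, for illustration only. It is not the commenter's actual manifest: the pod name, container names, cache path, and PVC name (`s3-claim`) are all hypothetical. It fails in the way the comment describes because the kubelet sets up every volume of a pod, including the S3 volume, before any container starts, init containers included.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: s3-app                      # hypothetical name
spec:
  initContainers:
    # Intended to create the cache directory on the node first.
    # It never gets the chance: the S3 volume below must mount
    # before this container is started.
    - name: create-cache-dir
      image: busybox
      command: ["mkdir", "-p", "/host/var/tmp/s3-cache"]
      volumeMounts:
        - name: host-var-tmp
          mountPath: /host/var/tmp
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: s3-data
          mountPath: /data
  volumes:
    - name: host-var-tmp
      hostPath:
        path: /var/tmp
    - name: s3-data
      persistentVolumeClaim:
        claimName: s3-claim         # hypothetical PVC bound to the caching S3 PV
```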
Based on the example here: I worked around this issue by using a hostPath mount to create the directory on the host if it does not exist. Regardless of the order of the volumeMounts or volumes entries, the S3 mount is retried automatically until it succeeds. I still put the hostPath mount before the PVC mount in case they are processed in the order specified. In my testing, the pod comes up immediately.
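A minimal sketch of what this hostPath workaround might look like, assuming the directory is created through the `DirectoryOrCreate` hostPath type (the comment does not say which type was used). The pod name, volume names, cache path, and PVC name are hypothetical; the cache path would have to match the one given in the PV's `cache` mount option.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: s3-app                      # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        # Mounted only so that the kubelet creates the directory on the host.
        - name: s3-cache-dir
          mountPath: /cache-dir
        - name: s3-data
          mountPath: /data
  volumes:
    # Listed before the PVC in case volumes are set up in the order given.
    - name: s3-cache-dir
      hostPath:
        path: /tmp/s3-local-cache   # assumed to match the PV's cache path
        type: DirectoryOrCreate     # create the directory if it is missing
    - name: s3-data
      persistentVolumeClaim:
        claimName: s3-claim         # hypothetical PVC bound to the caching S3 PV
```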
I'm working around this using a k8s job.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: s3-cache-create
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        - name: busybox
          image: busybox
          command:
            - mkdir
            - "-p"
            - /host/var/tmp/s3-cache
          volumeMounts:
            - name: host-var-tmp
              mountPath: /host/var/tmp
      volumes:
        - name: host-var-tmp
          hostPath:
            path: /var/tmp
      restartPolicy: Never
```

A job per volume is needed, and you should modify the path so that it is unique per volume.
This worked for me:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: s3-cache-dir-setup
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: s3-cache-dir-setup
  template:
    metadata:
      labels:
        app: s3-cache-dir-setup
    spec:
      initContainers:
        - name: create-s3-cache-dir
          image: busybox
          command:
            - sh
            - -c
            - |
              mkdir -p /tmp/s3-local-cache && \
              chmod 0700 /tmp/s3-local-cache
          securityContext:
            privileged: true
          volumeMounts:
            - name: host-mount
              mountPath: /tmp/s3-local-cache
      containers:
        - name: pause
          image: k8s.gcr.io/pause:3.1
      volumes:
        - name: host-mount
          hostPath:
            path: /tmp/s3-local-cache
```
From the documentation
If this is the case, we'd need a unique cache directory per pod, for example when more than one pod of the same deployment is scheduled on the same node. It looks like none of the workarounds suggested above supports this scenario.
I worked through different ways of creating a host path for the project I work on. Using a provisioner for the nodes, such as setting up scripts with the Karpenter EC2 Class, only works if the paths used are known ahead of time. If you are scaling your buckets up and down independently of the lifetime of the node, then it is impossible to create all the directories ahead of time like this. Using a DaemonSet to mount the Using

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: default-bucket-cache
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  hostPath:
    path: /tmp/cache-default-bucket
    type: DirectoryOrCreate
  capacity:
    storage: 500Mi
  claimRef:
    namespace: default
    name: bucket-cache
    apiVersion: v1
    kind: PersistentVolumeClaim
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  namespace: default
  name: bucket-cache
spec:
  storageClassName: manual
  resources:
    requests:
      storage: 500Mi
  # Must match the PersistentVolume's metadata.name above
  # (the original post had "bucket-cache" here).
  volumeName: default-bucket-cache
  accessModes:
    - ReadWriteMany
```

It would be nice if there were a way to simply specify a PV for Mountpoint to use. I am thinking specifically of putting the cache somewhere other than the host, so that it could be reused across multiple nodes.
To add to these points, I would have liked the driver to expose k8s-specific caching configuration that it interprets before starting a Mountpoint process. This would include creating a cache directory on the node specifically for the mount being prepared, at the path given in the configuration. Even better would be the ability to use an EBS-backed (or other) volume as the cache, so that normal node operations can't be compromised by low disk space, but this poses some implementation questions. Perhaps #279 can offer a solution to this by running Mountpoint in a sidecar.
Hi @tvandinther, thanks for your interest in this feature. We're working on prioritising this, and we'll give an update on GitHub when we have something to share.
/feature
Is your feature request related to a problem? Please describe.
Caching is supported today by adding a `cache` option to a persistent volume configuration and passing in a directory on the node's filesystem. This works, but comes with a couple of sharp edges. Creating the directory on the node is not done automatically, so it has to be created manually ahead of time.
Describe the solution you'd like in detail
Caching configuration should be possible without manually making changes to the nodes and should make it easy to define different types of storage to use as cache like a ramdisk.
Describe alternatives you've considered
One potential solution is to reference other persistent volumes or mounts as cache, which could make for nice composability of the k8s constructs.
Additional context
Mountpoint's documentation on caching: https://github.com/awslabs/mountpoint-s3/blob/main/doc/CONFIGURATION.md#caching-configuration
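For reference, the `cache` option described in the problem statement is typically passed as a mount option on a statically provisioned PersistentVolume, roughly as sketched below. The PV name, bucket name, region, volume handle, and cache path are placeholders, and the sketch assumes the cache directory already exists on every node where the volume may be mounted, which is exactly the manual step this issue asks to remove.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: s3-pv                       # placeholder name
spec:
  capacity:
    storage: 1200Gi                 # required by Kubernetes; not a real limit for S3
  accessModes:
    - ReadWriteMany
  storageClassName: ""              # static provisioning, no StorageClass
  mountOptions:
    - region us-west-2              # placeholder region
    - cache /tmp/s3-local-cache     # directory on the node; must exist beforehand
  csi:
    driver: s3.csi.aws.com
    volumeHandle: s3-csi-driver-volume   # placeholder; must be unique per PV
    volumeAttributes:
      bucketName: my-example-bucket      # placeholder bucket
```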