Implement MPI Job clean-up Cron Job #40

Open
KrisWilliamson opened this issue Jan 9, 2024 · 4 comments
@KrisWilliamson
Contributor

Continuation of #37

@KrisWilliamson
Contributor Author

KrisWilliamson commented Jan 9, 2024

Proposed implementation

---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: openmpp-uat
  name: mpi-cleanup
rules:
# Assumption: MPIJob is the Kubeflow MPI Operator CRD (API group kubeflow.org).
- apiGroups:
  - kubeflow.org
  resources:
  - mpijobs
  verbs:
  - get
  - list
  - delete
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: mpi-cleanup
  namespace: openmpp-uat
subjects:
- kind: ServiceAccount
  name: sa-mpi-cleanup
  namespace: openmpp-uat
roleRef:
  kind: Role
  name: mpi-cleanup
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa-mpi-cleanup
  namespace: openmpp-uat
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mpi-cleanup
  namespace: openmpp-uat
spec:
  schedule: "0 6 * * 0"  # 06:00 every Sunday
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: sa-mpi-cleanup
          containers:
          - name: hello
            # Placeholder image: busybox does not ship kubectl, so the real image must provide it.
            image: busybox:1.28
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - kubectl get mpijobs -o go-template --template '{{range .items}}{{.metadata.name}} {{.metadata.creationTimestamp}}{{"\n"}}{{end}}' | awk -v cutoff="$(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ)" '$2 <= cutoff { print $1 }' | xargs --no-run-if-empty kubectl delete mpijob
          restartPolicy: OnFailure

The example above contains placeholders, such as the service account name sa-mpi-cleanup and the roles.

A decision is also needed on when the cron job runs and how old jobs must be before they are cleaned up (currently once a week and 24 hours old).

@Souheil-Yazji
Contributor

This is good, but we don't want this solution limited to a single namespace; it should be cluster-wide.
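
For reference, a cluster-wide variant would likely swap the namespaced Role and RoleBinding for a ClusterRole and ClusterRoleBinding. The following is only a sketch under assumptions: it reuses the placeholder names from the proposal (mpi-cleanup, sa-mpi-cleanup, openmpp-uat) and assumes MPIJob is the Kubeflow MPI Operator CRD in the kubeflow.org API group.

```yaml
# Sketch only: cluster-scoped RBAC for the clean-up job.
# Assumption: MPIJob is the Kubeflow MPI Operator CRD (API group kubeflow.org).
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: mpi-cleanup            # placeholder name from the proposal
rules:
- apiGroups:
  - kubeflow.org
  resources:
  - mpijobs
  verbs:
  - get
  - list
  - delete
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: mpi-cleanup
subjects:
- kind: ServiceAccount
  name: sa-mpi-cleanup         # placeholder service account
  namespace: openmpp-uat       # placeholder: namespace the CronJob runs in
roleRef:
  kind: ClusterRole
  name: mpi-cleanup
  apiGroup: rbac.authorization.k8s.io
```

The ServiceAccount and the CronJob stay namespaced; only the permissions become cluster-scoped, so the job can list and delete MPI jobs in every namespace.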

@KrisWilliamson
Contributor Author

Will do.
Also, can I get feedback on how often this should run (daily, weekly?) and how old the jobs should be before they are deleted (1 day, 1 week, something else)?

@Souheil-Yazji
Contributor

We can run the job at midnight and delete any MPI jobs older than 7 days. We can start with that and adjust later if needed.
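
As a rough illustration of those settings, the CronJob might look like the sketch below. It is not a tested manifest: the image name is hypothetical (any image providing kubectl, GNU date, and awk would do), the names are the placeholders from the proposal, and it assumes the service account is allowed to list and delete mpijobs in all namespaces.

```yaml
# Sketch only: nightly run, deleting MPIJobs older than 7 days in every namespace.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mpi-cleanup
  namespace: openmpp-uat         # placeholder: wherever the clean-up job should live
spec:
  # Midnight every day, in the kube-controller-manager's time zone
  # unless spec.timeZone is set.
  schedule: "0 0 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: sa-mpi-cleanup   # placeholder; needs cluster-wide list/delete on mpijobs
          restartPolicy: OnFailure
          containers:
          - name: mpi-cleanup
            # Hypothetical image: must provide kubectl, GNU date, and awk.
            image: example.com/kubectl-tools:latest
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - |
              # Cutoff in the same RFC 3339 UTC format as creationTimestamp,
              # so a plain string comparison orders the timestamps correctly.
              cutoff="$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ)"
              kubectl get mpijobs --all-namespaces -o go-template \
                --template '{{range .items}}{{.metadata.namespace}} {{.metadata.name}} {{.metadata.creationTimestamp}}{{"\n"}}{{end}}' \
                | awk -v cutoff="$cutoff" '$3 <= cutoff { print $1, $2 }' \
                | while read -r ns name; do kubectl delete mpijob -n "$ns" "$name"; done
```

Printing the namespace alongside each name lets the delete step target the right namespace for every job, which the single-namespace version did not need.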
