The task-topology plugin does not implement DeallocateFunc, which may cause the scheduling result to be unexpected #4003

liuyuanchun11 · 2025-02-11T02:40:50Z

Description

The task-topology plugin implements AllocateFunc (the resource allocation callback) but does not implement DeallocateFunc (the resource deallocation callback). If a statement.Discard operation occurs during scheduling, the state changes made in AllocateFunc cannot be rolled back. When several pending jobs are scheduled within a single openSession, the stale state may lead to scheduling results that deviate from expectations.

Steps to reproduce the issue

1. Enable the task-topology plugin.
2. Create two jobs. Job-A contains 2 pods, but the cluster's currently available resources can only satisfy the request of 1 pod. Job-B contains 1 pod, and the available resources are sufficient to satisfy its request. Job-A has a higher priority than Job-B.
3. When scheduling begins, Job-A first allocates one pod. Because the cluster resources are insufficient for the second pod's request, a statement.Discard operation is triggered to roll back the allocation. Since DeallocateFunc is not implemented, the JobManager.TaskBound state (or resource binding) for Job-A is not rolled back. This incomplete rollback may block or interfere with the subsequent scheduling of Job-B, even though Job-B's resource requirements could otherwise be met.
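
The rollback gap described in the steps above can be illustrated with a small, self-contained sketch. This is not Volcano code: `Event`, `EventHandler`, `Statement`, and `leftoverAfterDiscard` are hypothetical stand-ins that only mimic the shape of an allocate/deallocate callback pair, and the `bound` map is an assumed placeholder for the plugin's per-task bookkeeping. The sketch shows that when only the allocate callback is registered, a discard leaves the plugin's state behind.

```go
package main

import "fmt"

// Event is a hypothetical stand-in for a scheduler allocation event.
type Event struct{ Task string }

// EventHandler mimics a handler with paired callbacks.
// DeallocateFunc may be nil, as in the situation reported here.
type EventHandler struct {
	AllocateFunc   func(e Event)
	DeallocateFunc func(e Event)
}

// Statement records tentative allocations so they can be discarded later.
type Statement struct {
	handlers []*EventHandler
	ops      []Event
}

// Allocate records the operation and notifies every allocate callback.
func (s *Statement) Allocate(task string) {
	e := Event{Task: task}
	s.ops = append(s.ops, e)
	for _, h := range s.handlers {
		if h.AllocateFunc != nil {
			h.AllocateFunc(e)
		}
	}
}

// Discard undoes the recorded operations in reverse order.
// Handlers without a deallocate callback are never told about the rollback.
func (s *Statement) Discard() {
	for i := len(s.ops) - 1; i >= 0; i-- {
		for _, h := range s.handlers {
			if h.DeallocateFunc != nil {
				h.DeallocateFunc(s.ops[i])
			}
		}
	}
	s.ops = nil
}

// leftoverAfterDiscard returns how many tasks the (assumed) plugin state
// still considers bound after one allocation has been discarded.
func leftoverAfterDiscard(withDeallocate bool) int {
	bound := map[string]bool{} // assumed plugin-side bookkeeping
	h := &EventHandler{AllocateFunc: func(e Event) { bound[e.Task] = true }}
	if withDeallocate {
		h.DeallocateFunc = func(e Event) { delete(bound, e.Task) }
	}
	stmt := &Statement{handlers: []*EventHandler{h}}
	stmt.Allocate("job-a/pod-0")
	stmt.Discard()
	return len(bound)
}

func main() {
	fmt.Println("stale entries without DeallocateFunc:", leftoverAfterDiscard(false))
	fmt.Println("stale entries with DeallocateFunc:", leftoverAfterDiscard(true))
}
```

In this toy model the stale entry for Job-A's first pod survives the discard unless the deallocate callback is registered, which is the kind of leftover state that could then influence decisions for Job-B in the same session.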

Describe the results you received and expected

The task-topology plugin should implement DeallocateFunc in addition to AllocateFunc, so that a statement.Discard also rolls back the plugin's state.

What version of Volcano are you using?

volcano 1.10

Any other relevant information

No response

liuyuanchun11 added the kind/bug Categorizes issue or PR as related to a bug. label Feb 11, 2025
