TAS: Kueue over-subscribes Node capacities when a Node belongs to multiple ResourceFlavors #10659

@tenzen-y

Description

What happened:

When a Node belongs to multiple ResourceFlavors, as in the following example, Kueue TAS over-subscribes the Node's capacity for Workloads, even though the Quota Reservation is correct.

# This Node belongs to both of the following ResourceFlavors.
kind: Node
metadata:
  labels:
    example.com/machine: standard
spec:
  taints:
  - effect: NoSchedule
    key: example.com/machine
    value: standard
  # The Node doesn't have example.com/instance-type taints.
  ...
---
kind: ResourceFlavor
...
spec:
  topologyName: flat
  nodeLabels:
    example.com/machine: standard
  nodeTaints:
  - effect: NoSchedule
    key: example.com/machine
    value: standard
  - effect: NoSchedule
    key: example.com/instance-type
    value: partial-reserved
---
kind: ResourceFlavor
...
spec:
  topologyName: flat
  nodeLabels:
    example.com/machine: standard
  nodeTaints:
  - effect: NoSchedule
    key: example.com/machine
    value: standard
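
To make the overlap concrete, here is a minimal sketch of why the Node above is selected by both ResourceFlavors: both use the same `nodeLabels`, and the Node carries that label. The `flavor` type, the helper functions, and the names `flavor-a` and `flavor-b` are hypothetical (the manifests above elide the real names). Only label matching is modeled; this is not Kueue's actual selection code.

```go
package main

import "fmt"

// flavor is a hypothetical, trimmed-down stand-in for a ResourceFlavor:
// only a name and spec.nodeLabels are modeled.
type flavor struct {
	name       string
	nodeLabels map[string]string
}

// selects reports whether every label required by the flavor is present
// on the node with the same value.
func selects(f flavor, nodeLabels map[string]string) bool {
	for k, v := range f.nodeLabels {
		if got, ok := nodeLabels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// matchingFlavors returns the names of all flavors that select the node.
func matchingFlavors(flavors []flavor, nodeLabels map[string]string) []string {
	var out []string
	for _, f := range flavors {
		if selects(f, nodeLabels) {
			out = append(out, f.name)
		}
	}
	return out
}

func main() {
	// Labels of the Node from the manifest above.
	node := map[string]string{"example.com/machine": "standard"}

	// Hypothetical names; both flavors share the same nodeLabels.
	flavors := []flavor{
		{name: "flavor-a", nodeLabels: map[string]string{"example.com/machine": "standard"}},
		{name: "flavor-b", nodeLabels: map[string]string{"example.com/machine": "standard"}},
	}

	fmt.Println(matchingFlavors(flavors, node)) // prints [flavor-a flavor-b]
}
```

Because the label sets are identical, any capacity accounting keyed only by flavor would see the same physical Node twice.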

What you expected to happen:

TAS does not over-subscribe Node capacity.

How to reproduce it (as minimally and precisely as possible):
I reproduced this problem in the following scheduler unit test case.
In this scenario, the partial-reserved-pending Workload should get a topology assignment on Node x2, because Node x1 is already occupied by the ondemand-admitted-a and ondemand-admitted-b Workloads.

However, the partial-reserved-pending Workload currently gets a topology assignment on Node x1 (over-subscription).

#10657
https://github.com/tenzen-y/kueue/blob/6523d77d8534932e91255d25c7333478103d3254/pkg/scheduler/scheduler_tas_test.go#L2391-L2569
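
The expected behavior in this scenario can be illustrated with a small model that compares two ways of computing free capacity on a shared Node. Everything here is an assumption made for illustration: the `usage` type, the flavor names, the capacity of 2 units per Node, and the per-Workload usage of 1 unit are hypothetical and are not taken from the linked test. The sketch does not claim to show where the bug lies in Kueue. It only shows the invariant that usage from every flavor on a Node must count against that Node's capacity.

```go
package main

import "fmt"

// usage is a hypothetical record of capacity consumed on one Node by a
// Workload admitted through a given flavor.
type usage struct {
	node   string
	flavor string
	amount int64
}

// freePerFlavor subtracts only the usage admitted through the given flavor.
// Accounting of this kind would over-subscribe a Node shared by flavors.
func freePerFlavor(capacity int64, usages []usage, node, flavor string) int64 {
	free := capacity
	for _, u := range usages {
		if u.node == node && u.flavor == flavor {
			free -= u.amount
		}
	}
	return free
}

// freePerNode subtracts the usage of every flavor on the Node, which is
// the invariant the issue expects TAS to honor.
func freePerNode(capacity int64, usages []usage, node string) int64 {
	free := capacity
	for _, u := range usages {
		if u.node == node {
			free -= u.amount
		}
	}
	return free
}

func main() {
	const capacity = 2 // hypothetical capacity per Node

	// Two Workloads already admitted on x1 through flavor-b
	// (standing in for ondemand-admitted-a and ondemand-admitted-b).
	usages := []usage{
		{node: "x1", flavor: "flavor-b", amount: 1},
		{node: "x1", flavor: "flavor-b", amount: 1},
	}

	// Seen only through flavor-a, x1 looks empty.
	fmt.Println(freePerFlavor(capacity, usages, "x1", "flavor-a")) // prints 2
	// Counted per Node, x1 is full and x2 is free.
	fmt.Println(freePerNode(capacity, usages, "x1")) // prints 0
	fmt.Println(freePerNode(capacity, usages, "x2")) // prints 2
}
```

Under the per-Node view, a pending Workload arriving through the other flavor can only fit on x2, which matches the assignment the test expects.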

Anything else we need to know?:

Again, the Quota Reservation is correct.

Environment:

  • Kubernetes version (use kubectl version):
  • Kueue version (use git describe --tags --dirty --always):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

Metadata

Labels

kind/bug: Categorizes issue or PR as related to a bug.
