[MultiKueue] Check improvability of remote client caching #10629

@olekzabl

Description

(I hesitated between "cleanup", "bug" and "feature request" :) Picked the least prominent one)

MultiKueue controllers (on the manager cluster) fetch K8s objects from workers using remote clients.
Let's focus on Workloads and generic job kinds, as these are the types that can grow largest in terms of QPS and volume.
For those types, MultiKueue does the following data fetches:

  • A .Watch call here (for Workloads when called from here, and for generic jobs when called from here).
    Used to detect any changes, and trigger MK Workload Reconciler on them.
  • A .Get call here (for Workloads).
    Used to reconcile a MK Workload.
  • A .List call here (for Workloads).
    Used for GC, so not very often (and can be completely opted out of via Kueue Config).

Notably, the .Get and the .List use a remote client that is uncached.
(Evidence: it's built here, with Options.Cache unset, which calls this -> this -> this -> this where empty Cache is not overridden).

IIUC this leads to potentially wasteful remote fetches:

  • whenever the .Watch notifies about a remote resource change, we're given the full object (e.g. here), which we even use in the MK job adapters...
  • ... but then, if it's a Workload, we drop that full object, only to fetch it again in the .Get call.

Rough thoughts on how we could address this:

  1. Just enabling caching could lead to memory exhaustion in setups with lots of Workloads.
    We'd need a way to give up the cache once it has grown too large.
    I'm not sure such a mechanism is readily available in the K8s libraries.

  2. Even better, given that we typically need .Get rather than .List, I'd look for a mature "bounded buffer" cache, e.g. with "Least Recently Used" per-item eviction.
    This could work poorly for .List calls, but would still offer a balance between memory usage and avoiding wasteful API calls for .Gets, which is what we mostly care about here.
    Though again, it's even less clear whether this can be done easily with the K8s libraries.

  3. The default K8s client cache is based on an internal watch, and I don't see a way to expose its change notifications to the caller.
    This means that if we use it (say, with some opt-out on memory pressure, as in #1), we'd most likely end up duplicating the "watch" API call.
    I'm not sure how costly that would be.
    If it turned out to be too costly, one way around it could be using NewSharedInformerFactory (thus getting access to the underlying informers?), though I acknowledge this is not used in Kueue so far.

Metadata

    Labels

    area/multikueue: Issues or PRs related to MultiKueue
    kind/cleanup: Categorizes issue or PR as related to cleaning up code, process, or technical debt.
