
Add maxNodes check for MachineDeployments #79

Open
parthyadav3105 wants to merge 1 commit into kubernetes-sigs:main from parthyadav3105:max-size-node-limit-annotation

Conversation

@parthyadav3105
Contributor

Enforce the cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size annotation to ensure that the MachineDeployment replica count never exceeds the max-size annotation.

This allows putting a cap on the maximum number of nodes in a MachineDeployment.

Closes #77
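
To make the annotation's semantics concrete, here is a minimal sketch of the check, assuming a hypothetical isBelowMaxSize helper over a plain annotation map. The real code in this PR operates on a *capiv1beta1.MachineDeployment, and how it handles a missing or unparseable annotation may differ.

```go
package main

import (
	"fmt"
	"strconv"
)

// maxSizeAnnotation is the Cluster API autoscaler annotation enforced by this PR.
const maxSizeAnnotation = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"

// isBelowMaxSize reports whether replicas is below the max-size annotation.
// Hypothetical sketch: the PR's implementation takes a MachineDeployment,
// and its behavior for missing/invalid annotations may differ.
func isBelowMaxSize(annotations map[string]string, replicas int32) bool {
	raw, ok := annotations[maxSizeAnnotation]
	if !ok {
		// no annotation: no cap is enforced
		return true
	}
	max, err := strconv.ParseInt(raw, 10, 32)
	if err != nil {
		// unparseable annotation: fail open here (an assumption, not the PR's choice)
		return true
	}
	return int64(replicas) < max
}

func main() {
	ann := map[string]string{maxSizeAnnotation: "3"}
	fmt.Println(isBelowMaxSize(ann, 2)) // below the cap
	fmt.Println(isBelowMaxSize(ann, 3)) // at the cap: scale-up must be refused
}
```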

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: parthyadav3105
Once this PR has been reviewed and has the lgtm label, please assign neolit123 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 14, 2026
Collaborator

@elmiko elmiko left a comment


in general this is looking great, thank you @parthyadav3105. i have a question but it's not a blocker.

Comment thread on pkg/cloudprovider/cloudprovider.go (Outdated)
}
if !isBelowMaxSize(machineDeployment) {
return nil, nil, cloudprovider.NewInsufficientCapacityError(fmt.Errorf("MachineDeployment %q has reached its maximum size", machineDeployment.Name))
}
Collaborator


given that we might return multiple MachineDeployments and we pick the first one from the list, i wonder if we should have some sort of loop here to check if there are other MachineDeployments we could attempt? (similar to what the TODO on L328 is talking about)

i don't think we need this behavior for the first version of this, but i wonder if we should add it.

Contributor Author


Makes sense. I’ll look into where to add it. I think filterCompatibleInstanceTypes() might be the right place, while slices.SortFunc() can remain for additional customization on top of the core Karpenter cheapest sorted list.

Collaborator


if it seems too big, i'm ok with leaving the functionality as it is. but, i'm happy to hear your investigation as well.

Contributor Author

@parthyadav3105 parthyadav3105 Apr 18, 2026


I would be repeating what is already known, but to cover my understanding:

  • From what I understand (github.com/kubernetes-sigs/karpenter/pkg/controllers/nodeclaim/lifecycle/launch.go#L78-L88), when the cloud provider returns InsufficientCapacityError, core Karpenter treats the NodeClaim as failed and deletes it. In the next cycle, it fetches a fresh batch of pending pods, recalculates, and creates a new NodeClaim request. This also gives cloud providers a way to protect themselves when a quota limit or cloud capacity is hit.
  • Meanwhile, the cloud provider is expected to keep reporting the latest instance-type/offering metadata via GetInstanceTypes(), so that in the next cycle Karpenter filters out those instance types and only constructs NodeClaims with requirements that are available: true.

With current code:

// identify which fit requirements
compatibleInstanceTypes := filterCompatibleInstanceTypes(instanceTypes, nodeClaim)
if len(compatibleInstanceTypes) == 0 {
	return nil, nil, cloudprovider.NewInsufficientCapacityError(fmt.Errorf("cannot satisfy create, no compatible instance types found"))
}

// TODO (elmiko) if multiple instance types are found to be compatible we need to select one.
// for now, we sort by resource name and take the first in the list. In the future, this should
// be an option or something more useful like minimum size or cost.
slices.SortFunc(compatibleInstanceTypes, func(a, b *ClusterAPIInstanceType) int {
	return cmp.Compare(strings.ToLower(a.Name), strings.ToLower(b.Name))
})
selectedInstanceType := compatibleInstanceTypes[0]

The first layer is filterCompatibleInstanceTypes(), which eliminates every instance type that cannot be provisioned or used for these requirements; if no combination fits, we return InsufficientCapacityError.

This makes the first layer the most suitable place to add the check: here we can eliminate MachineDeployments that have maxed out and have no capacity. If all MachineDeployments are eliminated, we return InsufficientCapacityError.

Meanwhile, the SortFunc is the second layer, which says: from what is possible, choose the cheapest MachineDeployment (or any customised behaviour via annotations as well).
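
The two layers can be sketched with plain slices. The instanceType struct and its Available field below are simplified stand-ins for the provider's real types and requirement/offering checks; only the filter-then-sort shape is kept.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
	"strings"
)

// instanceType is a simplified stand-in for ClusterAPIInstanceType.
type instanceType struct {
	Name      string
	Available bool // stands in for requirement compatibility and offering availability
}

// pick sketches the two layers: first filter out instance types with no
// available capacity, then sort what remains and take the first. The real
// code filters on NodeClaim requirements and offerings, not a single bool.
func pick(types []instanceType) (instanceType, bool) {
	// layer 1: eliminate instance types that cannot be provisioned
	var compatible []instanceType
	for _, t := range types {
		if t.Available {
			compatible = append(compatible, t)
		}
	}
	if len(compatible) == 0 {
		return instanceType{}, false // caller returns InsufficientCapacityError
	}
	// layer 2: sort (by name here, as the current TODO does) and take the first
	slices.SortFunc(compatible, func(a, b instanceType) int {
		return cmp.Compare(strings.ToLower(a.Name), strings.ToLower(b.Name))
	})
	return compatible[0], true
}

func main() {
	got, ok := pick([]instanceType{
		{Name: "md-b", Available: true},
		{Name: "md-a", Available: false}, // maxed out: eliminated in layer 1
	})
	fmt.Println(got.Name, ok)
}
```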

Correction:

Makes sense. I’ll look into where to add it. I think filterCompatibleInstanceTypes() might be the right place, while slices.SortFunc() can remain for additional customization on top of the core Karpenter cheapest sorted list. (we get NodeClaim requirements in cloudprovider.Create())


Suggested Changes:

    ...

    func (c *CloudProvider) createMachine(ctx context.Context, nodeClaim *karpv1.NodeClaim) (*capiv1beta1.MachineDeployment, *capiv1beta1.Machine, error) {
        ...
        ...
   	    // identify which fit requirements
	    compatibleInstanceTypes := filterCompatibleInstanceTypes(instanceTypes, nodeClaim)
	    if len(compatibleInstanceTypes) == 0 {
	    	return nil, nil, cloudprovider.NewInsufficientCapacityError(fmt.Errorf("cannot satisfy create, no compatible instance types found"))
	    }
	    ...
	    // once scalable resource is identified, increase replicas
	    machineDeployment, err := c.machineDeploymentProvider.Get(ctx, selectedInstanceType.MachineDeploymentName, selectedInstanceType.MachineDeploymentNamespace)
	    if err != nil {
	    	return nil, nil, fmt.Errorf("cannot satisfy create, unable to find MachineDeployment %q for InstanceType %q: %w", selectedInstanceType.MachineDeploymentName, selectedInstanceType.Name, err)
	    }
-	    if !isBelowMaxSize(machineDeployment) {
-	    	return nil, nil, cloudprovider.NewInsufficientCapacityError(fmt.Errorf("MachineDeployment %q has reached its maximum size", machineDeployment.Name))
-	    }
	    originalReplicas := *machineDeployment.Spec.Replicas
	    ...
	    ...
	}



	func filterCompatibleInstanceTypes(instanceTypes []*ClusterAPIInstanceType, nodeClaim *karpv1.NodeClaim) []*ClusterAPIInstanceType {
		reqs := scheduling.NewNodeSelectorRequirementsWithMinValues(nodeClaim.Spec.Requirements...)
		filteredInstances := lo.Filter(instanceTypes, func(i *ClusterAPIInstanceType, _ int) bool {
			// TODO (elmiko) if/when we have offering availability, this is a good place to filter out unavailable instance types
			return reqs.Compatible(i.Requirements, scheduling.AllowUndefinedWellKnownLabels) == nil &&
				resources.Fits(nodeClaim.Spec.Resources.Requests, i.Allocatable()) &&
+				i.Offerings.Available().HasCompatible(reqs)
		})

		return filteredInstances
	}

Note: Offerings is []*Offering; as of today there is only one offering per MachineDeployment (set in machineDeploymentToInstanceType()), which might change in the future.
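
For illustration, here is a sketch of how that single offering's availability could be derived from the max-size cap so that the Available() filter above drops a maxed-out MachineDeployment. The types and the offeringsFor helper are hypothetical stand-ins; Karpenter's real Offering also carries requirements and price.

```go
package main

import "fmt"

// Offering is a simplified stand-in for Karpenter's cloudprovider Offering type.
type Offering struct {
	Available bool
}

type Offerings []Offering

// Available mirrors the shape of the upstream helper: keep only available offerings.
func (of Offerings) Available() Offerings {
	var out Offerings
	for _, o := range of {
		if o.Available {
			out = append(out, o)
		}
	}
	return out
}

// offeringsFor sketches how machineDeploymentToInstanceType could mark the
// single offering unavailable once replicas reach the max-size cap
// (hypothetical helper, not the provider's actual function).
func offeringsFor(replicas, maxSize int32) Offerings {
	return Offerings{{Available: replicas < maxSize}}
}

func main() {
	fmt.Println(len(offeringsFor(2, 3).Available())) // capacity remains
	fmt.Println(len(offeringsFor(3, 3).Available())) // maxed out, filtered away
}
```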

Contributor


That approach makes sense to me and aligns with upstream, which I like. +1 to filtering with i.Offerings.Available().HasCompatible(reqs).

Contributor Author


updated.

@elmiko
Collaborator

elmiko commented Apr 16, 2026

/assign @maxcao13

@maxcao13
Contributor

/ok-to-test

Trying to see if this works automatically.

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Apr 16, 2026
Enforce "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"
annotation to ensure that MachineDeployment replica count never
exceeds max-size annotation.

This allows putting a cap on MachineDeployment max nodes.

Signed-off-by: Parth Yadav <parth@coredge.io>
@parthyadav3105 parthyadav3105 force-pushed the max-size-node-limit-annotation branch from 33bc5a5 to 526ebcc Compare April 22, 2026 19:16
@maxcao13
Contributor

/lgtm

Thanks! Guess the automation for ok-to-test doesn't work; I can try to figure that out, though.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 23, 2026

Labels

cncf-cla: yes - Indicates the PR's author has signed the CNCF CLA.
lgtm - "Looks good to me", indicates that a PR is ready to be merged.
ok-to-test - Indicates a non-member PR verified by an org member that is safe to test.
size/L - Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support max-size annotation for per-MachineDeployment node limits

4 participants