Simplify flexible versions by vmaurin · Pull Request #1139 · aio-libs/aiokafka

vmaurin · 2025-11-27T20:34:28Z

Flexible versions are a protocol feature of newer API versions. When an API version is flexible, it uses more compact structures and also allows additional "dynamic" (tagged) fields that can be added without introducing a new API version.

This commit moves flexible version support into the protocol layer, so that it is more transparent and easier to use when defining Struct classes and schemas.

When defining the schema, we can specify a tagged field with a tuple containing the field name and the field tag.
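As a rough illustration of what "more compact structures" means on the wire, here is a minimal sketch of a single string encoder that switches format on a `flexible` flag. The helper names `encode_unsigned_varint` and `encode_string` are hypothetical and are not the aiokafka API; the sketch only assumes the Kafka protocol's STRING (INT16 length prefix) and COMPACT_STRING (unsigned varint holding length + 1) layouts.

```python
import struct
from typing import Optional


def encode_unsigned_varint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)  # set the continuation bit
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_string(value: Optional[str], flexible: bool) -> bytes:
    """Hypothetical helper: one string type, two wire formats."""
    if value is None:
        # Null is length 0 in the compact form and length -1 in the classic form.
        return encode_unsigned_varint(0) if flexible else struct.pack(">h", -1)
    raw = value.encode("utf-8")
    if flexible:
        # COMPACT_STRING: unsigned varint holding len + 1, then the bytes.
        return encode_unsigned_varint(len(raw) + 1) + raw
    # STRING: big-endian INT16 length, then the bytes.
    return struct.pack(">h", len(raw)) + raw
```

With this shape, the schema only says "string", and the caller decides the wire format by passing `flexible=True` for flexible API versions.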

Checklist

I think the code is well written
Unit tests for the changes exist
Documentation reflects the changes
Add a new news fragment into the CHANGES folder
- name it <issue_id>.<type> (e.g. 588.bugfix)
- if you don't have an issue_id change it to the pr id after creating the PR
- ensure type is one of the following:
  - .feature: Signifying a new feature.
  - .bugfix: Signifying a bug fix.
  - .doc: Signifying a documentation improvement.
  - .removal: Signifying a deprecation or removal of public API.
  - .misc: A ticket has been closed, but it is not of interest to users.
- Make sure to use full sentences with correct case and punctuation, for example: Fix issue with non-ascii contents in doctest text files.

    @classmethod
    @abc.abstractmethod
-    def encode(cls, value: T) -> bytes: ...
+    def encode(cls, value: T, flexible: bool) -> bytes: ...


    @classmethod
    @abc.abstractmethod
-    def decode(cls, data: BytesIO) -> T: ...
+    def decode(cls, data: BytesIO, flexible: bool) -> T: ...


codecov · 2025-11-27T20:40:12Z

Codecov Report

❌ Patch coverage is 97.81022% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.23%. Comparing base (00b099f) to head (7c67eba).

Files with missing lines	Patch %	Lines
aiokafka/protocol/message.py	75.00%	2 Missing ⚠️
aiokafka/protocol/types.py	98.68%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1139      +/-   ##
==========================================
+ Coverage   94.97%   95.23%   +0.26%     
==========================================
  Files          89       89              
  Lines       16041    15987      -54     
  Branches     1397     1387      -10     
==========================================
- Hits        15235    15226       -9     
+ Misses        556      516      -40     
+ Partials      250      245       -5

Flag	Coverage Δ
cext	`95.20% <97.81%> (+0.26%)`	⬆️
integration	`95.12% <97.81%> (+0.27%)`	⬆️
purepy	`95.20% <97.81%> (+0.26%)`	⬆️
unit	`52.53% <96.35%> (+0.18%)`	⬆️

vmaurin · 2025-12-01T09:15:41Z

@ods Let me know if you need additional info. The main reason I need this, in order to improve the API version coverage, is to be able to support and specify tagged fields properly. I took inspiration from the Java client's JSON format, where you give the tag of a field along with its name, like here https://github.com/apache/kafka/blob/trunk/clients/src/main/resources/common/message/ApiVersionsResponse.json#L64

ods · 2025-12-02T19:37:54Z

@vmaurin Thank you for the contribution, and sorry for the delay. I’m not familiar enough with this code for a quick answer, so I need to find some time for research.

vmaurin · 2025-12-02T20:27:40Z

No problem @ods. My overall goal here is to have something closer to the Java client for schema definitions. The Java client has an extended JSON format (one file per API request, one per API response) that is then used to generate Java classes. Being in Python, it is probably better to express the schema in Python, and we don't really need code generation since we have class-level facilities.
My issues with the current implementation of flexible versions/tagged fields:

- Compact structures need to be explicitly declared, while a boolean saying "use compact structures" should be enough to properly encode and decode "normal" schema types (String, Array, Bytes).
- Tagged fields are not meant to be used by passing a dict. They should be treated as "normal" properties of the API, while still allowing API versions to be forward compatible by simply ignoring new tagged fields.
- I am also fixing a bug in tagged field serialization (the size was not being serialized).
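To make the last two points concrete, here is a minimal sketch of the tagged-fields section as laid out by the Kafka protocol: an unsigned varint field count, then for each field an unsigned varint tag, an unsigned varint size, and the payload. The function names are hypothetical and this is not the code from this PR. The size prefix is what lets a decoder skip tags it does not know.

```python
from io import BytesIO
from typing import Dict, Set


def encode_unsigned_varint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def decode_unsigned_varint(data: BytesIO) -> int:
    """Decode an unsigned varint from the current stream position."""
    result = 0
    shift = 0
    while True:
        (byte,) = data.read(1)  # raises ValueError on a truncated stream
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def encode_tagged_fields(fields: Dict[int, bytes]) -> bytes:
    """Write count, then (tag, size, payload) for each field in tag order."""
    out = bytearray(encode_unsigned_varint(len(fields)))
    for tag in sorted(fields):
        payload = fields[tag]
        out += encode_unsigned_varint(tag)
        out += encode_unsigned_varint(len(payload))  # the size prefix
        out += payload
    return bytes(out)


def decode_tagged_fields(data: BytesIO, known_tags: Set[int]) -> Dict[int, bytes]:
    """Read all tagged fields, keeping known tags and skipping the rest."""
    decoded: Dict[int, bytes] = {}
    for _ in range(decode_unsigned_varint(data)):
        tag = decode_unsigned_varint(data)
        size = decode_unsigned_varint(data)
        payload = data.read(size)  # always consumed, even for unknown tags
        if tag in known_tags:
            decoded[tag] = payload
    return decoded
```

A reader that only knows tag 0 can still consume a message carrying tags 0 and 5, which is the forward compatibility described above.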

ods · 2025-12-07T16:33:36Z

+                UnsignedVarInt32.encode(0)
+                if flexible and self.allow_flexible
+                else Int16.encode(-1, flexible)


The protocol everywhere specifies either STRING or COMPACT_STRING. Why do we switch inside a single class based on a property which is not directly related?

vmaurin · 2025-12-08

I took inspiration from the Java client's JSON schema files, like here https://github.com/apache/kafka/blob/trunk/clients/src/main/resources/common/message/FindCoordinatorRequest.json#L36

When you specify the schema, it is easier and less work to just say "it is a String" and mark the flexible versions, rather than having to remember to use the compact variant everywhere. The same goes for avoiding having to specify at each level of the schema that it can accept flexible fields.

For flexible fields, as in the Java client JSON files, it is easier to declare them like other fields, with a name and type plus the additional tag ID, rather than declaring a generic structure on every struct and then having an extra layer of serialization on top.
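A minimal sketch of what such a declaration could look like, assuming a schema is a list of `(key, type)` entries where a tagged field uses a `(name, tag)` tuple as its key. The `split_schema` helper, the field names, and the string type markers are all hypothetical placeholders, not the schema format of this PR.

```python
from typing import Any, Dict, List, Tuple, Union

# A key is either a plain field name or a (name, tag) tuple for a tagged field.
FieldKey = Union[str, Tuple[str, int]]
SchemaEntry = Tuple[FieldKey, Any]


def split_schema(
    entries: List[SchemaEntry],
) -> Tuple[List[Tuple[str, Any]], Dict[int, Tuple[str, Any]]]:
    """Separate regular fields (kept in order) from tagged fields (by tag)."""
    regular: List[Tuple[str, Any]] = []
    tagged: Dict[int, Tuple[str, Any]] = {}
    for key, field_type in entries:
        if isinstance(key, tuple):
            name, tag = key
            if tag in tagged:
                raise ValueError(f"duplicate tag {tag} for field {name!r}")
            tagged[tag] = (name, field_type)
        else:
            regular.append((key, field_type))
    return regular, tagged


# Hypothetical schema: type markers are plain strings for illustration only.
SCHEMA: List[SchemaEntry] = [
    ("name", "string"),
    ("partitions", "array"),
    (("extra_info", 0), "string"),  # tagged field with tag 0
]
```

The point of the design is that a tagged field sits next to the regular ones and differs only by carrying a tag, so no separate generic container has to be declared on each struct.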

ods · 2025-12-07T16:35:04Z

+                ("name", String("utf-8")),
                (
                    "partitions",
-                    CompactArray(


From the code here, it's not obvious whether the correct (compact) form will be used.

vmaurin · 2025-12-08

Related to #1139 (comment)

In the Java client JSON, you can see they just say "it is an array" https://github.com/apache/kafka/blob/trunk/clients/src/main/resources/common/message/AlterPartitionReassignmentsResponse.json#L36

Then it is because the version is marked "flexible" that the more compact serialization is used.
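The same idea for arrays can be sketched as follows, assuming the Kafka protocol's ARRAY (INT32 element count, -1 for null) and COMPACT_ARRAY (unsigned varint holding count + 1, 0 for null) layouts. The `encode_array` helper is hypothetical and takes already-encoded elements to keep the example short.

```python
import struct
from typing import Optional, Sequence


def encode_unsigned_varint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_array(items: Optional[Sequence[bytes]], flexible: bool) -> bytes:
    """Hypothetical helper: prefix pre-encoded elements with the right count."""
    if items is None:
        # Null array: 0 in the compact form, -1 in the classic form.
        return encode_unsigned_varint(0) if flexible else struct.pack(">i", -1)
    if flexible:
        # COMPACT_ARRAY: unsigned varint holding count + 1.
        header = encode_unsigned_varint(len(items) + 1)
    else:
        # ARRAY: big-endian INT32 count.
        header = struct.pack(">i", len(items))
    return header + b"".join(items)
```

The schema entry stays "an array" in both cases; only the `flexible` flag of the API version selects the header.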

sheinbergon · 2026-01-27T07:22:49Z

@vmaurin @ods this is a blocker towards kafka 4.x compatibility right? anything I can do to help here?

vmaurin · 2026-01-27T07:37:15Z

Yes and no. The current version should be compatible with 4.x, as the main Kafka project rolled back its deprecation plans. Still, they might plan to deprecate some message versions in the future, so we should try to stay up to date.

About this MR:
There is already a flexible fields implementation in the current master branch, but it has a bug and it is not very convenient for defining message schemas. My idea with this MR is to make flexible fields easy to define and use, similar to what is done in the official Java client JSON "schemas". It was "ready" to go, but it seems there were some test failures on the latest rebase I did (I am not sure whether they are flaky tests or a real issue).

vmaurin · 2026-04-23T11:54:07Z

@ods A small reminder about this one, let me know if I should make it ready to merge again (it is a bit outdated)

vmaurin force-pushed the simplify_flexible_versions branch from 9289e11 to 8622716 Compare November 27, 2025 20:35

github-advanced-security AI found potential problems Nov 27, 2025

vmaurin marked this pull request as draft November 27, 2025 20:37

vmaurin force-pushed the simplify_flexible_versions branch 3 times, most recently from 31b6d5a to b38742d Compare November 27, 2025 21:27

vmaurin mentioned this pull request Nov 27, 2025

[QUESTION] Support for Apache Kafka 4.0 #1085

Open

vmaurin marked this pull request as ready for review November 27, 2025 21:46

vmaurin force-pushed the simplify_flexible_versions branch from b38742d to c9d49dd Compare December 1, 2025 09:11

ods reviewed Dec 7, 2025

Comment thread aiokafka/protocol/api.py

vmaurin force-pushed the simplify_flexible_versions branch 2 times, most recently from ac55e29 to 7880f3d Compare December 12, 2025 08:43

vmaurin force-pushed the simplify_flexible_versions branch from 7880f3d to 0655add Compare January 5, 2026 14:27

vmaurin force-pushed the simplify_flexible_versions branch from 0655add to 3a7003e Compare February 2, 2026 11:15

Merge branch 'aio-libs:master' into simplify_flexible_versions

7c67eba
