fix: relax JSON normalizer type validation#3906
Conversation
|
@rooperuu thanks a lot for posting your fix. There's a couple of things I would want to do differently when incorporating this into |
|
@Travior Thanks for taking a look! I think we need this for our use case, so I can spend some time on this, or whichever is easiest. Let me know what you would like to change and I can do changes accordingly. |
Travior
left a comment
There was a problem hiding this comment.
In my mind we can have a bit cleaner separation of concerns.
I think ensure_this_normalizer should be the place where we check normalizer compatibility (we should probably treat subclass == compatible for now). Then we need minimal to no changes in extraction or schema handling
| @classmethod | ||
| def ensure_this_normalizer(cls, norm_config: TJSONNormalizer) -> None: | ||
| # make sure schema has right normalizer | ||
| present_normalizer = norm_config["module"] | ||
| if present_normalizer != cls.__module__: | ||
| raise InvalidJsonNormalizer(cls.__module__, present_normalizer) |
There was a problem hiding this comment.
In my mind we should check the "shape" of the normalizer here instead of rigidly checking the module for equality
There was a problem hiding this comment.
I switched to an import and subclass check rather than direct string comparison. Now Superclass.ensure_this_normalizer(subclass_config) should always pass.
|
|
||
| try: | ||
| DataItemNormalizer.ensure_this_normalizer(item_normalizer) | ||
| if issubclass(json_module.DataItemNormalizer, DataItemNormalizer): |
There was a problem hiding this comment.
This should keep the ensure_this_normalizer check.
| @root_key.setter | ||
| def root_key(self, value: bool) -> None: | ||
| data_normalizer = self._schema.data_item_normalizer | ||
| assert isinstance(data_normalizer, RelationalNormalizer) |
There was a problem hiding this comment.
we don't need this assertion if we correctly validate in dlt.common.normalizers.json.relational
|
@Travior Thanks for the review. I think what you're suggesting makes sense. I changed the implementation so that Edit: It seems a similar fix has already been applied in #3808. Maybe this PR can be scrapped. |
Description
Support various operations that previously only worked on the
dlt.common.normalizers.json.relationalnormalizer. This includesdlt.common.normalizers.json.relational_no_coercionor any derived custom normalizers such as:This also fixes
max_nestingconfiguration on these normalizer subtypes.Related Issues