manual state change should not use fork-execute model on scheduler#65677
Draft
mobuchowski wants to merge 3 commits intoapache:mainfrom
Draft
manual state change should not use fork-execute model on scheduler#65677mobuchowski wants to merge 3 commits intoapache:mainfrom
mobuchowski wants to merge 3 commits intoapache:mainfrom
Conversation
Signed-off-by: Maciej Obuchowski <maciej.obuchowski@datadoghq.com>
107b18f to
e0aa3e6
Compare
- Assigning isoformat() string to datetime-typed variable confused mypy now that the assignment is in the outer method scope (previously inside a nested closure where inference behaved differently). - ti.operator is str | None; guard against None before .lower(). Signed-off-by: Maciej Obuchowski <maciej.obuchowski@datadoghq.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
When the Airflow scheduler processes externally-changed task states (orphaned Celery TIs adopted after a scheduler restart, UI/API state changes routed through
process_executor_events → handle_failure), the OpenLineage listener callsos.fork()in_fork_executeto emit the FAIL/COMPLETE event out-of-band.The forked child inherits the scheduler's SSL-wrapped Postgres connection pool and, because the AF3+ branch skipped
configure_orm(disable_connection_pool=True)(guarded byif not AIRFLOW_V_3_0_PLUS:from #47580 to avoid crashing on the worker'sairflow-db-not-allowed:///sentinel URL), the child issues DB queries over the same TLS socket as the parent, potentially desynchronizing the OpenSSL sequence counter and crashing the scheduler's very nextsession.flush()withpsycopg2.OperationalError: SSL error: decryption failed or bad record mac. This happened in our environment.This PR routes the scheduler-side "manual state change" emission through the existing
ProcessPoolExecutorthat DAG-run listeners already use (workers are initialized once via_executor_initializer, never share connections with the scheduler, and don't fork per event).Was generative AI tooling used to co-author this PR?
Generated-by: Claude Opus 4.7 following the guidelines