Prevent concurrent object access #426

Draft
yhabteab wants to merge 4 commits into do-not-persist-events-to-database from prevent-concurrent-object-access

@yhabteab yhabteab commented May 20, 2026

This PR restructures how Objects and Incidents are looked up and created, to prevent some of the problems described in the referenced issues. Previously, event processing worked like this:

  1. For each incoming request, we first synced the Object to the database in its own transaction. While this was happening, the entire objects cache was locked, so any other request targeting a completely different object was blocked until the transaction was committed.
  2. After the object was successfully synced, we proceeded with the rest of the processing, which includes syncing the Incident and all related records in a single, long transaction. While this was happening, the incident was locked, so any other request targeting the same incident was blocked until the ongoing one completed.

This PR changes the logic to the following:

  1. For each incoming request, we first just retrieve the corresponding object from the cache, and if it doesn't exist yet, we simply create a new one in memory without saving it to the database. This way, the objects cache is never locked across database transactions, and any request targeting a completely different object can proceed promptly without being blocked.
  2. After the object is retrieved or created in memory, we use it to look up or create the corresponding incident in the incidents cache. Only once we enter the incident.ProcessEvent function and acquire the lock for that incident do we start a DB tx to sync the incident and all related records, just like before, but this time the object sync is included in the very same tx. A request targeting the same object is blocked in incident.ProcessEvent until the ongoing one completes, while requests for other objects proceed without blocking.
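
To make the new flow concrete, here is a minimal sketch of the two steps above. It is not the code from this PR: `Runtime`, `Lookup`, `DB`, `Tx` and `HandleAll` are hypothetical stand-ins. It also assumes, for simplicity, exactly one incident per object and a commit that never fails.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Object, Tx, DB, Incident and Runtime are simplified, hypothetical
// stand-ins for illustration, not the project's real types.
type Object struct {
	ID     string
	synced bool // guarded by the owning Incident's mutex
}

type Tx struct{ stmts []string }

func (tx *Tx) Exec(stmt string) { tx.stmts = append(tx.stmts, stmt) }

// DB records committed statements so the flow can be inspected.
type DB struct {
	mu        sync.Mutex
	committed []string
}

func (db *DB) Commit(tx *Tx) {
	db.mu.Lock()
	defer db.mu.Unlock()
	db.committed = append(db.committed, tx.stmts...)
}

// Count returns how many committed statements start with prefix.
func (db *DB) Count(prefix string) int {
	db.mu.Lock()
	defer db.mu.Unlock()
	n := 0
	for _, s := range db.committed {
		if strings.HasPrefix(s, prefix) {
			n++
		}
	}
	return n
}

type Incident struct {
	mu  sync.Mutex
	obj *Object
}

// ProcessEvent takes the per-incident lock and only then opens a
// transaction that covers the object, the incident and the event.
func (i *Incident) ProcessEvent(db *DB, event string) {
	i.mu.Lock()
	defer i.mu.Unlock()

	tx := &Tx{}
	if !i.obj.synced {
		tx.Exec("upsert object " + i.obj.ID)
	}
	tx.Exec("upsert incident " + i.obj.ID)
	tx.Exec("insert event " + event)
	db.Commit(tx)
	i.obj.synced = true
}

// Runtime holds both caches. Its mutex guards map access only and is
// never held while a transaction is open.
type Runtime struct {
	mu        sync.Mutex
	objects   map[string]*Object
	incidents map[string]*Incident
}

func NewRuntime() *Runtime {
	return &Runtime{objects: map[string]*Object{}, incidents: map[string]*Incident{}}
}

// Lookup returns the cached incident for the object ID, creating the
// object and the incident in memory if they do not exist yet.
func (r *Runtime) Lookup(id string) *Incident {
	r.mu.Lock()
	defer r.mu.Unlock()
	obj, ok := r.objects[id]
	if !ok {
		obj = &Object{ID: id}
		r.objects[id] = obj
	}
	inc, ok := r.incidents[id]
	if !ok {
		inc = &Incident{obj: obj}
		r.incidents[id] = inc
	}
	return inc
}

// HandleAll processes each {objectID, event} pair in its own goroutine.
func HandleAll(r *Runtime, db *DB, reqs [][2]string) {
	var wg sync.WaitGroup
	for _, req := range reqs {
		wg.Add(1)
		go func(id, event string) {
			defer wg.Done()
			r.Lookup(id).ProcessEvent(db, event)
		}(req[0], req[1])
	}
	wg.Wait()
}

func main() {
	rt, db := NewRuntime(), &DB{}
	HandleAll(rt, db, [][2]string{{"host-a", "down"}, {"host-a", "up"}, {"host-b", "down"}})
	fmt.Println(db.Count("upsert object "), db.Count("insert event ")) // prints "2 3"
}
```

In this sketch the two `host-a` requests serialize on the incident mutex, so the object is written only once, while the `host-b` request never waits on either of them.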

There is one difference between the old and new logic that is worth mentioning. Previously, if we failed to sync the object to the database, it wasn't added to the cache at all, because the cache was only updated after the tx was committed. That's no longer the case, but since incidents are created and cached the same way (even on the main branch), I don't see a better way to handle this without introducing more complexity. If it turns out to be necessary, we can add cleanup logic to remove dead objects and incidents from the cache in a later, separate PR.

resolves #250
resolves #266

Blocked By

yhabteab added 3 commits May 20, 2026 11:39
We don't need to reload them that early, as the request can still fail
before we even start processing it in `incident#ProcessEvent`, so move
the incident recipients restoration code into `ProcessEvent` to avoid
a superfluous DB query in error cases.
The built-in `delete` function can handle nil values gracefully, so the
extra nil check and map lookup aren't necessary.
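
For reference, the Go spec defines `delete` as a no-op when the map is nil or the key is absent. The snippet below only illustrates that language rule; the `Forget` helper and the map used here are hypothetical and not taken from this PR.

```go
package main

import "fmt"

// Forget removes key from cache and returns the resulting size.
// Hypothetical helper, for illustration only.
func Forget(cache map[string]int, key string) int {
	delete(cache, key) // no-op if cache is nil or key is absent
	return len(cache)
}

func main() {
	var nilCache map[string]int
	fmt.Println(Forget(nilCache, "x"))               // 0, no panic on a nil map
	fmt.Println(Forget(map[string]int{"a": 1}, "x")) // 1, missing key is ignored
	fmt.Println(Forget(map[string]int{"a": 1}, "a")) // 0, existing key is removed
}
```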
@cla-bot cla-bot Bot added the cla/signed CLA is signed by all contributors of a PR label May 20, 2026
@yhabteab yhabteab marked this pull request as draft May 20, 2026 10:41