PMM-14661 Fix QAN crashes, make inserts asynchronous #5499
base: main
Changes from 2 commits
909b12d
025ba18
f911a2e
1434e43
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -25,6 +25,12 @@ import ( | |
| const queryTimeout = 30 * time.Second | ||
| // MaxParallelQueries bounds the number of ClickHouse queries a single request may | ||
| // run concurrently. It is kept well below the connection pool size (see maxOpenConns | ||
| // in main.go) so one request cannot monopolize the pool — which is shared with the | ||
| // data ingestion writer — or flood ClickHouse with concurrent scans. | ||
| const MaxParallelQueries = 4 | ||
Contributor
With this, is maxOpenConns enough? It is 10 and is shared with the ingest writer, report Select(), filter queries, and up to 4 parallel sparklines per report. I believe that in this case a few concurrent QAN users could exhaust the pool and time out.
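To make the concern concrete, here is a rough back-of-the-envelope sketch in Go. It assumes, purely for illustration, that each in-flight report holds one connection for Select(), one for a filter query, and four for sparklines at the same instant, and that the ingest writer holds one more. The function `peakConnections` and those per-report counts are hypothetical; only the pool size of 10 and the limit of 4 come from the discussion above.

```go
package main

import "fmt"

// peakConnections estimates the worst-case number of pool connections wanted
// at the same moment. The per-report counts passed in are illustrative
// assumptions, not values measured from QAN.
func peakConnections(users, selectQueries, filterQueries, parallelSparklines, writers int) int {
	perReport := selectQueries + filterQueries + parallelSparklines
	return users*perReport + writers
}

func main() {
	const maxOpenConns = 10 // pool size quoted in the review comment

	for users := 1; users <= 3; users++ {
		need := peakConnections(users, 1, 1, 4, 1)
		fmt.Printf("users=%d need=%d exceeds pool=%t\n", users, need, need > maxOpenConns)
	}
}
```

Under these assumptions a single user fits in the pool, while two concurrent users already want more connections than it offers, so the extra requests would have to wait for a free connection.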
Member
Author
Good catch, this is moot now. I've removed the sparkline parallelization.

Running several sparkline aggregations concurrently multiplies peak ClickHouse memory per report, which works against the OOM/crash fix this PR targets (most acutely on the low-memory profile), and it coupled read concurrency to the shared pool: a burst of reports could starve the single ingest writer.

Sparklines are serial again. The report-latency win will be revisited separately, likely via a pre-aggregated rollup rather than more concurrency.
| var sparklinePointAllFields = []string{ | ||
| "point", | ||
| "timestamp", | ||