Centralize posix_madvise wrapper with errno-aware severity#1109
Open
KimYannn wants to merge 1 commit into
Open
Centralize posix_madvise wrapper with errno-aware severity#1109KimYannn wants to merge 1 commit into
KimYannn wants to merge 1 commit into
Conversation
Five call sites previously logged every posix_madvise failure at
Debug::ERROR severity, even though the API is purely advisory and
every site falls through to continue. On ARM64 kernels with 64KB
pages, a benign tail-page alignment mismatch on mmap'd databases
produces user-visible "posix_madvise returned an error" lines via
stdout/stderr capture in downstream services (e.g. mmseqs gpuserver),
suggesting a fault where none exists.
Introduce Util::madviseLogged(addr, len, advice, context) that:
- returns 0 when len == 0 or HAVE_POSIX_MADVISE is undefined
- logs with strerror() and the named advice on failure
- chooses severity by category:
* SEQUENTIAL hints -> WARNING (always non-functional)
* WILLNEED + EINVAL -> WARNING (alignment / VMA type edge)
* WILLNEED + other errno -> ERROR (EIO / EBADF / ENOMEM)
Fallback POSIX_MADV_* macros are provided in the header so call
sites compile cleanly when HAVE_POSIX_MADVISE is undefined; the
helper is a no-op in that configuration.
Refactor all five sites onto the helper:
- src/commons/Util.cpp (touchMemory)
- src/commons/DBReader.cpp (setSequentialAdvice)
- src/linclust/KmerIndex.h (KmerIndex load)
- src/linclust/kmermatcher.cpp (mergeKmerFilesAndOutput)
- src/prefiltering/Prefiltering.cpp (mergeTargetSplits)
No functional behavior change beyond severity classification and
log message format (now includes advice name and strerror text).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Five call sites in MMseqs2 log every
posix_madvisefailure atDebug::ERRORseverity, even thoughposix_madviseis a purely advisory API and every site falls through to continue execution.On ARM64 kernels with 64KB pages (e.g. NVIDIA Grace, Apple Silicon under some configs), a benign tail-page alignment mismatch on
mmap'd databases produces user-visiblelines via stdout/stderr capture in downstream services (e.g. mmseqs
gpuserver). This suggests a fault where none exists and is confusing for operators triaging real failures.The same call sites also discard
errno, so when a real I/O problem does occur (EIO,EBADF,ENOMEM) there is no way to tell it apart from the benign cases.Change
Introduce
Util::madviseLogged(addr, len, advice, context)incommons/Util.{h,cpp}which:0whenlen == 0orHAVE_POSIX_MADVISEis undefinedstrerror(rc)with the named advice and a caller-supplied context stringSEQUENTIALhints — alwaysWARNING(pure readahead hint, never functional)WILLNEED+EINVAL—WARNING(alignment / VMA-type edge case, manual touch loop still prefaults)WILLNEED+ other errno —ERROR(EIO/EBADF/ENOMEMindicate real problems)Fallback
POSIX_MADV_*macros are provided inUtil.hso call sites compile cleanly whenHAVE_POSIX_MADVISEis undefined; the helper is a no-op in that build configuration.All five sites are refactored onto the helper:
src/commons/Util.cpp(touchMemory)src/commons/DBReader.cpp(setSequentialAdvice)src/linclust/KmerIndex.h(KmerIndexload)src/linclust/kmermatcher.cpp(mergeKmerFilesAndOutput)src/prefiltering/Prefiltering.cpp(mergeTargetSplits)The
#ifdef HAVE_POSIX_MADVISEguards previously wrapping each site are now consolidated inside the helper.Behavior change
No functional change.
Severity classification changes as described above.
Log message format changes: now includes the advice name and
strerror()text.Before:
After (WILLNEED + EINVAL on a 64KB-page ARM64 kernel):
Reproduction of original warning
Run any mmseqs workload that loads a database via
touchMemory(e.g.mmseqs gpuserver) on an ARM64 host with 64KB pages and a database whose mapped length is not a multiple of the page size. The previous build emitsDebug::ERROR; this build emitsDebug::WARNINGwith errno context.