Topology-aware default pool sizing for PinnedMemoryResource #1012

rapids-bot[bot] merged 22 commits into rapidsai:main from
Signed-off-by: niranda perera <niranda.perera@gmail.com>
Force-pushed from ce335cb to 7bbd171
```cpp
std::uint64_t get_host_memory_per_gpu() {
    auto const num_gpus = get_topology().num_gpus;
    return get_total_host_memory() / std::max<std::uint64_t>(1, num_gpus);
}
```
Can you double-check how get_total_host_memory behaves on devices where GPU memory is also exposed as NUMA nodes? In those cases, `free` will also show GPU memory as part of system memory, which may break the assumption that get_total_host_memory reports only system memory.
IIRC, the conditions for GPU memory to be exposed as system memory are:
- Coherent system (e.g., Grace-Hopper/Grace-Blackwell)
- NVIDIA driver in NUMA mode (not CDMM)
- HMM enabled on the kernel
Ah! this is a good point.
Nice that there was a simple solution. I also hadn't thought of the bigger picture of the problem: we cannot just take the whole host memory and divide it equally among GPUs, since their NUMA nodes may have different amounts of memory. It looks like your new solution already ensures that.
madsbk
left a comment
Looks good, I only have minor stuff
Co-authored-by: Mads R. B. Kristensen <madsbk@gmail.com>
Signed-off-by: niranda perera <niranda.perera@gmail.com>
/merge
pentschev
left a comment
Please update the description to reflect recent changes, @nirandaperera.
Temporarily blocking to ensure the PR description and the Python get_host_memory_per_gpu() are updated.
I think there's some partial GH UI breakage, I cannot comment on the files now for some reason, but
```cpp
using rapidsmpf::safe_cast;
```
```cpp
// When the RAPIDSMPF_SMOKE_TEST_MODE env var is set (to any value), each
```
This is a bad pattern. Someone will almost certainly attempt RAPIDSMPF_SMOKE_TEST_MODE=0 one day and be confused when it still enables smoke-test mode. A better approach would be adding a CLI argument, like all the other bench_* binaries we have.
@pentschev Well, this is a bit tricky. Passing custom args might not work here, because in google benchmark, Apply(...) runs during static initialization. So to make it work, we would have to parse args before registering benchmarks. That means we would need a custom main(), parse the smoke-test arg, and then move all benchmark registrations after that. I'd rather keep it simple like this.
```cpp
// We use an env var rather than a CLI flag because google-benchmark's
// BENCHMARK(...)->Apply(...) macros run during static initialization, before
// main() has a chance to parse argv. A CLI-flag approach would require moving
// every benchmark registration into main() (via benchmark::RegisterBenchmark),
// which is more invasive. std::getenv works fine during static init.
```
/merge
jameslamb
left a comment
Approving for ci-codeowners / packaging-codeowners; the changes that pulled in those groups look non-controversial.
Replace the hardcoded `pinned_initial_pool_size = 0` / `pinned_max_pool_size = unbounded` defaults with values scaled to the system's GPU count.

Changes

- `get_host_memory_per_gpu()`: new utility in `system_info` that returns (current NUMA node total host memory) / (N GPUs in current NUMA node).
- `PinnedMemoryResource`: two new named constants drive the defaults:
  - `DefaultInitiPoolSizeFactor = "10%"` → initial pool = 10% of per-GPU host memory
  - `DefaultMaxPoolSizeFactor = "80%"` → max pool = 80% of per-GPU host memory
  Also corrects the documented default from `false` to `true`.
- Python bindings: exposes `get_host_memory_per_gpu()` via `system_info.pyx`/`.pyi`.
- Tests: `PinnedResource.from_default_options` verifies pool sizes match the factor-scaled values when no options are provided.
- Benchmark: benchmark initialization time against initial pool size - Results
- Docs: `configuration.md` updated to reflect the new defaults.