Pull requests: NVIDIA/cutlass
Pull requests list
Fix cross-execution-space error: remove CUTLASS_HOST_DEVICE from CudaHostAdapter::memsetDevice
#3286
opened May 29, 2026 by
alexngUNC
NFC: Replace deprecated cute.make_fragment with cute.make_rmem_tensor
#3285
opened May 29, 2026 by
brandon-yujie-sun
Collaborator
[SM120] Add ptr-array TMA collective for tensor/token-scaled FP8 grouped GEMM
#3280
opened May 27, 2026 by
tgmerritt
fix: missing PDL wait on main_sf_load in sm103 blockscaled GEMM
#3279
opened May 27, 2026 by
tianyuxbear
Fix the ScatterD issue in predicated_tile_iterator
#3278
opened May 27, 2026 by
pengpeng-yu
[CuTeDSL] Fix the incorrect import of from_dlpack from docs
#3276
opened May 26, 2026 by
kainzhong
[CuTeDSL] Lower scalar Float16/BFloat16 load through Uint16+bitcast
#3267
opened May 24, 2026 by
cheshire
Contributor
fix: handle ComposedLayout slicing with dynamic strides (fixes #3255)
#3261
opened May 22, 2026 by
zhils
[DOC] Drop empty duplicated Numeric Conversion code block from fundamental_types.md
#3260
opened May 22, 2026 by
adityasingh2400
3 tasks done
fix(CuTeDSL): restore trailing Int<1> dimension in SM90 MMA atom TV L…
#3258
opened May 21, 2026 by
zhils
[Cutlass SM90] Per-group aux TMA descriptor update for grouped GEMM + Gated-SwiGLU example)
#3256
opened May 21, 2026 by
Butterfingrz
test: add CPU-only unit coverage for sharding helpers
#3250
opened May 19, 2026 by
Pritiks23
fix(base_dsl): drop ArchMeta alias so Arch.sm_*.value is correct
#3248
opened May 19, 2026 by
lingolin128
Fix FastDivmod divisor SSA transport for kernel regions (#3243)
#3246
opened May 19, 2026 by
zhils
[cutlass-library] Alias cutlass_lib to the static target when shared is off (fixes #3179)
#3245
opened May 18, 2026 by
LeSingh1
[CuTe] Add missing include for smem_ptr_flag_bits in print_tensor.hpp
#3244
opened May 18, 2026 by
LeSingh1
[fast_math] Add bfloat16_t PTX specializations for fast_exp and fast_tanh
#3242
opened May 16, 2026 by
VittoriaLanzo
5 tasks done
Fix MSVC CUDA build: is_unsigned_v not available in cutlass::platform
#3229
opened May 12, 2026 by
TxsharDev
[example Cute C++]Add CuTe C++ tutorial for Blackwell MXFP8 block-scaled GEMM.
#3225
opened May 11, 2026 by
haowen-han