
[Feature] Add max-inflight guard for remote policy clients#3897

Draft
vmoens wants to merge 1 commit into gh/vmoens/292/base from gh/vmoens/292/head

[ghstack-poisoned]
@pytorch-bot

pytorch-bot Bot commented Jun 21, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3897

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 12 New Failures, 1 Cancelled Job

As of commit bf71148 with merge base d7ef78b:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions

Benchmark Results: PR bf71148c vs main 828e28a9

Benchmark run: https://github.com/pytorch/rl/actions/runs/27896116292

Higher ops/sec is better. Tables are sorted by largest absolute change.
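
As a rough illustration of how the Change column and the 5% regression/improvement counts can be read, here is a minimal sketch. It assumes the change is the relative difference of PR ops/sec against main ops/sec; the function names `percent_change` and `classify` are hypothetical and not taken from the benchmark workflow. Values recomputed from the rounded figures shown in the tables may differ from the reported percentages in the last digit.

```python
def percent_change(main_ops: float, pr_ops: float) -> float:
    """Relative change of PR throughput versus main, in percent (assumed formula)."""
    return (pr_ops - main_ops) / main_ops * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Bucket a change using the 5% threshold quoted in the summary lines."""
    if change_pct > threshold_pct:
        return "improvement"  # higher ops/sec is better
    if change_pct < -threshold_pct:
        return "regression"
    return "within threshold"


# A few (name, main ops/sec, PR ops/sec) rows copied from the CPU table.
rows = [
    ("test_collectors_benchmark.py::test_sync", 16.65, 17.18),
    ("test_objectives_benchmarks.py::test_ppo_speed[True-backward]", 91.62, 115.08),
    ("test_envs_benchmark.py::test_parallel", 0.9915, 0.9312),
]

# Order rows the way the tables are ordered: largest absolute change first.
ranked = sorted(rows, key=lambda r: abs(percent_change(r[1], r[2])), reverse=True)

for name, main_ops, pr_ops in ranked:
    change = percent_change(main_ops, pr_ops)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Swings of several hundred percent in either direction on near-identical `test_rb_populate` configurations are worth re-running before treating them as real effects of the PR.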

CPU

Compared 187 benchmarks. Regressions over 5%: 9. Improvements over 5%: 18.

Benchmark main ops PR ops Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 36.61 201.26 +449.72%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1,981 461.13 -76.73%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 192.81 56.76 -70.56%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] 91.62 115.08 +25.61%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 2,971 3,659 +23.17%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 826.24 987.87 +19.56%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1,980 2,357 +19.02%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,909 3,430 +17.90%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,679 3,077 -16.38%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 435.14 504.05 +15.84%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 722.72 827.73 +14.53%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 33.26 28.58 -14.08%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 678.58 765.12 +12.75%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3,346 2,941 -12.11%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 377.42 419.18 +11.06%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3,116 2,843 -8.76%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] 227.55 247.29 +8.67%
benchmarks/test_envs_benchmark.py::test_simple 1.7155 1.8394 +7.23%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1,108 1,032 -6.86%
benchmarks/test_envs_benchmark.py::test_parallel 0.9915 0.9312 -6.08%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] 395.91 419.98 +6.08%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,251 2,382 +5.84%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[reduce-overhead-None] 254.66 268.84 +5.57%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-True] 21,856 20,642 -5.56%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 23,335 24,589 +5.38%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[reduce-overhead-None] 217.63 229.08 +5.26%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] 385.97 405.45 +5.05%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-True] 1.4067 1.3437 -4.48%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2,189 2,285 +4.40%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[pickle] 12,032 12,554 +4.34%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,247 3,108 -4.30%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 25.08 26.16 +4.30%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape1-atari] 273.71 284.73 +4.02%
benchmarks/test_envs_benchmark.py::test_transformed 0.8961 0.9313 +3.93%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] 36,053 34,657 -3.87%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[reduce-overhead-None] 1,800 1,869 +3.82%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-False-0-gru] 1.3170 1.3654 +3.68%
benchmarks/test_objectives_benchmarks.py::test_values[td1_return_estimate-False-False] 37.57 36.20 -3.65%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-backward] 129.66 134.36 +3.63%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-backward] 92.75 89.58 -3.42%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,063 2,960 -3.37%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,945 2,849 -3.27%
benchmarks/test_collectors_benchmark.py::test_sync 16.65 17.18 +3.21%
benchmarks/test_envs_benchmark.py::test_serial 0.5737 0.5920 +3.17%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 541.40 558.53 +3.16%
benchmarks/test_collectors_benchmark.py::test_single 8.8167 9.0936 +3.14%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-memmap_cpu_storage_cpu... 80.84 83.29 +3.04%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 752.06 774.58 +3.00%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-None] 207.67 213.60 +2.86%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[untyped_storage] 8.9122 8.6582 -2.85%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[200-img_shape3-large_batch] 139.60 143.54 +2.83%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[True-None] 224.99 231.25 +2.78%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-backward] 84.28 81.94 -2.77%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-False-0-lstm] 0.8517 0.8751 +2.75%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] 675.88 694.35 +2.73%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 23.51 24.15 +2.73%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-True] 39,964 38,887 -2.69%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[200-img_shape3-large_batch] 308.34 316.54 +2.66%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 197.08 202.29 +2.64%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-False] 43,192 44,334 +2.64%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] 57,453 58,962 +2.63%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-lstm] 2.0105 2.0630 +2.61%
benchmarks/test_objectives_benchmarks.py::test_values[vec_td_lambda_return_estimate-True-False] 55.75 54.30 -2.60%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-True] 24,544 23,921 -2.54%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] 3,555 3,469 -2.41%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-backward] 133.86 137.08 +2.40%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] 635.95 651.21 +2.40%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-True] 23,211 22,677 -2.30%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-False] 57,318 56,016 -2.27%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-True] 19,982 20,432 +2.25%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-None] 469.00 478.96 +2.12%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-backward] 63.44 62.10 -2.12%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] 170.36 173.95 +2.11%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[reduce-overhead-None] 84.41 82.65 -2.08%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] 77,909 79,496 +2.04%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[50-img_shape0-small] 7,362 7,214 -2.02%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 0.2154 0.2197 +2.00%
benchmarks/test_objectives_benchmarks.py::test_values[vec_generalized_advantage_estimate-True-True] 55.37 54.26 -2.00%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] 125.29 127.77 +1.98%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-None] 90.88 89.09 -1.98%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[reduce-overhead-None] 696.91 710.44 +1.94%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] 280.56 285.85 +1.88%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-None] 124.48 122.14 -1.88%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[100-img_shape0-atari] 30.23 30.77 +1.80%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-backward] 57.96 59.00 +1.79%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 52.95 53.90 +1.79%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-False] 30,015 30,550 +1.78%
benchmarks/test_objectives_benchmarks.py::test_values[td_lambda_return_estimate-True-False] 25.40 24.95 -1.77%
benchmarks/test_objectives_benchmarks.py::test_values[vec_td1_return_estimate-False-False] 55.23 54.26 -1.76%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-backward] 252.03 247.59 -1.76%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-None] 179.59 176.47 -1.74%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 657.40 668.65 +1.71%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] 1,776 1,806 +1.69%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-None] 115.62 117.57 +1.68%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 2,214 2,250 +1.62%
benchmarks/test_collectors_benchmark.py::test_single_with_rb 8.6701 8.8100 +1.61%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 167.20 169.83 +1.57%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[reduce-overhead-None] 562.43 571.24 +1.57%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 0.5397 0.5314 -1.53%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[reduce-overhead-None] 115.98 117.75 +1.53%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-True] 21,794 21,463 -1.52%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-False] 50,496 51,253 +1.50%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[200-img_shape3-large_batch] 324.61 329.29 +1.44%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-False] 50,943 51,675 +1.44%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 0.6296 0.6206 -1.44%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-False] 45,645 46,260 +1.35%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-gru] 2.9879 3.0274 +1.32%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-gru] 4.2439 4.2990 +1.30%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sampler_sample_scale[1000000-cpu] 96.51 97.75 +1.29%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 168.37 170.47 +1.25%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[50-img_shape0-small] 882.70 893.61 +1.24%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[torch.save] 7,130 7,045 -1.19%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 383,429 379,173 -1.11%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-True] 43,385 42,910 -1.09%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 48.55 49.08 +1.08%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-True] 31,804 31,464 -1.07%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[False-None] 97.01 96.00 -1.05%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-True-0-gru] 1.4212 1.4357 +1.02%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[reduce-overhead-None] 487.30 482.48 -0.99%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[200-img_shape1-large_batch] 13.64 13.51 -0.97%
Showing 120 of 187 comparisons, sorted by absolute change.

GPU

Compared 197 benchmarks. Regressions over 5%: 15. Improvements over 5%: 9.

Benchmark main ops PR ops Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 43.00 191.51 +345.41%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 193.50 38.11 -80.31%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 28.09 47.14 +67.83%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] 82.63 105.49 +27.65%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,671 3,383 +26.64%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,653 2,718 -25.58%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3,370 2,512 -25.46%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 2,838 3,426 +20.74%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3,372 2,817 -16.45%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,001 3,486 +16.14%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 2,787 3,180 +14.10%
benchmarks/test_collectors_benchmark.py::test_single_with_rb_pixels 5.3278 4.6386 -12.94%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 450.85 399.89 -11.30%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape2-large_img] 521.71 464.47 -10.97%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 887.92 969.97 +9.24%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1,994 1,822 -8.62%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1,837 1,984 +7.96%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 780.92 719.02 -7.93%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 160.35 148.92 -7.13%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 722.88 672.68 -6.94%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-True] 41,740 39,221 -6.04%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-True] 37,726 35,662 -5.47%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 497.56 471.91 -5.16%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-lstm] 21.06 19.99 -5.08%
benchmarks/test_envs_benchmark.py::test_simple 1.2217 1.1613 -4.95%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-True] 19,992 19,006 -4.93%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-gru] 22.35 21.31 -4.66%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1,915 2,001 +4.50%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-True] 38,215 36,556 -4.34%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-backward] 266.18 276.68 +3.94%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-True] 28,703 27,609 -3.81%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.80 20.98 -3.76%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 518.86 499.78 -3.68%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-True] 20,861 20,111 -3.60%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.92 22.14 -3.39%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb_cuda[200-img_shape1-large_batch] 8.7559 8.4621 -3.36%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 23,856 23,066 -3.31%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-None] 736.89 712.53 -3.31%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] 169.31 163.78 -3.26%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-None] 668.74 690.19 +3.21%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-True] 19,841 19,207 -3.19%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 981.52 950.33 -3.18%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[100-img_shape0-atari] 30.32 29.41 -3.02%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[200-img_shape1-large_batch] 13.52 13.11 -3.02%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[200-img_shape1-large_batch] 8.4166 8.1645 -2.99%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-gru] 46.95 48.35 +2.97%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-True] 18,771 18,215 -2.97%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] 213.20 219.44 +2.93%
benchmarks/test_objectives_benchmarks.py::test_values[generalized_advantage_estimate-True-True] 47.08 45.71 -2.90%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-True] 18,193 17,673 -2.86%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-False] 34,657 33,690 -2.79%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] 707.46 687.75 -2.79%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] 34,343 33,393 -2.77%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 742.46 762.60 +2.71%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] 4,326 4,211 -2.67%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[50-img_shape0-small] 6,067 5,905 -2.66%
benchmarks/test_collectors_benchmark.py::test_single_pixels 6.2465 6.0817 -2.64%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-False] 27,568 26,866 -2.55%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[100-img_shape0-atari] 16.73 16.32 -2.49%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-False] 34,593 33,741 -2.46%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] 1,906 1,859 -2.44%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] 63,951 62,417 -2.40%
benchmarks/test_objectives_benchmarks.py::test_values[vec_generalized_advantage_estimate-True-True] 303.33 310.58 +2.39%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-None] 97.16 94.85 -2.38%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-False] 38,483 37,580 -2.35%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb_cuda[100-img_shape0-atari] 17.50 17.09 -2.33%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[200-img_shape3-large_batch] 325.52 318.11 -2.28%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[100-img_shape0-atari] 26.40 25.80 -2.27%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[200-img_shape1-large_batch] 15.35 15.00 -2.24%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 52.20 51.05 -2.21%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 980.56 958.91 -2.21%
benchmarks/test_objectives_benchmarks.py::test_values[td1_return_estimate-False-False] 19.80 19.37 -2.17%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[reduce-overhead-None] 795.70 812.70 +2.14%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] 63,923 62,595 -2.08%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-cuda_storage_cuda_samp... 1,467 1,438 -1.92%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] 345.02 351.53 +1.89%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-False] 31,869 31,288 -1.82%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 0.6875 0.6752 -1.79%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 0.5972 0.5865 -1.79%
benchmarks/test_objectives_benchmarks.py::test_values[td_lambda_return_estimate-True-False] 12.00 11.79 -1.78%
benchmarks/test_collectors_benchmark.py::test_single 6.6925 6.5742 -1.77%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] 404.75 397.65 -1.75%
benchmarks/test_collectors_benchmark.py::test_single_with_rb 5.9239 5.8226 -1.71%
benchmarks/test_objectives_benchmarks.py::test_values[td0_return_estimate-False-False] 11,426 11,236 -1.67%
benchmarks/test_envs_benchmark.py::test_transformed 0.6908 0.7023 +1.66%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-backward] 357.87 363.36 +1.53%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 466.93 473.61 +1.43%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 0.6045 0.5960 -1.41%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 164.37 166.60 +1.36%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-True] 29,995 29,597 -1.33%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-False] 29,255 28,875 -1.30%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-None] 807.63 818.02 +1.29%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-backward] 70.70 69.79 -1.28%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1,245 1,229 -1.27%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 160.06 162.09 +1.27%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[False-backward] 77.98 78.96 +1.25%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1,292 1,276 -1.25%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] 32,136 31,737 -1.24%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[reduce-overhead-None] 1,863 1,886 +1.23%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[False-None] 109.01 110.35 +1.23%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-None] 53.25 52.61 -1.22%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[untyped_storage] 7.8932 7.7986 -1.20%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 0.5956 0.5887 -1.15%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-None] 271.40 268.33 -1.13%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-backward] 129.71 131.16 +1.11%
benchmarks/test_objectives_benchmarks.py::test_values[vec_td1_return_estimate-False-False] 841.90 832.56 -1.11%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] 654.73 647.69 -1.08%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 0.5151 0.5096 -1.07%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-None] 415.83 411.40 -1.07%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[reduce-overhead-None] 107.53 106.41 -1.05%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-False] 1.5904 1.5739 -1.04%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 160.06 161.71 +1.03%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-None] 739.45 732.06 -1.00%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-True] 1.3081 1.2950 -1.00%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-None] 96.96 96.01 -0.98%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-True] 29,829 30,114 +0.96%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[reduce-overhead-None] 814.52 822.12 +0.93%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 53.42 52.93 -0.91%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sampler_sample_scale[1000000-cuda] 2,205 2,224 +0.85%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[50-img_shape0-small] 860.09 867.32 +0.84%
Showing 120 of 197 comparisons, sorted by absolute change.

Labels

CLA Signed, Collectors, Feature, Integrations/torch_geometric, Integrations, Modules
