[Refactor] Add structured inference server config objects#3893

Draft
vmoens wants to merge 1 commit into gh/vmoens/288/base from gh/vmoens/288/head

Conversation

[ghstack-poisoned]
@pytorch-bot

pytorch-bot Bot commented Jun 21, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3893

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 12 New Failures, 1 Cancelled Job

As of commit 7201352 with merge base d7ef78b:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions

Contributor

Benchmark Results: PR 7201352f vs main 7fef8524

Benchmark run: https://github.com/pytorch/rl/actions/runs/27896111333

Higher ops/sec is better. Tables are sorted by largest absolute change.
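
As a rough illustration of how the Change column and the 5% regression/improvement counts relate to the ops/sec figures, here is a minimal Python sketch. The formula `(PR - main) / main` and the helper names `percent_change` and `classify` are assumptions made for illustration, not the benchmark workflow's actual code. The sample values are copied from the CPU table; because the table shows rounded ops/sec, recomputed percentages can differ from the reported ones in the last digit.

```python
def percent_change(main_ops: float, pr_ops: float) -> float:
    """Relative change of the PR against main, in percent (assumed formula)."""
    return (pr_ops - main_ops) / main_ops * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change using the 'over 5%' cutoff quoted in the summary lines."""
    if change_pct < -threshold:
        return "regression"
    if change_pct > threshold:
        return "improvement"
    return "neutral"


# (benchmark, main ops/sec, PR ops/sec), taken from the CPU table.
rows = [
    ("test_iql_speed[reduce-overhead-None]", 113.13, 118.75),
    ("test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]", 36.45, 200.10),
    ("test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000]", 1993.0, 453.33),
]

# Sort by largest absolute change, mirroring the ordering of the tables.
results = [(name, percent_change(main, pr)) for name, main, pr in rows]
results.sort(key=lambda item: abs(item[1]), reverse=True)

for name, change in results:
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Since higher ops/sec is better, a negative change beyond the threshold counts as a regression and a positive one as an improvement.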

CPU

Compared 187 benchmarks. Regressions over 5%: 13. Improvements over 5%: 21.

Benchmark  main (ops/sec)  PR (ops/sec)  Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 36.45 200.10 +448.96%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1,993 453.33 -77.26%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 190.46 55.60 -70.81%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,610 3,245 +24.34%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,888 3,585 +24.15%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,713 2,999 -19.24%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] 223.30 257.12 +15.14%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] 1,603 1,836 +14.50%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1,997 2,282 +14.28%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 915.31 785.37 -14.20%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,710 3,192 -13.96%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,900 3,301 +13.80%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 32.98 29.40 -10.85%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] 5,296 4,733 -10.65%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 855.30 764.72 -10.59%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 913.22 1,003 +9.83%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 504.08 552.01 +9.51%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 2,843 3,107 +9.29%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3,201 3,485 +8.86%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 797.25 730.97 -8.31%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,471 3,733 +7.56%
benchmarks/test_envs_benchmark.py::test_simple 1.7221 1.8500 +7.42%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 447.11 480.24 +7.41%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] 108.97 116.73 +7.12%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-False] 36,909 34,438 -6.70%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2,172 2,315 +6.57%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] 628.18 666.07 +6.03%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 549.64 516.89 -5.96%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-gru] 3.1237 2.9396 -5.90%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] 386.74 408.72 +5.68%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-None] 680.75 719.25 +5.65%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 23,224 24,513 +5.55%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] 409.53 430.94 +5.23%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-lstm] 2.0631 1.9583 -5.08%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[reduce-overhead-None] 113.13 118.75 +4.97%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-False-0-gru] 1.3733 1.3079 -4.76%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] 689.95 722.60 +4.73%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,308 2,412 +4.52%
benchmarks/test_envs_benchmark.py::test_transformed 0.8897 0.9283 +4.33%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape1-atari] 270.98 282.33 +4.19%
benchmarks/test_objectives_benchmarks.py::test_values[td1_return_estimate-False-False] 37.89 39.47 +4.17%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-backward] 61.97 64.54 +4.14%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-gru] 4.2809 4.1071 -4.06%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[torch.save] 6,981 7,253 +3.89%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-False] 32,597 31,332 -3.88%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-False] 35,054 33,763 -3.68%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-None] 274.48 284.55 +3.67%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[pickle] 11,932 12,365 +3.63%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-None] 114.91 118.95 +3.52%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-True-0-gru] 1.4442 1.3943 -3.45%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 47.96 49.58 +3.36%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-None] 177.05 182.84 +3.27%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-None] 89.24 92.14 +3.25%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] 3,505 3,613 +3.10%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[False-backward] 57.85 56.07 -3.09%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] 64,614 66,607 +3.09%
benchmarks/test_objectives_benchmarks.py::test_values[td_lambda_return_estimate-True-False] 25.94 26.73 +3.08%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] 58,470 56,700 -3.03%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[reduce-overhead-None] 479.79 493.82 +2.93%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-None] 161.83 166.54 +2.91%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 24.50 25.20 +2.88%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sampler_sample_scale[1000000-cpu] 97.89 95.07 -2.87%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] 64,487 62,646 -2.85%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-None] 287.51 295.70 +2.85%
benchmarks/test_envs_benchmark.py::test_serial 0.5826 0.5989 +2.81%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] 4,401 4,523 +2.79%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-False] 28,754 29,551 +2.77%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-False] 43,072 41,883 -2.76%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-False] 27,116 27,863 +2.76%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 2,318 2,382 +2.75%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-True] 38,387 37,343 -2.72%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] 122.11 125.42 +2.71%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[reduce-overhead-None] 1,832 1,881 +2.71%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 53.40 54.83 +2.67%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-False] 32,383 31,540 -2.60%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[generalized_advantage_estimate-False-1-512] 115.57 118.52 +2.55%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] 57.15 58.61 +2.55%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 194.21 198.99 +2.46%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[reduce-overhead-None] 239.03 233.18 -2.45%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] 33,010 32,204 -2.44%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-True] 19,973 19,498 -2.38%
benchmarks/test_objectives_benchmarks.py::test_values[generalized_advantage_estimate-True-True] 102.00 104.32 +2.28%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[untyped_storage] 8.6382 8.4466 -2.22%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] 177.15 173.24 -2.21%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-True] 17,881 18,271 +2.18%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-backward] 29.17 28.54 -2.15%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-None] 475.82 486.05 +2.15%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-backward] 83.76 85.56 +2.15%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1,094 1,118 +2.14%
benchmarks/test_envs_benchmark.py::test_parallel 0.9934 0.9724 -2.11%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-True] 30,872 30,235 -2.06%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[200-img_shape3-large_batch] 308.98 315.17 +2.00%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 365,873 373,202 +2.00%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[200-img_shape3-large_batch] 331.31 337.93 +2.00%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 418.27 426.48 +1.96%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-None] 212.21 216.37 +1.96%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] 77,739 76,243 -1.92%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-True] 21,200 20,799 -1.89%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-backward] 132.24 134.73 +1.89%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-None] 260.95 265.85 +1.88%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-True] 42,536 41,783 -1.77%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[reduce-overhead-None] 333.46 339.33 +1.76%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] 293.32 288.20 -1.75%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-backward] 94.40 92.78 -1.72%
benchmarks/test_collectors_benchmark.py::test_single_with_rb 8.6479 8.7959 +1.71%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[200-img_shape3-large_batch] 140.21 142.61 +1.71%
benchmarks/test_collectors_benchmark.py::test_async 17.62 17.92 +1.69%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-True] 20,194 19,863 -1.64%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[200-img_shape1-large_batch] 15.27 15.52 +1.64%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 23.92 24.30 +1.61%
benchmarks/test_objectives_benchmarks.py::test_values[td0_return_estimate-False-False] 7,981 8,107 +1.59%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 661.99 672.45 +1.58%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[100-img_shape0-atari] 30.04 30.51 +1.57%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 767.57 756.39 -1.46%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-True-0-lstm] 0.9538 0.9401 -1.45%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-False] 44,221 44,855 +1.43%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-False] 50,358 49,639 -1.43%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-backward] 79.01 80.09 +1.35%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[reduce-overhead-None] 287.95 291.80 +1.34%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-lstm] 3.0886 3.1273 +1.25%
Showing 120 of 187 comparisons, sorted by absolute change.

GPU

Compared 197 benchmarks. Regressions over 5%: 12. Improvements over 5%: 10.

Benchmark  main (ops/sec)  PR (ops/sec)  Change
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] 2,832 4,081 +44.09%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[reduce-overhead-None] 75.28 107.87 +43.29%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] 82.88 105.68 +27.51%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3,458 2,701 -21.89%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,846 3,468 +21.85%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2,242 1,778 -20.69%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,114 3,686 +18.37%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,125 3,643 +16.58%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1,011 854.15 -15.54%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 699.17 787.31 +12.61%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3,275 2,863 -12.56%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 44.08 38.63 -12.37%
benchmarks/test_collectors_benchmark.py::test_single_with_rb_pixels 5.3818 4.7720 -11.33%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,368 3,013 -10.52%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,937 2,633 -10.37%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,126 2,847 -8.91%
benchmarks/test_objectives_benchmarks.py::test_values[vec_generalized_advantage_estimate-True-True] 281.32 306.21 +8.85%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 737.78 792.09 +7.36%
benchmarks/test_collectors_benchmark.py::test_sync_pixels 10.39 9.7375 -6.24%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] 721.57 680.88 -5.64%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[200-img_shape3-large_batch] 135.00 142.34 +5.43%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 760.47 719.76 -5.35%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[reduce-overhead-None] 1,820 1,908 +4.80%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 477.80 500.55 +4.76%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] 177.57 169.13 -4.75%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] 395.89 377.58 -4.63%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 474.65 453.18 -4.52%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 479.63 499.87 +4.22%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,014 1,930 -4.20%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] 668.97 641.88 -4.05%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-True] 20,183 19,369 -4.03%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] 350.63 336.68 -3.98%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-False] 34,939 33,558 -3.95%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2,106 2,023 -3.90%
benchmarks/test_envs_benchmark.py::test_simple 1.2553 1.2083 -3.75%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-cuda_storage_cuda_samp... 1,501 1,446 -3.63%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-True] 37,242 38,489 +3.35%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 742.24 717.94 -3.27%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-backward] 240.48 233.16 -3.04%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-False] 45,547 44,171 -3.02%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1,016 986.23 -2.95%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-True] 19,088 18,528 -2.93%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-False] 27,853 27,045 -2.90%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.57 22.19 +2.89%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[False-backward] 80.11 77.96 -2.68%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 23.83 23.24 -2.49%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 475.34 486.92 +2.44%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 1,313 1,345 +2.41%
benchmarks/test_envs_benchmark.py::test_transformed 0.7022 0.7191 +2.40%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-True] 36,725 37,583 +2.34%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-cuda_storage_cpu_sampler] 88.21 90.18 +2.24%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1,320 1,292 -2.08%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-True] 23,646 24,117 +1.99%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[reduce-overhead-None] 795.81 811.59 +1.98%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-None] 338.95 345.56 +1.95%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] 4,395 4,309 -1.95%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[reduce-overhead-None] 87.91 89.60 +1.92%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 884.88 901.75 +1.91%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[100-img_shape0-atari] 30.90 30.33 -1.84%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 186.67 190.07 +1.82%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[generalized_advantage_estimate-False-1-512] 49.13 48.24 -1.82%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-True] 20,845 20,468 -1.81%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 52.59 51.66 -1.77%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 196.36 192.95 -1.74%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[200-img_shape1-large_batch] 8.5242 8.3784 -1.71%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] 63,808 62,717 -1.71%
benchmarks/test_objectives_benchmarks.py::test_values[td1_return_estimate-False-False] 20.53 20.19 -1.67%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[reduce-overhead-None] 43.64 42.93 -1.62%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb_cuda[200-img_shape1-large_batch] 8.8273 8.6849 -1.61%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[50-img_shape0-small] 5,968 5,874 -1.58%
benchmarks/test_envs_benchmark.py::test_parallel 0.5461 0.5377 -1.55%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] 57,727 56,836 -1.54%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 48.19 48.93 +1.54%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 372,714 378,427 +1.53%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-False] 31,986 31,507 -1.50%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] 1,891 1,919 +1.45%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 0.6085 0.6172 +1.43%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 0.2103 0.2133 +1.42%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape1-atari] 279.04 275.07 -1.42%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[200-img_shape3-large_batch] 333.65 328.92 -1.42%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[100-img_shape0-atari] 16.83 16.60 -1.38%
benchmarks/test_objectives_benchmarks.py::test_values[generalized_advantage_estimate-True-True] 49.08 48.42 -1.35%
benchmarks/test_collectors_benchmark.py::test_async_pixels 10.63 10.76 +1.28%
benchmarks/test_objectives_benchmarks.py::test_values[td_lambda_return_estimate-True-False] 12.38 12.22 -1.26%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 162.25 164.27 +1.24%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] 217.20 219.88 +1.23%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] 64,410 63,625 -1.22%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-None] 113.53 112.18 -1.19%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-backward] 344.24 348.31 +1.18%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] 390.31 394.79 +1.15%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-backward] 245.62 242.93 -1.10%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-backward] 40.83 40.38 -1.09%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-None] 821.43 830.15 +1.06%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-False] 34,846 34,491 -1.02%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 23,711 23,952 +1.02%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] 33,083 32,748 -1.01%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[reduce-overhead-None] 104.54 105.59 +1.00%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[torch.save] 7,108 7,179 +1.00%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 167.41 169.06 +0.98%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-False] 29,082 28,798 -0.98%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] 3,561 3,526 -0.97%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-backward] 272.88 270.28 -0.95%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-False] 38,591 38,224 -0.95%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 0.5359 0.5409 +0.93%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-backward] 69.53 68.88 -0.93%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 165.37 166.89 +0.92%
benchmarks/test_envs_benchmark.py::test_serial 0.4240 0.4276 +0.85%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-backward] 258.74 260.90 +0.84%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-True] 28,514 28,746 +0.81%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 54.01 54.44 +0.78%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb_cuda[100-img_shape0-atari] 17.67 17.53 -0.78%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] 407.16 404.09 -0.75%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-False] 31,954 31,715 -0.75%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 162.09 163.27 +0.72%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[reduce-overhead-None] 865.37 871.57 +0.72%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-False] 1.6054 1.6159 +0.66%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[200-img_shape1-large_batch] 15.22 15.12 -0.62%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-None] 770.25 765.49 -0.62%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[reduce-overhead-None] 130.97 131.77 +0.62%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[reduce-overhead-None] 829.75 834.77 +0.60%
Showing 120 of 197 comparisons, sorted by absolute change.

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Collectors Documentation Improvements or additions to documentation Integrations/torch_geometric Integrations Modules Refactoring Refactoring of an existing feature
