Commit a8b8a0b

Add plot for windows too
1 parent 44da550 commit a8b8a0b

5 files changed

Lines changed: 152 additions & 12 deletions

README.md

Lines changed: 5 additions & 11 deletions
@@ -38,21 +38,15 @@ A high-performance C++ framework for SIMD (Single Instruction Multiple Data) ope
 
 ### Performance Benchmarks
 
-Performance improvements comparing SIMD operations vs. standard operations:
+Performance improvements comparing SIMD operations vs. standard operations on different platforms with different compilers.
 
-![Benchmark Speedup](benchmark_results_linux_gcc/consolidated_speedup.png)
+#### Linux (GCC)
+![Benchmark Speedup Linux](benchmark_results_linux_gcc/consolidated_speedup.png)
 
-## System Information
 
-Tests and benchmarks were run on the following system:
+#### Windows (MSVC)
+![Benchmark Speedup Windows](benchmark_results_windows_msvc/consolidated_speedup.png)
 
-- **CPU**: 16 cores @ 3294 MHz
-- **Cache**:
-- L1 Data: 32 KiB (x8)
-- L1 Instruction: 32 KiB (x8)
-- L2 Unified: 512 KiB (x8)
-- L3 Unified: 16384 KiB (x1)
-- **Test Date**: May 6, 2025
 
 ## Getting Started
 
Binary file changed (666 KB)
Binary file changed (580 KB)
Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
+# SIMD Performance Comparison Summary
+
+#### float256 Addition
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.107 | 0.435 | 4.07x |
+
+#### float256 Subtraction
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.077 | 0.626 | 8.13x |
+
+#### float256 Multiplication
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.155 | 0.497 | 3.21x |
+
+#### float256 Division
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.141 | 0.783 | 5.55x |
+
+#### double256 Addition
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.077 | 0.255 | 3.31x |
+
+#### double256 Subtraction
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.073 | 0.247 | 3.38x |
+
+#### double256 Multiplication
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.090 | 0.338 | 3.76x |
+
+#### double256 Division
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.120 | 0.471 | 3.92x |
+
+#### int128_with_int32 t_Addition
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 1000000 | 1.940 | 3.500 | 1.80x |
+
+#### int128_with_int32 t_Subtraction
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 1000000 | 2.270 | 2.970 | 1.31x |
+
+#### int128_with_int32 t_Multiplication
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.075 | 0.272 | 3.63x |
+
+#### int128_with_int16 t_Addition
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.071 | 0.452 | 6.37x |
+
+#### int128_with_int16 t_Subtraction
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.072 | 0.470 | 6.53x |
+
+#### int128_with_int16 t_Multiplication
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.072 | 0.616 | 8.56x |
+
+#### int128_with_int8 t_Addition
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.058 | 0.922 | 15.90x |
+
+#### int128_with_int8 t_Subtraction
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.060 | 0.974 | 16.23x |
+
+#### int256_with_int32 t_Addition
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.087 | 0.457 | 5.25x |
+
+#### int256_with_int32 t_Subtraction
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.088 | 0.509 | 5.78x |
+
+#### int256_with_int32 t_Multiplication
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.089 | 0.511 | 5.74x |
+
+#### int256_with_int16 t_Addition
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.091 | 1.020 | 11.21x |
+
+#### int256_with_int16 t_Subtraction
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.090 | 0.896 | 9.96x |
+
+#### int256_with_int16 t_Multiplication
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.101 | 1.270 | 12.57x |
+
+#### int256_with_int8 t_Addition
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.078 | 1.950 | 25.00x |
+
+#### int256_with_int8 t_Subtraction
+
+| Variant | SIMD Time (ms) | Plain Time (ms) | Speedup (x) |
+|---------|---------------|----------------|------------|
+| 100000 | 0.079 | 1.900 | 24.05x |
+
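
Reviewer note: the Speedup column in the summary above appears to be the plain time divided by the SIMD time. The sketch below recomputes it for two rows as a sanity check. It assumes that definition; `analyze_benchmarks.py` is not shown in this commit, so the `speedup` helper here is hypothetical and not taken from the script.

```python
def speedup(plain_ms: float, simd_ms: float) -> float:
    """Return how many times faster the SIMD run was than the plain run.

    Assumed definition (not confirmed by the commit): plain time / SIMD time.
    """
    if simd_ms <= 0:
        raise ValueError("SIMD time must be positive")
    return plain_ms / simd_ms


# Rows copied from the summary tables: (benchmark, SIMD ms, plain ms)
rows = [
    ("float256 Addition", 0.107, 0.435),
    ("int256_with_int8 t_Addition", 0.078, 1.950),
]

for name, simd_ms, plain_ms in rows:
    print(f"{name}: {speedup(plain_ms, simd_ms):.2f}x")
```

Because the published timings are rounded to three decimals, a recomputed value can differ from the table in the last digit for some rows.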

run_tests.bat

Lines changed: 1 addition & 1 deletion
@@ -2,4 +2,4 @@ cmake -S . -B build
 cmake --build build\ --config Release
 .\build\Release\BasicSIMD_Tests.exe > test_results.txt
 
-python3 analyze_benchmarks.py --input_file=test_results.txt --output_dir=benchmark_results_windows_msvc/
+python analyze_benchmarks.py --input_file=test_results.txt --output_dir=benchmark_results_windows_msvc/
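
Reviewer note: the switch from `python3` to `python` presumably reflects that the interpreter is often not installed under the name `python3` on Windows. If a launcher ever needs to work on both platforms, one option is to probe `PATH` for the first available name. This is a minimal sketch using only the standard library; the `find_python` helper and its candidate list are illustrative assumptions, not part of this repository.

```python
import shutil


def find_python(candidates=("python", "python3", "py")):
    """Return the first interpreter name found on PATH, or None.

    The candidate names are an assumption for illustration; adjust
    them to match the target machines.
    """
    for name in candidates:
        if shutil.which(name):
            return name
    return None


if __name__ == "__main__":
    found = find_python()
    print(found if found else "no Python interpreter found on PATH")
```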
