Commit 8d0d405
x/crypto/chacha20: cleanup chacha_ppc64le.s
- Add PCALIGN before the loops
- Replace WORD directives with the corresponding Vector Merge Even/Odd Word instructions
- Replace Branch Conditional (BC) with its extended mnemonic form BDNZ
- Use VPERMXOR in place of VXOR followed by VRLW (rotate left)
where the rotation is a multiple of 8 bits. This replacement
gives a performance improvement in both time and space of around 7%-8%,
as shown by the benchstat output below.
goos: linux
goarch: ppc64le
pkg: golang.org/x/crypto/chacha20
cpu: POWER10
                  | chacha20.prev.out |          chacha20.new.out           |
                  |      sec/op       |    sec/op      vs base              |
ChaCha20/64          171.9n ± 0%         156.6n ± 1%    -8.90% (p=0.002 n=6)
ChaCha20/256         165.5n ± 0%         152.4n ± 0%    -7.92% (p=0.002 n=6)
ChaCha20/10x25       505.8n ± 0%         504.3n ± 2%    -0.32% (p=0.589 n=6)
ChaCha20/4096        2.265µ ± 0%         2.052µ ± 0%    -9.40% (p=0.002 n=6)
ChaCha20/100x40      5.359µ ± 3%         5.018µ ± 2%    -6.37% (p=0.002 n=6)
ChaCha20/65536       35.71µ ± 0%         32.29µ ± 0%    -9.57% (p=0.002 n=6)
ChaCha20/1000x65     44.63µ ± 0%         41.05µ ± 0%    -8.02% (p=0.002 n=6)
geomean              2.235µ              2.073µ         -7.26%
                  | chacha20.prev.out |          chacha20.new.out           |
                  |        B/s        |     B/s        vs base              |
ChaCha20/64          355.1Mi ± 0%        389.8Mi ± 1%    +9.78% (p=0.002 n=6)
ChaCha20/256         1.440Gi ± 0%        1.565Gi ± 0%    +8.62% (p=0.002 n=6)
ChaCha20/10x25       471.3Mi ± 0%        472.8Mi ± 2%    +0.31% (p=0.589 n=6)
ChaCha20/4096        1.684Gi ± 0%        1.859Gi ± 0%   +10.38% (p=0.002 n=6)
ChaCha20/100x40      711.8Mi ± 3%        760.3Mi ± 2%    +6.80% (p=0.002 n=6)
ChaCha20/65536       1.709Gi ± 0%        1.890Gi ± 0%   +10.59% (p=0.002 n=6)
ChaCha20/1000x65     1.356Gi ± 0%        1.475Gi ± 0%    +8.72% (p=0.002 n=6)
geomean              957.3Mi             1.008Gi         +7.83%
Change-Id: Ib31cb10a2a11eacdacf0272fbfd887eb5ccd8bcb
Reviewed-on: https://go-review.googlesource.com/c/crypto/+/564797
Reviewed-by: Lynn Boger <[email protected]>
Run-TryBot: Paul Murphy <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: David Chase <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
Run-TryBot: Lynn Boger <[email protected]>
Reviewed-by: Cherry Mui <[email protected]>

1 parent b91329d commit 8d0d405
1 file changed: +52 -58 lines changed