field: speed up Element.Bytes #50

Merged
FiloSottile merged 2 commits into FiloSottile:main from AlexanderYastrebov:optimize-element-bytes
Feb 7, 2025

@AlexanderYastrebov
Contributor

Write bytes in 64-bit chunks made from adjacent limbs.

goos: linux
goarch: amd64
pkg: filippo.io/edwards25519/field
cpu: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz
        │   HEAD~1    │                HEAD                 │
        │   sec/op    │   sec/op     vs base                │
Bytes-8   60.31n ± 1%   13.67n ± 2%  -77.34% (p=0.000 n=10)

        │   HEAD~1   │              HEAD              │
        │    B/op    │    B/op     vs base            │
Bytes-8   0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

        │   HEAD~1   │              HEAD              │
        │ allocs/op  │ allocs/op   vs base            │
Bytes-8   0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

@AlexanderYastrebov
Contributor Author

While profiling https://github.com/AlexanderYastrebov/wireguard-vanity-key, I noticed that Element.Bytes could be improved:

[Image: Screenshot from 2025-02-05 22-17-01]

Limbs are not available to the user (unlike e.g. Point.ExtendedCoordinates) so I could not fix it in my code without forking or vendoring this lib.
This change also makes Element.Bytes loop-less just like all other Element methods that use limbs directly.
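
For readers who want to see the idea concretely, here is a minimal sketch of packing five 51-bit limbs into four 64-bit little-endian words without a loop. It is an illustration under stated assumptions, not necessarily the merged code: the names `packLimbs` and `packLimbsLoop` are hypothetical, limbs are passed as a plain `[5]uint64` (least significant first), and each limb is assumed to be already reduced below 2^51. The loop-based variant is included only as a reference to compare against.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// packLimbs (hypothetical name) serializes five 51-bit limbs into 32
// little-endian bytes. Limb i starts at bit 51*i and word j starts at
// bit 64*j, so each 64-bit word is assembled from two adjacent limbs.
// Assumption: every limb is already reduced below 2^51.
func packLimbs(l [5]uint64) [32]byte {
	u0 := l[1]<<51 | l[0]                   // bits   0..63
	u1 := l[2]<<(102-64) | l[1]>>(64-51)    // bits  64..127
	u2 := l[3]<<(153-128) | l[2]>>(128-102) // bits 128..191
	u3 := l[4]<<(204-192) | l[3]>>(192-153) // bits 192..255

	var out [32]byte
	binary.LittleEndian.PutUint64(out[0:], u0)
	binary.LittleEndian.PutUint64(out[8:], u1)
	binary.LittleEndian.PutUint64(out[16:], u2)
	binary.LittleEndian.PutUint64(out[24:], u3)
	return out
}

// packLimbsLoop (hypothetical name) is a loop-based reference: it shifts
// each limb to its bit offset within a byte and ORs the resulting bytes
// into the output one at a time.
func packLimbsLoop(l [5]uint64) [32]byte {
	var out [32]byte
	var buf [8]byte
	for i, limb := range l {
		bitsOffset := i * 51
		binary.LittleEndian.PutUint64(buf[:], limb<<uint(bitsOffset%8))
		for j, b := range buf {
			off := bitsOffset/8 + j
			if off >= len(out) {
				break
			}
			out[off] |= b
		}
	}
	return out
}

func main() {
	maxLimb := uint64(1)<<51 - 1
	inputs := [][5]uint64{
		{1, 2, 3, 4, 5},
		{maxLimb, maxLimb, maxLimb, maxLimb, maxLimb},
	}
	for _, in := range inputs {
		// Report whether the loop-less packing agrees with the reference.
		fmt.Println(packLimbs(in) == packLimbsLoop(in))
	}
}
```

The loop-less form replaces per-byte OR operations with four full-word stores, which is one plausible reason for the speedup reported in the benchmark above.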

@FiloSottile
Owner

Nice! I appreciate the diagram.

Since these are non-trivial changes to the parts of the library that track upstream, would you consider mailing them as Go CLs? If not, have you signed the Google CLA and are ok with me submitting them on your behalf after we merge them here, to keep upstream in sync?

@AlexanderYastrebov
Contributor Author

> would you consider mailing them as Go CLs?

Sure, I've created golang/go#71603 for this change.

> If not, have you signed the Google CLA and are ok with me submitting them on your behalf after we merge them here, to keep upstream in sync?

Yes, I've contributed to stdlib before so this is also ok.

@AlexanderYastrebov
Contributor Author

golang/go#71603 has landed. How do we proceed with this one?

@FiloSottile FiloSottile merged commit 4c39688 into FiloSottile:main Feb 7, 2025
2 checks passed
@FiloSottile
Owner

Merged, thank you!

@AlexanderYastrebov AlexanderYastrebov deleted the optimize-element-bytes branch February 7, 2025 18:07
AlexanderYastrebov added a commit to AlexanderYastrebov/wireguard-vanity-key that referenced this pull request Feb 7, 2025
Update to include FiloSottile/edwards25519#50
improvement.

goos: linux
goarch: amd64
pkg: github.com/AlexanderYastrebov/wireguard-vanity-key
cpu: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz
                      │ origin/main │                HEAD                 │
                      │   sec/op    │   sec/op     vs base                │
FindBatchPoint/1024-8   425.1n ± 0%   373.8n ± 0%  -12.07% (p=0.000 n=10)

                      │ origin/main │              HEAD              │
                      │    B/op     │    B/op     vs base            │
FindBatchPoint/1024-8    0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

                      │ origin/main │              HEAD              │
                      │  allocs/op  │ allocs/op   vs base            │
FindBatchPoint/1024-8    0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal