@yoniko yoniko commented Mar 7, 2023

This helps to avoid regressions where consecutive compressions use the same tag space with similar data (running `zstd -b5e7 enwik8 -B128K` reproduces this regression).

yoniko added 2 commits March 7, 2023 12:11
- Adds memory type that is guaranteed to have been initialized at least once in the workspace's lifetime.
- Changes tag space in row hash to be based on init once memory.
- Moves buffers to aligned memory and removes the buffer memory type.
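
To make the first two commit notes concrete, here is a minimal sketch of what "init-once" memory could mean for a bump-style workspace. Everything in it (`ws_sketch`, `ws_reserve_init_once`, the single watermark) is hypothetical and far simpler than zstd's actual `ZSTD_cwksp`; it only illustrates the guarantee that a reserved region has been initialized at least once, as opposed to being zeroed on every reservation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified workspace. This is NOT zstd's ZSTD_cwksp. */
typedef struct {
    unsigned char* base;   /* start of the backing buffer */
    size_t capacity;       /* total bytes available */
    size_t used;           /* bytes handed out since the last reset */
    size_t initialized;    /* bytes [0, initialized) were zeroed at least once */
} ws_sketch;

static void ws_init(ws_sketch* ws, void* buffer, size_t capacity) {
    ws->base = (unsigned char*)buffer;
    ws->capacity = capacity;
    ws->used = 0;
    ws->initialized = 0;
}

/* A reset recycles the space but deliberately keeps the watermark. */
static void ws_reset(ws_sketch* ws) {
    ws->used = 0;
}

/* Reserves `bytes` and zeroes only the part that was never initialized.
 * Invariant: used <= initialized, so the uninitialized part is a suffix. */
static void* ws_reserve_init_once(ws_sketch* ws, size_t bytes) {
    size_t const start = ws->used;
    size_t end;
    if (bytes > ws->capacity - start) return NULL;
    end = start + bytes;
    if (end > ws->initialized) {
        memset(ws->base + ws->initialized, 0, end - ws->initialized);
        ws->initialized = end;
    }
    ws->used = end;
    return ws->base + start;
}
```

In this sketch, a region handed out again after `ws_reset` still holds whatever its previous user wrote. That is presumably why a table placed in such memory needs a salt: stale entries are valid memory, but should be unlikely to match new data.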
switch(mls)
{
default:
case 4: return ZSTD_hash4PtrS(p, hBits, (U32)hashSalt);

I would rather recommend using the upper part of hashSalt, `(U32)(hashSalt >> 32)`, as it is less predictable than the lower part.

An alternative approach would be to make the evolution of hashSalt's lower bits less predictable.
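
A rough sketch of why the choice of salt bits matters. The body of `ZSTD_hash4PtrS` is not shown in this diff, so the multiply, XOR, shift form below and the constant 2654435761 (a commonly used multiplicative hashing constant) are assumptions, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical salted 4-byte hash: multiply, XOR the salt in,
 * then keep only the top hBits bits of the 32-bit result. */
static uint32_t hash4_salted_sketch(uint32_t value, uint32_t hBits, uint32_t salt) {
    assert(hBits >= 1 && hBits <= 32);
    return ((value * 2654435761U) ^ salt) >> (32 - hBits);
}

/* Feeds the upper half of a 64-bit salt into the hash,
 * as suggested in the review comment above. */
static uint32_t hash4_upper_salt_sketch(uint32_t value, uint32_t hBits, uint64_t salt64) {
    return hash4_salted_sketch(value, hBits, (uint32_t)(salt64 >> 32));
}
```

With this form, only the top `hBits` bits of the 32-bit salt survive the final shift, so the salt's effect depends entirely on how well those particular bits vary from one reset to the next.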

/* We want to generate a new salt in case we reset a Cctx, but we always want to use
* 0 when we reset a Cdict */
if(forWho == ZSTD_resetTarget_CCtx) {
ms->hashSalt = ms->hashSalt * 6364136223846793005 + 1; /* based on MUSL rand */

There is probably enough time here to make the hashSalt evolution more chaotic, closer to a PRNG.
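
A small sketch of why the update quoted above is predictable in its low bits. It reuses the multiplier and increment from the diff; the function names are hypothetical. Because the multiplier is odd and congruent to 1 modulo 4, the lowest bit flips on every step and the low two bits simply count upward modulo 4.

```c
#include <assert.h>
#include <stdint.h>

/* One step of the 64-bit LCG from the diff (multiplier as quoted, increment 1). */
static uint64_t lcg_step_sketch(uint64_t salt) {
    return salt * 6364136223846793005ULL + 1ULL;
}

/* Returns 1 if the lowest bit flipped on each of `steps` consecutive updates. */
static int low_bit_always_flips_sketch(uint64_t seed, int steps) {
    uint64_t s = seed;
    int i;
    for (i = 0; i < steps; i++) {
        uint64_t const next = lcg_step_sketch(s);
        if (((next ^ s) & 1u) == 0) return 0;
        s = next;
    }
    return 1;
}

/* Returns 1 if the low two bits behave like a counter modulo 4 for this step. */
static int low_two_bits_count_sketch(uint64_t s) {
    return (lcg_step_sketch(s) & 3u) == ((s + 1u) & 3u);
}
```

Higher bits of a power-of-two LCG have progressively longer periods, which is consistent with the earlier suggestion to prefer the upper half of the salt.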

@yoniko yoniko force-pushed the tag-space-hash-salting-3528-part2 branch from a8f72d2 to 029b84f Compare March 9, 2023 21:09
yoniko added 3 commits March 9, 2023 13:52
This helps to avoid regressions where consecutive compressions use the same tag space with similar data (running `zstd -b5e7 enwik8 -B128K` reproduces this regression).
@yoniko yoniko force-pushed the tag-space-hash-salting-3528-part2 branch from d4dff59 to 93dcd83 Compare March 9, 2023 21:52
* 0 when we reset a Cdict */
if(forWho == ZSTD_resetTarget_CCtx) {
ms->tagTable = (U16 *) ZSTD_cwksp_reserve_aligned_init_once(ws, tagTableSize);
ms->hashSalt = (U64) ZSTD_hashPtr(&ms->hashSalt, 32, sizeof(ms->hashSalt)) << 32 |

I would rather go for stronger mixing, so that any observed portion of the bits (including the lower bits) looks essentially random.

I would recommend basing it on XXH3_rrmxmx():
https://github.com/Cyan4973/xxHash/blob/v0.8.1/xxhash.h#L3403

This mixing has been tested and proved strong enough to pass PractRand tests,
making it essentially as strong as a PCG random number generator.
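
For readers who do not want to follow the link, here is a sketch of an rrmxmx-style mixer of the kind suggested. The rotation amounts, shift amounts, and multiplier are written as I recall them from the linked xxHash source and should be checked against it before being relied on. The names `bitmix_sketch` and `next_salt_sketch`, and the idea of feeding the previous salt straight back in with a length of 8, are assumptions rather than what the PR ended up doing.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left by a constant in 1..63. */
static uint64_t rotl64_sketch(uint64_t x, unsigned r) {
    return (x << r) | (x >> (64 - r));
}

/* rrmxmx-style mix: rotate-rotate-xor, multiply, xorshift (plus len),
 * multiply, xorshift. Constants recalled from xxHash; verify before use. */
static uint64_t bitmix_sketch(uint64_t h, uint64_t len) {
    h ^= rotl64_sketch(h, 49) ^ rotl64_sketch(h, 24);
    h *= 0x9FB21C651E98DF25ULL;
    h ^= (h >> 35) + len;
    h *= 0x9FB21C651E98DF25ULL;
    return h ^ (h >> 28);
}

/* Hypothetical salt update: mix the previous salt instead of stepping an LCG. */
static uint64_t next_salt_sketch(uint64_t salt) {
    return bitmix_sketch(salt, 8);
}
```

Unlike the LCG step, every output bit here depends on many input bits, so truncating the salt to its lower 32 bits should no longer expose a short, regular pattern.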

@yoniko yoniko force-pushed the tag-space-hash-salting-3528-part2 branch from 93dcd83 to 2543295 Compare March 10, 2023 07:05
return (value >> count) | (U64)(value << ((0U - count) & 0x3F));
}

FORCE_INLINE_TEMPLATE


Minor: do you need all 3 variants?
Only ZSTD_rotateRight_U64() seems to be in use.

Owner Author


All 3 variants are currently in use in ZSTD_row_matchMaskGroupWidth. I opted to move all of them here from zstd_lazy.c instead of just one, as it's the more natural place for them in any case.
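
For context on the three variants under discussion, here is a sketch of rotate-right helpers at 64, 32, and 16 bits, following the pattern of the quoted line `(value >> count) | (U64)(value << ((0U - count) & 0x3F))`. The `_sketch` names are hypothetical and the 32-bit and 16-bit bodies are extrapolated from the 64-bit line, not copied from the PR.

```c
#include <assert.h>
#include <stdint.h>

/* Masking the negated count keeps the left shift in range even when
 * count == 0, where a plain (width - count) shift would be undefined. */
static uint64_t rotr64_sketch(uint64_t value, uint32_t count) {
    assert(count < 64);
    count &= 0x3F;
    return (value >> count) | (value << ((0U - count) & 0x3F));
}

static uint32_t rotr32_sketch(uint32_t value, uint32_t count) {
    assert(count < 32);
    count &= 0x1F;
    return (value >> count) | (value << ((0U - count) & 0x1F));
}

static uint16_t rotr16_sketch(uint16_t value, uint32_t count) {
    uint32_t const v = value;  /* widen first to avoid signed promotion */
    assert(count < 16);
    count &= 0x0F;
    return (uint16_t)((v >> count) | (v << ((0U - count) & 0x0F)));
}
```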

@yoniko yoniko force-pushed the init-once-memory-3528-part1 branch 2 times, most recently from 70d69dc to f4aab97 Compare March 13, 2023 17:22
@yoniko yoniko force-pushed the tag-space-hash-salting-3528-part2 branch from 8f2280c to a8c62ff Compare March 13, 2023 18:08
yoniko and others added 4 commits March 13, 2023 11:24
- Adds memory type that is guaranteed to have been initialized at least once in the workspace's lifetime.
- Changes tag space in row hash to be based on init once memory.

3 participants