Skip to content

Fix Key History Dirty Read Bug#449

Merged
dillongeorge merged 2 commits intofacebook:mainfrom
dillongeorge:key-history-dirty-reads-bug
Aug 23, 2024
Merged

Fix Key History Dirty Read Bug#449
dillongeorge merged 2 commits intofacebook:mainfrom
dillongeorge:key-history-dirty-reads-bug

Conversation

@dillongeorge
Copy link
Contributor

Bug
In the key_history API exposed in directory.rs, it is possible that states read from storage are not fully committed yet (i.e. the transaction has not been committed), but the associated epoch in the states is higher than what is seen in the aZKS entry read from storage. In fact, we explicitly account for that scenario when inspecting versions across the states: https://github.com/facebook/akd/blob/main/akd/src/directory.rs#L484-L490.

Prior to this patch, we always initiated the end_version variable at 0. However, the end_version variable should never be 0 since it will panic in downstream API calls: https://github.com/facebook/akd/blob/main/akd_core/src/utils.rs#L182-L184. The bug in this case is simply that we default end_version to 0, but do not update it in the event that we have performed a dirty read for states.

To resolve the bug, we default end_version to the same value as start_version.

Verification
Prior to updating directory.rs, the added tests to repro cases where a dirty read is performed and a newer epoch is returned as part of the KeyData fails with the expected panic:

failures:

---- tests::test_errors::test_key_history_dirty_reads_experimental_config stdout ----
thread 'tests::test_errors::test_key_history_dirty_reads_experimental_config' panicked at akd_core/src/utils.rs:183:9:
find_max_index_in_skiplist called with input less than smallest element of MARKER_VERSION_SKIPLIST
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- tests::test_errors::test_key_history_dirty_reads_whatsapp_v1_config stdout ----
thread 'tests::test_errors::test_key_history_dirty_reads_whatsapp_v1_config' panicked at akd_core/src/utils.rs:183:9:
find_max_index_in_skiplist called with input less than smallest element of MARKER_VERSION_SKIPLIST

failures:
    tests::test_errors::test_key_history_dirty_reads_experimental_config
    tests::test_errors::test_key_history_dirty_reads_whatsapp_v1_config

test result: FAILED. 100 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out; finished in 4.36s

After updating directory.rs, the panic no longer occurs.

@dillongeorge dillongeorge requested a review from kevinlewi August 23, 2024 09:41
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 23, 2024
@dillongeorge dillongeorge requested a review from afterdusk August 23, 2024 09:42
@dillongeorge dillongeorge force-pushed the key-history-dirty-reads-bug branch from be7b21c to fe0dc4f Compare August 23, 2024 09:47
@dillongeorge dillongeorge marked this pull request as ready for review August 23, 2024 09:55
}

if start_version == 0 {
if start_version == 0 || end_version == 0 {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realistically, end_version should never be 0 assuming start_version isn't (i.e. this will short circuit). I'm opting to be a bit more explicit here to err on the side of caution given the bug we're fixing :)

@codecov-commenter
Copy link

codecov-commenter commented Aug 23, 2024

Codecov Report

Attention: Patch coverage is 77.77778% with 2 lines in your changes missing coverage. Please review.

Project coverage is 88.03%. Comparing base (3ce5335) to head (5818ea1).
Report is 17 commits behind head on main.

Files Patch % Lines
akd/src/directory.rs 60.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #449      +/-   ##
==========================================
- Coverage   88.61%   88.03%   -0.58%     
==========================================
  Files          39       38       -1     
  Lines        9109     8283     -826     
==========================================
- Hits         8072     7292     -780     
+ Misses       1037      991      -46     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Dillon George added 2 commits August 23, 2024 11:42
**Bug**
In the `key_history` API exposed in `directory.rs`, it is possible that states read from
storage are not fully committed yet (i.e. the transaction has not been committed), but the
associated epoch in the states is higher than what is seen in the `aZKS` entry read from
storage. In fact, we explicitly account for that scenario when inspecting versions across
the states: https://github.com/facebook/akd/blob/main/akd/src/directory.rs#L484-L490.

Prior to this patch, we always initiated the `end_version` variable at 0. However, the
`end_version` variable should **never** be 0 since it will panic in downstream API calls:
https://github.com/facebook/akd/blob/main/akd_core/src/utils.rs#L182-L184. The bug in this
case is simply that we default `end_version` to 0, but do not update it in the event that
we have performed a dirty read for states.

To resolve the bug, we default `end_version` to the same value as `start_version`.

**Verification**
Prior to updating `directory.rs`, the added tests to repro cases where a dirty read
is performed and a newer epoch is returned as part of the `KeyData` fails with the expected panic:

```
failures:

---- tests::test_errors::test_key_history_dirty_reads_experimental_config stdout ----
thread 'tests::test_errors::test_key_history_dirty_reads_experimental_config' panicked at akd_core/src/utils.rs:183:9:
find_max_index_in_skiplist called with input less than smallest element of MARKER_VERSION_SKIPLIST
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- tests::test_errors::test_key_history_dirty_reads_whatsapp_v1_config stdout ----
thread 'tests::test_errors::test_key_history_dirty_reads_whatsapp_v1_config' panicked at akd_core/src/utils.rs:183:9:
find_max_index_in_skiplist called with input less than smallest element of MARKER_VERSION_SKIPLIST

failures:
    tests::test_errors::test_key_history_dirty_reads_experimental_config
    tests::test_errors::test_key_history_dirty_reads_whatsapp_v1_config

test result: FAILED. 100 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out; finished in 4.36s
```

After updating `directory.rs`, the panic no longer occurs.
@dillongeorge dillongeorge force-pushed the key-history-dirty-reads-bug branch from fe0dc4f to 5818ea1 Compare August 23, 2024 18:44
Copy link
Contributor

@kevinlewi kevinlewi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@dillongeorge dillongeorge merged commit cefaeeb into facebook:main Aug 23, 2024
@dillongeorge dillongeorge deleted the key-history-dirty-reads-bug branch August 30, 2024 20:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants