Conversation

@ata-nas
Contributor

@ata-nas ata-nas commented Nov 5, 2025

  • BlockAccessors are now AutoCloseable
  • BlockAccessors now create hard links for the duration of their
    lifespan
    • This allows us to ensure data is accessible for the duration of the
      accessor's life
  • Tests are updated and migrated
  • Deleted redundant configuration tests
  • Added tests:
    • Tests for subsequent reads to ensure that closing an accessor does
      not affect the data or the ability for a new accessor to access it
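
As a rough illustration of the hard-link idea described above, here is a minimal sketch. The class name `LinkedBlockReader`, the link naming scheme, and the `links` directory are assumptions made for this example and are not taken from the PR. It relies only on `java.nio.file.Files.createLink`, and it assumes the file system supports hard links.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical accessor: holds a hard link to a block file until closed. */
final class LinkedBlockReader implements AutoCloseable {
    private final Path link;

    LinkedBlockReader(final Path blockFile, final Path linksDir) throws IOException {
        Files.createDirectories(linksDir);
        // Hypothetical naming scheme, chosen only for this sketch.
        this.link = linksDir.resolve(blockFile.getFileName() + "." + System.nanoTime() + ".link");
        // The hard link keeps the data reachable even if the original path is deleted.
        Files.createLink(link, blockFile);
    }

    Path link() {
        return link;
    }

    byte[] readAll() {
        try {
            return Files.readAllBytes(link);
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Removes the link; the original file (if it still exists) is untouched. */
    @Override
    public void close() {
        try {
            Files.deleteIfExists(link);
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

final class LinkedBlockReaderDemo {
    public static void main(final String[] args) throws IOException {
        final Path dir = Files.createTempDirectory("blocks");
        final Path block = Files.writeString(dir.resolve("1.blk"), "block-1");
        try (LinkedBlockReader reader = new LinkedBlockReader(block, dir.resolve("links"))) {
            Files.delete(block); // simulate retention removing the block mid-read
            System.out.println(new String(reader.readAll()));
        }
    }
}
```

Deleting the original path inside the `try` block stands in for retention running while the accessor is open; the read still succeeds because the link keeps the underlying data alive until `close()`.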

Reviewer Notes

  • migrating accessors to an AutoCloseable variant
  • improving the ability to retrieve data by having hard links

Related Issue(s)

Resolves #1273

@ata-nas ata-nas added this to the 0.23.0 milestone Nov 5, 2025
@ata-nas ata-nas self-assigned this Nov 5, 2025
@ata-nas ata-nas added the pull request label to get past the "label required" check when no label is needed or appropriate. label Nov 5, 2025
@ata-nas ata-nas linked an issue Nov 5, 2025 that may be closed by this pull request
@ata-nas ata-nas force-pushed the 1273-blockaccessors-hardening-improved-error-handling branch 12 times, most recently from 5e07b85 to 3617276 Compare November 6, 2025 15:12
@ata-nas ata-nas marked this pull request as ready for review November 6, 2025 15:12
@ata-nas ata-nas requested review from a team as code owners November 6, 2025 15:12
@ata-nas ata-nas force-pushed the 1273-blockaccessors-hardening-improved-error-handling branch from 3617276 to e75166d Compare November 6, 2025 15:13
Comment on lines +86 to +88
} else {
return new BlockResponse(Code.NOT_FOUND, null);
}
Contributor Author

I've added this else clause here as I believe this is correct.

Contributor

This is correct. NOT_FOUND means we should have the block but something went wrong.
NOT_AVAILABLE means we don't have it, but it might exist on another BN.

Contributor Author

@Nana-EC there is a very real race condition where we pass the check above, because the block exists, but by the time we try to get it, retention could wipe it. In that case the block is not found (we do not have it). Another possibility is that an exception might have occurred, which also results in us not having it at the moment. That was my thought process.

Contributor Author

@ata-nas ata-nas Nov 7, 2025

This PR creates a hardlink when we issue the accessor. But here we check for availability (checking the ranged set) and then get the accessor. I was referring to that. If we get the accessor, we are fine; we will have access because of the hardlink. But if the ranged set contains the block, then retention deletes it, and then we try to actually get the accessor, we will not be able to. That should be extremely rare though. That is why I added the else clause, a null check for the accessor.
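
The check-then-get window discussed in this thread can be sketched as follows. All types here (`Code`, `BlockResponse`, `BlockLookup`) are simplified, hypothetical stand-ins rather than the project's actual classes, and the `betweenCheckAndGet` hook exists only so the race can be reproduced deterministically.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-ins for the response types named in the thread. */
enum Code { SUCCESS, NOT_FOUND, NOT_AVAILABLE }

record BlockResponse(Code code, String block) {}

final class BlockLookup {
    private final Map<Long, String> blocks = new ConcurrentHashMap<>();

    void put(final long blockNumber, final String data) {
        blocks.put(blockNumber, data);
    }

    /** Stands in for retention removing a block. */
    void delete(final long blockNumber) {
        blocks.remove(blockNumber);
    }

    BlockResponse get(final long blockNumber, final Runnable betweenCheckAndGet) {
        if (!blocks.containsKey(blockNumber)) {
            // We do not have the block; another node might.
            return new BlockResponse(Code.NOT_AVAILABLE, null);
        }
        betweenCheckAndGet.run(); // hook simulating concurrent retention
        final String accessor = blocks.get(blockNumber);
        if (accessor != null) {
            return new BlockResponse(Code.SUCCESS, accessor);
        } else {
            // The availability check passed, but the block vanished before retrieval.
            return new BlockResponse(Code.NOT_FOUND, null);
        }
    }
}
```

Calling `get(5L, () -> lookup.delete(5L))` on a lookup that contains block 5 exercises the `else` branch: the availability check passes, the simulated retention removes the block, and the null check produces `NOT_FOUND`.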

Comment on lines -36 to -44
/**
* Constructor.
*/
public FilesHistoricConfig {
Objects.requireNonNull(rootPath);
Objects.requireNonNull(compression);
Preconditions.requireInRange(powersOfTenPerZipFileContents, 1, 6);
Preconditions.requireGreaterOrEqual(blockRetentionThreshold, 0L);
}
Contributor Author

These constructors here and for the other config have been removed as validation must be done by the config dependency. I've also deleted the unit tests for the configs. I think they are redundant and maintaining them is a hassle.

@ata-nas ata-nas force-pushed the 1273-blockaccessors-hardening-improved-error-handling branch from e75166d to db09b26 Compare November 6, 2025 15:31
@ata-nas
Contributor Author

ata-nas commented Nov 6, 2025

We could also add tests for:

  1. Clearing the links root path on startup for recents
  2. Clearing the links root path on startup for historic

@ata-nas ata-nas force-pushed the 1273-blockaccessors-hardening-improved-error-handling branch from db09b26 to 1e574a7 Compare November 6, 2025 15:37
@codecov

codecov bot commented Nov 6, 2025

Codecov Report

❌ Patch coverage is 66.99029% with 34 lines in your changes missing coverage. Please review.

Files with missing lines                                 Patch %   Lines
...tream/subscriber/BlockStreamSubscriberSession.java    46.66%    6 Missing and 2 partials ⚠️
...k/node/blocks/files/historic/ZipBlockAccessor.java    65.00%    6 Missing and 1 partial ⚠️
...ode/blocks/files/recent/BlockFileRecentPlugin.java    64.70%    6 Missing ⚠️
.../node/access/service/BlockAccessServicePlugin.java    44.44%    3 Missing and 2 partials ⚠️
...blocks/files/historic/BlockFileHistoricPlugin.java    72.72%    3 Missing ⚠️
...de/blocks/files/recent/BlockFileBlockAccessor.java    83.33%    3 Missing ⚠️
...ck/node/blocks/files/historic/ZipBlockArchive.java    50.00%    2 Missing ⚠️

❌ Your patch check has failed because the patch coverage (66.99%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

@@             Coverage Diff              @@
##               main    #1833      +/-   ##
============================================
- Coverage     81.37%   81.19%   -0.18%     
- Complexity     1164     1173       +9     
============================================
  Files           126      126              
  Lines          5428     5472      +44     
  Branches        578      587       +9     
============================================
+ Hits           4417     4443      +26     
- Misses          751      766      +15     
- Partials        260      263       +3     
Files with missing lines                                 Coverage Δ                   Complexity Δ
.../hiero/block/node/base/tar/TaredBlockIterator.java    84.53% <100.00%> (+0.32%)    16.00 <0.00> (ø)
...ode/blocks/files/historic/FilesHistoricConfig.java    100.00% <ø> (ø)              1.00 <0.00> (ø)
...ck/node/blocks/files/recent/FilesRecentConfig.java    100.00% <ø> (ø)              1.00 <0.00> (ø)
...block/node/spi/historicalblocks/BlockAccessor.java    75.00% <ø> (+12.50%)         4.00 <0.00> (+2.00)
...ck/node/blocks/files/historic/ZipBlockArchive.java    81.03% <50.00%> (-0.55%)     24.00 <0.00> (ø)
...blocks/files/historic/BlockFileHistoricPlugin.java    83.78% <72.72%> (+1.30%)     26.00 <0.00> (+2.00)
...de/blocks/files/recent/BlockFileBlockAccessor.java    72.00% <83.33%> (+1.26%)     11.00 <4.00> (+3.00)
.../node/access/service/BlockAccessServicePlugin.java    82.00% <44.44%> (-4.96%)     10.00 <0.00> (ø)
...ode/blocks/files/recent/BlockFileRecentPlugin.java    65.00% <64.70%> (-0.46%)     25.00 <0.00> (+2.00) ⬇️
...k/node/blocks/files/historic/ZipBlockAccessor.java    63.63% <65.00%> (+7.38%)     6.00 <1.00> (+1.00)
... and 1 more

... and 1 file with indirect coverage changes

Contributor

@mustafauzunn mustafauzunn left a comment

We need to add the new paths to block-node/app/build.gradle.kts to not break the local runs

Contributor

@mustafauzunn mustafauzunn left a comment

We should update docs/block-node/configuration.md with the new configs

@ata-nas ata-nas force-pushed the 1273-blockaccessors-hardening-improved-error-handling branch 2 times, most recently from af5273d to 857c37b Compare November 7, 2025 13:33
@ata-nas
Contributor Author

ata-nas commented Nov 7, 2025

@mustafauzunn thanks for the notes, I've updated configs md + added env vars in build gradle for local builds.

@ata-nas
Contributor Author

ata-nas commented Nov 7, 2025

We could also add tests for:

  1. Clearing the links root path on startup for recents
  2. Clearing the links root path on startup for historic

These have been added.

Contributor

@Nana-EC Nana-EC left a comment

1st pass good improvements, will circle back

Objects.requireNonNull(rootPath);
Objects.requireNonNull(compression);
Preconditions.requireInRange(powersOfTenPerZipFileContents, 1, 6);
Preconditions.requireGreaterOrEqual(blockRetentionThreshold, 0L);
Contributor

Did you want to set @Min on blockRetentionThreshold field then?

Contributor Author

Ah, yes, let me revisit these.

* BlockAccessors are now AutoCloseable
* BlockAccessors now create hard links for the duration of their
  lifespan
  * This allows us to ensure data is accessible for the duration of the
    accessor's life
* Tests are updated and migrated
* Deleted redundant configuration tests
* Added tests:
  * Tests for subsequent reads to ensure that closing an accessor does
    not affect the data or the ability for a new accessor to access it

Signed-off-by: Atanas Atanasov <[email protected]>
@ata-nas ata-nas force-pushed the 1273-blockaccessors-hardening-improved-error-handling branch from 857c37b to 1a31bf8 Compare November 7, 2025 15:06
} catch (final RuntimeException e) {
final String message = "Failed to retrieve block number %d.".formatted(request.blockNumber());
LOGGER.log(ERROR, message, e);
responseCounterNotFound.increment(); // Should this be a failure counter?

I agree with the comment; maybe we should have a failure counter. However, this could be done as part of another issue.

Contributor Author

This comment may be redundant and we simply need to delete it.

@AlfredoG87 what do we mean here by failure counter? Is that something that lives purely in the Grafana UI?

Contributor

The metric should probably be more generic; a failure counter instead of a "not found" counter. That probably works better as a combined "result counter" that counts all results and uses labels to differentiate between success, not found, not available, etc...
That should perhaps wait until we have labels for metrics, however.
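
To make the "combined result counter" idea concrete, here is a minimal sketch that keeps one counter per label. It does not use the project's metrics API; `ResultCounter` and the label strings are hypothetical and only illustrate the shape such a metric could take.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical labelled counter: one count per result label. */
final class ResultCounter {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    /** Increments the count for the given label, creating it on first use. */
    void increment(final String label) {
        counts.computeIfAbsent(label, key -> new LongAdder()).increment();
    }

    /** Returns the count for one label, or 0 if it was never incremented. */
    long get(final String label) {
        final LongAdder adder = counts.get(label);
        return adder == null ? 0L : adder.sum();
    }

    /** Returns the count across all labels. */
    long total() {
        return counts.values().stream().mapToLong(LongAdder::sum).sum();
    }
}
```

With this shape, the `catch` block could call something like `results.increment("failure")` while the normal paths record `"success"`, `"not_found"`, or `"not_available"`, and a dashboard could split or sum them as needed.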

@ConfigData("files.historic")
public record FilesHistoricConfig(
@Loggable @ConfigProperty(defaultValue = "/opt/hiero/block-node/data/historic") Path rootPath,
@Loggable @ConfigProperty(defaultValue = "/opt/hiero/block-node/data/historic-links") Path linksRootPath,
Contributor

@jsync-swirlds jsync-swirlds Nov 7, 2025

Instead of totally new paths, why not use a subfolder of the existing path?
e.g. rootPath + "/accessor-links" would separate the folders but ensure it's on the same physical device (which is a requirement for hard links).

Contributor Author

Initially I did it that way, with the links in a subfolder, but I thought that it could be misused. I think what you are suggesting is that I remove the config property and simply create a subdir? I could do that. Do you believe there is a risk or room for error/misuse, as the data path is quite important?

Contributor

@jsync-swirlds jsync-swirlds Nov 7, 2025

I don't think it (misuse) is something to worry about.
The subfolder and the main blocks folders have to be on the same device, and anything with access to one will likely have access to both.

The benefit of knowing the temp and main storage locations are definitively on the same device is worth more than any misuse concerns, in my opinion.
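
The subfolder suggestion can be sketched as below. The helper class `LinksDirectory` is hypothetical; the `accessor-links` name comes from the suggestion earlier in this thread. The `sameFileStore` check is an optional sanity check added for illustration, since a subfolder could still sit on another device if something is mounted at that path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper that derives the links directory from the blocks root. */
final class LinksDirectory {
    /** Subfolder name taken from the review suggestion. */
    static final String LINKS_SUBFOLDER = "accessor-links";

    /** Creates (if needed) and returns the links directory under the given root. */
    static Path resolveUnder(final Path rootPath) throws IOException {
        final Path links = rootPath.resolve(LINKS_SUBFOLDER);
        Files.createDirectories(links);
        return links;
    }

    /** Hard links require both paths to live on the same file store. */
    static boolean sameFileStore(final Path first, final Path second) throws IOException {
        return Files.getFileStore(first).equals(Files.getFileStore(second));
    }
}
```

Deriving the path this way removes the separate configuration property, so an operator cannot point the links directory at a different volume by mistake.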

Contributor Author

I've removed these tests (and the ones for recent) for configs since I've removed the constructors as well. Nothing could be asserted. Supporting these is a hassle and validations and provision of non-null default values must be done via the config dependency.

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Objects;
import java.util.UUID;
Contributor

UUIDs are costly (and a bit dicey); why not just use a tempfile name (which is guaranteed unique)?
The Files object has methods for creating a tempfile in a specified directory, and that can then be deleted and a hardlink created with the same name (not perfect, but likely better than adding a 128-bit type 4 UUID).
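
A minimal sketch of the tempfile-name approach described above, using `Files.createTempFile`, `Files.delete`, and `Files.createLink`. The class name, method name, prefix, and suffix are hypothetical. As the comment itself notes, this is not perfect: there is a short window between deleting the placeholder and creating the link in which the name is free again.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class TempNamedLinks {
    /**
     * Hypothetical helper: reserves a unique name with a temp file, deletes
     * the placeholder, then creates the hard link under that same name.
     */
    static Path createUniqueLink(final Path linksDir, final Path existing) throws IOException {
        Files.createDirectories(linksDir);
        final Path placeholder = Files.createTempFile(linksDir, "block-", ".link");
        // createLink fails if the link name already exists, so the placeholder goes first.
        // Another process could claim the name in this window.
        Files.delete(placeholder);
        return Files.createLink(placeholder, existing);
    }
}
```

Each call yields a distinct link name within `linksDir` without generating a UUID, and all links point at the same underlying block data.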

return getBytesFromPath(format, entry, compressionType);
} catch (final UncheckedIOException | IOException e) {
LOGGER.log(WARNING, FAILED_TO_READ_MESSAGE.formatted(blockPath), e);
LOGGER.log(WARNING, FAILED_TO_READ_MESSAGE.formatted(zipFileLink), e);
Contributor

@jsync-swirlds jsync-swirlds Nov 7, 2025

I think the link should be a hidden internal; pointing to the block path in logs seems much more useful.
Same comment in the log below (line 135).

Comment on lines +18 to +20
environment("FILES_HISTORIC_LINKS_ROOT_PATH", "${serverDataDir}/files-historic-links")
environment("FILES_RECENT_LIVE_ROOT_PATH", "${serverDataDir}/files-live")
environment("FILES_RECENT_LINKS_ROOT_PATH", "${serverDataDir}/files-live-links")
Contributor

Both of these should be subfolders of the same folder used for the actual blocks. It's safer that way and matches what we're doing in other (similar) cases.

Contributor

Is this class used?

// attempt to clear any existing links root directory
if (Files.exists(linksRoot)) {
try (final Stream<Path> paths = Files.walk(linksRoot)) {
for (final Path path : paths.sorted(Comparator.reverseOrder()).toList()) {
Contributor

Why sort the hardlinks?

final Bytes bytes = blockAccessor.blockBytes(format);
// close the accessor as we are done with it and we need to free
// resources
blockAccessor.close();
Contributor

Might be worth defining the batch as an AutoCloseable and Iterable object (instead of just a list) so that close can be handled automatically in try-with-resources by the caller of this method instead of handling the individual close calls here.

In particular I'm concerned what happens to all the remaining not-yet-complete accessors if this method encounters an exception.
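
One possible shape for such a batch is sketched below. `AccessorBatch` is a hypothetical name, and the sketch assumes that closing an accessor more than once is harmless, since `close()` here closes every element regardless of how far iteration got.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical batch of accessors that closes all of them together. */
final class AccessorBatch<T extends AutoCloseable> implements AutoCloseable, Iterable<T> {
    private final List<T> accessors;

    AccessorBatch(final List<T> accessors) {
        this.accessors = List.copyOf(accessors);
    }

    @Override
    public Iterator<T> iterator() {
        return accessors.iterator();
    }

    /** Closes every accessor, even if some fail; the first failure is rethrown. */
    @Override
    public void close() {
        RuntimeException failure = null;
        for (final T accessor : accessors) {
            try {
                accessor.close();
            } catch (final Exception e) {
                if (failure == null) {
                    failure = new IllegalStateException("Failed to close accessor(s)", e);
                } else {
                    failure.addSuppressed(e);
                }
            }
        }
        if (failure != null) {
            throw failure;
        }
    }
}
```

With `try (AccessorBatch<...> batch = ...) { for (var accessor : batch) { ... } }`, an exception thrown while processing one accessor still closes the remaining ones, which addresses the concern about not-yet-complete accessors.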

// attempt to clear any existing links root directory
if (Files.exists(linksRoot)) {
try (final Stream<Path> paths = Files.walk(linksRoot)) {
for (final Path path : paths.sorted(Comparator.reverseOrder()).toList()) {
Contributor

@jsync-swirlds jsync-swirlds Nov 7, 2025

Why sort the list?
I would think that paths.forEach(path -> Files.delete(path)) would be sufficient...
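
For context on the ordering question: `Files.walk` yields a directory before its contents, and `Files.delete` throws `DirectoryNotEmptyException` for a non-empty directory, so reverse ordering matters whenever the walk includes directories (the root itself at minimum). `Files.delete` also throws a checked `IOException`, so it cannot be passed to `forEach` without wrapping. The sketch below shows the deepest-first pattern; `LinksCleanup` and `deleteTree` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Stream;

final class LinksCleanup {
    /** Deletes the given directory tree, children before parents. */
    static void deleteTree(final Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        final List<Path> deepestFirst;
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse path order puts each entry ahead of the directory containing it.
            deepestFirst = paths.sorted(Comparator.reverseOrder()).toList();
        }
        for (final Path path : deepestFirst) {
            Files.delete(path);
        }
    }
}
```

If only the links inside the root need to go and the root directory itself stays, the entries in a flat directory can be deleted in any order and the sort becomes unnecessary.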

Comment on lines +247 to +249
final BlockFileBlockAccessor accessor =
new BlockFileBlockAccessor(verifiedBlockPath, config, blockNumber);
blocksReadCounter.increment();
Contributor

@jsync-swirlds jsync-swirlds Nov 7, 2025

Thought for a future PR:
What would happen if we added the metric to the accessor constructor here, and let the accessor increment the counter when the block is read? If an accessor might be read repeatedly, we could get more accurate metrics that way.
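
A small sketch of what passing the metric into the accessor might look like. `CountingAccessor` is hypothetical, the `Supplier` stands in for reading the block file, and a `LongAdder` stands in for the project's counter type.

```java
import java.util.Objects;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical accessor that counts reads instead of accessor creations. */
final class CountingAccessor {
    private final Supplier<byte[]> source; // stand-in for reading the block file
    private final LongAdder blocksRead;    // stand-in for the metrics counter

    CountingAccessor(final Supplier<byte[]> source, final LongAdder blocksRead) {
        this.source = Objects.requireNonNull(source);
        this.blocksRead = Objects.requireNonNull(blocksRead);
    }

    byte[] blockBytes() {
        final byte[] bytes = source.get();
        // Counted once per successful read, so repeated reads are all recorded.
        blocksRead.increment();
        return bytes;
    }
}
```

Under this arrangement, creating an accessor that is never read leaves the counter unchanged, while reading the same accessor twice counts twice.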

@ConfigData("files.recent")
public record FilesRecentConfig(
@Loggable @ConfigProperty(defaultValue = "/opt/hiero/block-node/data/live") Path liveRootPath,
@Loggable @ConfigProperty(defaultValue = "/opt/hiero/block-node/data/live-links") Path linksRootPath,
Contributor

Similar to above, this should probably be a hard-coded subfolder of the liveRootPath rather than a separate configured item.

Labels

pull request label to get past the "label required" check when no label is needed or appropriate.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BlockAccessors Hardening & Improved Error Handling

6 participants