Skip to content

Conversation

@gammazero
Copy link
Contributor

@gammazero gammazero commented Aug 1, 2024

When the chunker is not able to fill a chunk with data, it allocated a new buffer for the partial chunk of data. With many files this results in allocation of many small buffers of varying sizes, leading to heap fragmentation. This PR allocates a new buffer from the pool only if doing so would save space, otherwise it uses an partially filled (over-allocated) chunk. This makes all chunker allocation sizes be powers of 2.

Heap fragmentation is reduced at the cost of some temporary over-allocation. The advantage is that the overallocation is much shorter lived than the heap fragmentation.

Possible fix for issue #647

When the chunker is not able to fill a chunk with data, it allocated a new buffer for the partial chunk of data. With many files this results in allocation of many small buffers of varying sizes, leading to heap fragmentation. This PR allocates a new buffer from the pool only if doing so would save space, otherwise it uses an partially filled (over-allocated) chunk. This makes all chunker allocation sizes be powers of 2.

Heap fragmentation is reduced at the cost of some temporary over-allocation. The advantage is that the overallocation is much shorter lived than the heap fragmentation.
@gammazero gammazero requested a review from a team as a code owner August 1, 2024 20:31
@codecov
Copy link

codecov bot commented Aug 1, 2024

Codecov Report

Attention: Patch coverage is 85.18519% with 4 lines in your changes missing coverage. Please review.

Project coverage is 59.92%. Comparing base (88beadf) to head (a5a4645).
Report is 2 commits behind head on main.

Files Patch % Lines
chunker/splitting.go 85.18% 3 Missing and 1 partial ⚠️

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #649      +/-   ##
==========================================
+ Coverage   59.89%   59.92%   +0.02%     
==========================================
  Files         237      237              
  Lines       29955    29972      +17     
==========================================
+ Hits        17942    17960      +18     
+ Misses      10398    10396       -2     
- Partials     1615     1616       +1     
Files Coverage Δ
chunker/splitting.go 93.75% <85.18%> (-6.25%) ⬇️

... and 15 files with indirect coverage changes

@gammazero gammazero mentioned this pull request Aug 1, 2024
@lidel lidel mentioned this pull request Aug 2, 2024
32 tasks
@gammazero gammazero marked this pull request as draft August 3, 2024 13:07
@gammazero gammazero marked this pull request as ready for review August 6, 2024 01:37
@lidel lidel changed the title Reduce overall chunker memory use fix(unixfs): reduce overall chunker memory use Aug 6, 2024
@lidel lidel changed the title fix(unixfs): reduce overall chunker memory use fix(chunker): reduce overall memory use Aug 6, 2024
@lidel lidel changed the title fix(chunker): reduce overall memory use refactor(chunker): reduce overall memory use Aug 6, 2024
Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Smoke-tested in ipfs/kubo#10485 as well, lgtm.
Let's see how it behaves in kubo 0.30.0-rc1

@lidel lidel merged commit 3cd3857 into main Aug 21, 2024
@lidel lidel deleted the fix/reduce-chunker-mem branch August 21, 2024 00:19
lidel added a commit to ipfs/kubo that referenced this pull request Aug 21, 2024
lidel added a commit to ipfs/kubo that referenced this pull request Aug 21, 2024
wenyue pushed a commit to wenyue/boxo that referenced this pull request Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants