
Reduce NeighborArray heap memory #14527

Merged
benwtrent merged 15 commits into apache:main from weizijun:reduce-NeighborArray-memory
May 27, 2025

Conversation

@weizijun
Contributor

When BBQ is used with Lucene, a single data node can hold more vector data, so when many shards are merged concurrently, heap memory usage becomes very high.
I found that NeighborArray objects were taking up a lot of memory, and that the number of neighbors per node almost never reaches maxSize; it typically uses only about 1/3 or 1/4 of maxSize.
Therefore, I use FloatArrayList/IntArrayList to replace float[]/int[], which significantly reduces heap memory usage.
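The idea can be shown with a minimal standalone sketch (illustrative names only; the actual patch uses Lucene's internal IntArrayList/FloatArrayList): the backing arrays start small and grow on demand, instead of pre-allocating maxSize slots per node.

```java
// Hypothetical sketch, not the Lucene code: a neighbor list whose backing
// arrays grow lazily, so heap usage tracks occupancy instead of maxSize.
import java.util.Arrays;

class GrowableNeighbors {
    private final int maxSize; // hard cap on neighbor count
    private int[] nodes;       // grows lazily, like IntArrayList
    private float[] scores;    // grows lazily, like FloatArrayList
    private int size;

    GrowableNeighbors(int maxSize) {
        this.maxSize = maxSize;
        // start small; most nodes only ever use ~1/3 of maxSize
        int init = Math.max(1, maxSize / 8);
        this.nodes = new int[init];
        this.scores = new float[init];
    }

    void add(int node, float score) {
        if (size == maxSize) throw new IllegalStateException("full");
        if (size == nodes.length) {
            // grow roughly 1.5x, but never past maxSize
            int newLen = Math.min(maxSize, nodes.length + (nodes.length >> 1) + 1);
            nodes = Arrays.copyOf(nodes, newLen);
            scores = Arrays.copyOf(scores, newLen);
        }
        nodes[size] = node;
        scores[size] = score;
        size++;
    }

    int allocated() { return nodes.length; }
    int size() { return size; }
}
```

With maxSize = 64 but only ~20 neighbors actually inserted, the allocated buffer stays far below 64, which is the effect visible in the jmap histograms below.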

Here is a comparison of jmap -histo results (with m = 64):
before:

 num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:      11443026     6396808120  [F ([email protected])
   2:      11387631     6129931608  [I ([email protected])
   3:       3265644     1319152760  [B ([email protected])
   4:      11308339      361866848  org.apache.lucene.util.hnsw.NeighborArray ([email protected])
   5:      11134203      267240168  [Lorg.apache.lucene.util.hnsw.NeighborArray; ([email protected])
   6:            77       57916272  [[Lorg.apache.lucene.util.hnsw.NeighborArray; ([email protected])
   7:       2404231       57701544  java.lang.String ([email protected])
   8:         34911       42546120  Ljdk.internal.vm.FillerArray; ([email protected])
   9:        772788       30911520  org.nlpcn.commons.lang.tire.domain.SmartForest
  10:        113758       19111344  org.apache.lucene.codecs.lucene90.blocktree.SegmentTermsEnumFrame ([email protected])
  11:        545656       17460992  java.util.HashMap$Node ([email protected])

after:

num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:       9228299     1612257464  [F ([email protected])
   2:       9246406     1402537720  [I ([email protected])
   3:       3279264     1141869192  [B ([email protected])
   4:       9124020      364960800  org.apache.lucene.util.hnsw.NeighborArray ([email protected])
   5:       9124036      218976864  org.apache.lucene.internal.hppc.FloatArrayList ([email protected])
   6:       9124036      218976864  org.apache.lucene.internal.hppc.IntArrayList ([email protected])
   7:       8983027      215608448  [Lorg.apache.lucene.util.hnsw.NeighborArray; ([email protected])
   8:       2492594       59822256  java.lang.String ([email protected])
   9:            56       51013776  [[Lorg.apache.lucene.util.hnsw.NeighborArray; ([email protected])
  10:        772788       30911520  org.nlpcn.commons.lang.tire.domain.SmartForest
  11:         68970       28703992  Ljdk.internal.vm.FillerArray; ([email protected])

The avg size of a float[] is 559 bytes before and 174 bytes after.

The avg size of an int[] is 538 bytes before and 151 bytes after.

I tested some datasets, such as GIST (100K vectors, 960 dimensions) and LAION (100M vectors, 768 dimensions); they show similar results.
I haven't tested performance very rigorously, but this modification appears to have no impact on performance.

@weizijun
Contributor Author

TestHnswFloatVectorGraph.testRamUsageEstimate may fail, because OnHeapHnswGraph.ramBytesUsed uses the fixed array size to calculate the RAM value.

@benwtrent
Member

We need to make sure that there are no significant performance or concurrency bugs introduced with this. Could you test with https://github.com/mikemccand/luceneutil to verify recall, multi-threaded merge, etc. ?

As for the ram usage test, it should be fixable by figuring out the number of connections per node and giving a valid range (e.g. RAM usage is more than X but less than Y).
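The range-style assertion suggested here could look roughly like this (a standalone sketch; the class, method, and bounds are illustrative, not Lucene test code): bound the estimate between the footprint with every node at its initial capacity and the footprint with every node fully connected.

```java
// Hypothetical range check for a RAM estimate: with growable neighbor
// lists, the exact byte count varies, but it must lie between the
// all-at-initial-capacity minimum and the all-at-maxConn maximum.
class RamRangeCheck {
    static boolean withinRange(long actual, long nodeCount, int initCap, int maxConn) {
        long perEntry = Integer.BYTES + Float.BYTES; // node id + score
        long min = nodeCount * initCap * perEntry;   // every node at initial size
        long max = nodeCount * maxConn * perEntry;   // every node fully connected
        return actual >= min && actual <= max;
    }
}
```

A test would then assert `withinRange(graph.ramBytesUsed(), ...)` rather than comparing against one exact number.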

@jainankitk
Contributor

We need to make sure that there are no significant performance or concurrency bugs introduced with this. Could you test with https://github.com/mikemccand/luceneutil to verify recall, multi-threaded merge, etc. ?

+1 on the performance. Using IntArrayList instead of int[] can impact the performance due to the overhead of copying while growing the array.

Although, I am not completely sure if this change should cause any concurrency bugs compared to original. @benwtrent - Do you have anything specific in mind, that I might be missing?

@benwtrent
Member

Do you have anything specific in mind, that I might be missing?

I am just being cautious. I know we provide locking, etc. for the concurrent graph builder, but maybe it's taking advantage of the arrays never growing by accident.

I hate concurrent code, and it always makes me paranoid that we will miss something :).

@weizijun
Contributor Author

weizijun commented Apr 22, 2025

I tested this case in Elasticsearch and it worked fine, probably because in Elasticsearch the graph-builder merges run in a single merge thread.
I will test more cases that use the concurrent graph builder.

@jainankitk
Contributor

I am just being cautious. I know we provide locking, etc. for the concurrent graph builder, but maybe its taking advantage of arrays never growing by accident.

Thanks for elaborating. That makes sense!

I hate concurrent code, and it always makes me paranoid that we will miss something :).

While I don't hate concurrent code, I do share your paranoia regarding missing something! :)

@weizijun
Contributor Author

We need to make sure that there are no significant performance or concurrency bugs introduced with this. Could you test with https://github.com/mikemccand/luceneutil to verify recall, multi-threaded merge, etc. ?

hi, @benwtrent , could you tell me which benchmark tool can test this case?

@benwtrent
Member

@weizijun lucene-util is the benchmark tool I linked. knnPerfTest is the script.

@weizijun
Contributor Author

lucene-util is the benchmark tool I linked. knnPerfTest is the script.

Yeah, I know the tool, but I didn't know how to test KNN with it. knnPerfTest only tests the lucene_candidate, so I modified some code to test both.
The lucene_baseline result (before the change):

Results:
recall  latency(ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.796       10.284  500000   100      50       64        250         no    180.51       2769.99             8         1501.40      1464.844     1464.844       HNSW

The lucene_candidate result (after the change):

Results:
recall  latency(ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.820       10.712  500000   100      50       64        250         no    187.26       2670.16             8         1501.46      1464.844     1464.844       HNSW

@benwtrent
Member

@weizijun I am running some benchmarking as well. But the key thing is to update the parameters in knnPerfTest.py, making sure numMergeWorker and numMergeThread are greater than 1, etc.

@weizijun
Contributor Author

@benwtrent I see the default parameters:

  "numMergeWorker": (12,),
  "numMergeThread": (4,),

Is the current merger effective with these parameters?

Member

@benwtrent benwtrent left a comment

I ran some benchmarking and this seems OK and doesn't seem to break multi-threadedness.

I left a comment on the underlying structures used.

Please update CHANGES.txt under 10.3 optimizations for this nice optimization! It will be very nice to have better heap utilization on graph building!

Comment on lines +43 to +44
nodes = new IntArrayList();
scores = new FloatArrayList();
Member

Two things: I think these should be initialized to something like maxSize/4 or maxSize/8.

Additionally, the array lists should enforce the max size and ensure the underlying buffer does not get bigger than the expected max. I think you can do this pretty simply by having something like MaxSizedIntArrayList that accepts an init & max size parameters in its ctor, and overrides ensureBufferSpace so that it:

  • disallows growth past maxSize
  • Instead of doing ArrayUtil.grow it does ArrayUtil.growInRange(buffer, elementsCount + expectedAdditions, maxSize)

This will prevent overallocation of the underlying buffer and enforce max size limitations.
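A rough standalone sketch of that growth rule (the class is illustrative; the real change would extend Lucene's internal IntArrayList and use ArrayUtil.growInRange):

```java
// Hypothetical sketch of a max-size-enforcing int list: exponential growth
// for amortized O(1) appends, but the buffer is clamped to maxSize and
// growth past maxSize is disallowed.
import java.util.Arrays;

class MaxSizedIntList {
    private final int maxSize;
    private int[] buffer;
    private int count;

    MaxSizedIntList(int initialCapacity, int maxSize) {
        this.maxSize = maxSize;
        this.buffer = new int[Math.min(initialCapacity, maxSize)];
    }

    // mirrors ensureBufferSpace: guarantee room for expectedAdditions,
    // but never let the buffer exceed maxSize
    void ensureBufferSpace(int expectedAdditions) {
        int required = count + expectedAdditions;
        if (required > maxSize) {
            throw new IllegalStateException(
                "requested " + required + " > maxSize " + maxSize);
        }
        if (required > buffer.length) {
            // like ArrayUtil.growInRange(buffer, required, maxSize):
            // grow ~1.5x, clamped so we never over-allocate
            int grown = Math.max(required, buffer.length + (buffer.length >> 1));
            buffer = Arrays.copyOf(buffer, Math.min(grown, maxSize));
        }
    }

    void add(int v) {
        ensureBufferSpace(1);
        buffer[count++] = v;
    }

    int bufferLength() { return buffer.length; }
    int size() { return count; }
}
```

The clamp in the grow step is what prevents a 1.5x growth from overshooting maxSize, which is exactly the over-allocation a plain ArrayUtil.grow would produce.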

Contributor Author

done

@weizijun
Contributor Author

I left a comment on the underlying structures used.

Please update CHANGES.txt under 10.3 optimizations for this nice optimization! It will be very nice to have better heap utilization on graph building!

OK, and about OnHeapHnswGraph.ramBytesUsed: I didn't change that code, but its logic may need to change.

@benwtrent
Member

OK, and about OnHeapHnswGraph.ramBytesUsed: I didn't change that code, but its logic may need to change.

I think utilizing the underlying array estimations is what we need and the testing needs to be updated to assert within a valid range.

@msokolov
Contributor

re: concurrency; in theory it should be safe since we lock a row before inserting anything into it. Consider that even with fixed-size arrays we need to track the occupancy (how full the array is) in a thread-safe way.

@weizijun
Contributor Author

I added the change log and changed the initial size of the list to maxSize/8.

@weizijun
Contributor Author

Details of the TestHnswFloatVectorGraph.testRamUsageEstimate failure:

The seed is -Dtests.seed=FEF2FF50C6824DEF

Expected :1723096.0
Actual   :3785393.0

java.lang.AssertionError: expected:<1723096.0> but was:<3785393.0>
	at __randomizedtesting.SeedInfo.seed([FEF2FF50C6824DEF:3BB69B10FD099A58]:0)
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.failNotEquals(Assert.java:835)
	at org.junit.Assert.assertEquals(Assert.java:555)
	at org.junit.Assert.assertEquals(Assert.java:685)
	at org.apache.lucene.util.hnsw.HnswGraphTestCase.testRamUsageEstimate(HnswGraphTestCase.java:804)
	at org.apache.lucene.util.hnsw.TestHnswFloatVectorGraph.testRamUsageEstimate(TestHnswFloatVectorGraph.java:39)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1763)

If I change OnHeapHnswGraph.ramBytesUsed to use the real memory of each NeighborArray, performance drops a lot.

* main: (51 commits)
  Fix ECJ compiler config for Java 24 (also affects Eclipse IDE)
  Correct shebang in gradlew. apache#14592
  Overwrite gradlew scripts with up-to-date ones and apply Lucene customizations. (apache#14592)
  Remove abstractions of MMapDirectory and its "fake" MR-JAR support and move it to main sourceSet (vectors untouched) (apache#14564)
  deps(java): bump org.gradle.toolchains.foojay-resolver-convention (apache#14573)
  deps(java): bump com.gradle.develocity from 3.18.2 to 4.0.1 (apache#14585)
  deps(java): bump nl.littlerobots.version-catalog-update (apache#14568)
  Bump expected base Java version to 24. apache#14533
  Create file open hints on IOContext to replace ReadAdvice (apache#14482)
  deps(java): bump com.github.ben-manes.versions from 0.51.0 to 0.52.0 (apache#14574)
  deps(java): bump org.owasp.dependencycheck from 7.2.0 to 12.1.1 (apache#14584)
  deps(java): bump org.eclipse.jgit:org.eclipse.jgit (apache#14579)
  deps(java): bump com.diffplug.spotless from 6.9.1 to 7.0.3 (apache#14588)
  deps(java): bump com.gradle.common-custom-user-data-gradle-plugin (apache#14587)
  deps(java): bump net.ltgt.errorprone from 4.1.0 to 4.2.0 (apache#14567)
  deps(java): bump org.locationtech.jts:jts-core from 1.17.0 to 1.20.0 (apache#14586)
  deps(java): bump org.apache.opennlp:opennlp-tools from 2.5.3 to 2.5.4 (apache#14580)
  deps(java): bump asm from 9.6 to 9.8 (apache#14578)
  deps(java): bump commons-codec:commons-codec from 1.17.2 to 1.18.0 (apache#14581)
  deps(java): bump net.java.dev.javacc:javacc from 7.0.12 to 7.0.13 (apache#14576)
  ...

# Conflicts:
#	lucene/CHANGES.txt
@benwtrent
Member

@weizijun

It should be fixed. Having an estimation that is more than 2x off is pretty bad.

This estimate is used to determine how often flushes should occur, etc.

There are a couple of ways this can be fixed.

A simple way could be providing an optional callback to NeighborArray that accesses a package-private method on OnHeapHnswGraph, allowing each array's individual estimate to be adjusted during array growth.

NeighborArray(OnHeapHnswGraph::updateEstimate...) or something. Then the RAM estimation in OnHeapHnswGraph becomes the accumulation of those estimates as the inner estimates evolve.

We need to be cautious with multi-threadedness here, since many node updates could be occurring at a time. So this inner accumulator likely needs to be a LongAccumulator, and it should also assert that it's always a positive number.

Please also adjust the inner arrays to enforce their maximal length. This way we never over-allocate.
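The callback idea above could be sketched roughly as follows (all names here are hypothetical, not the committed Lucene API; a LongAdder stands in for the suggested thread-safe accumulator): each neighbor list reports the byte delta of its buffer growth to a shared accumulator, so the graph-level estimate tracks actual allocations without a full rescan.

```java
// Hypothetical sketch: neighbor lists push allocation deltas to a shared
// thread-safe accumulator so the graph's RAM estimate evolves with growth.
import java.util.concurrent.atomic.LongAdder;
import java.util.function.LongConsumer;

class RamTrackedList {
    private int capacity;
    private final LongConsumer onGrow; // e.g. graph::updateEstimate

    RamTrackedList(int initialCapacity, LongConsumer onGrow) {
        this.capacity = initialCapacity;
        this.onGrow = onGrow;
        onGrow.accept((long) initialCapacity * Integer.BYTES);
    }

    void growTo(int newCapacity) {
        // report only the delta, so concurrent lists accumulate safely
        onGrow.accept((long) (newCapacity - capacity) * Integer.BYTES);
        capacity = newCapacity;
    }
}

class GraphRamEstimate {
    private final LongAdder bytes = new LongAdder();

    void updateEstimate(long delta) {
        bytes.add(delta);
    }

    long ramBytesUsed() {
        long v = bytes.sum();
        assert v >= 0 : "estimate must stay positive";
        return v;
    }
}
```

ramBytesUsed then stays O(1), avoiding the per-query rescan of every NeighborArray that was observed to hurt performance.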

@weizijun
Contributor Author

weizijun commented May 6, 2025

Hi @benwtrent, I updated the ramBytesUsed method; the memory estimate is correct now.

Please also adjust the inner arrays to enforce their maximal length. This way we never over-allocate.

I use the maxSize variable to ensure that the elementsCount of the array cannot exceed the maxSize value.

@benwtrent
Member

I added the MaxSizedIntArrayList class to solve the oversize problem.

You need one for scores as well. That MaxSizedIntArrayList looks good :)

@weizijun
Contributor Author

weizijun commented May 8, 2025

You need one for scores as well. That MaxSizedIntArrayList looks good :)

I didn't add a MaxSizedFloatArrayList because the int array is exposed via the nodes() method, while the float array of scores is never exposed. Do I need to change FloatArrayList to MaxSizedFloatArrayList?

@benwtrent
Member

But the float array of scores is never exposed. Do I need to change FloatArrayList to MaxSizedFloatArrayList?

I don't understand this. How does the FloatArrayList bypass the overallocation possible via ArrayUtil.grow? It's the same structure, right? And for every node there is a score.

@weizijun
Contributor Author

weizijun commented May 8, 2025

I don't understand this. How does the FloatArrayList bypass the overallocation possible via ArrayUtil.grow? It's the same structure, right? And for every node there is a score.

maxSize controls the total number of stored values, so even if ArrayUtil.grow allocates a buffer longer than maxSize, the element count will not exceed maxSize.

However, the nodes() method of NeighborArray returns nodes.buffer directly, so maxSize does not constrain the int array's length. Therefore, a MaxSizedIntArrayList class is needed to ensure that the length of the int array cannot exceed maxSize.

@benwtrent
Member

maxSize controls the total number of stored values, so even if ArrayUtil.grow allocates a buffer longer than maxSize, the element count will not exceed maxSize.

That is frankly not true for FloatArrayList. ArrayUtil.grow can easily cause the underlying buffer to exceed the intended maximum length.

I don't understand the argument against providing actual max size restricted float array list. Is there a reason to not do this?

weizijun added 4 commits May 15, 2025 14:27
* main: (31 commits)
  Fix termination condition in TestStressNRTReplication. (apache#14665)
  deps(java): bump com.gradle.develocity from 3.19 to 3.19.2 (apache#14662)
  Build: remove hard-coded Java versions from ecj.javadocs.prefs (apache#14651)
  Update verifier comment to show label (apache#14658)
  Catch and re-throw Throwable rather than using a success boolean (apache#14633)
  Mention label in changelog verifier comment (apache#14656)
  Enable PR actions in changelog verifier (apache#14644)
  Fix FuzzySet#getEstimatedNumberUniqueValuesAllowingForCollisions to properly account for hashCount (apache#14614)
  Don't perform additional KNN querying after timeout, fixes apache#14639 (apache#14640)
  Add instructions to help/IDEs.txt for VSCode and Neovim (apache#14646)
  build(deps): bump ruff from 0.11.7 to 0.11.8 in /dev-tools/scripts (apache#14603)
  deps(java): bump de.jflex:jflex from 1.8.2 to 1.9.1 (apache#14583)
  Use the preload hint on completion fields and memory terms dictionaries. (apache#14634)
  Clean up FileTypeHint a bit. (apache#14635)
  Expressions: Improve test to use a fully private class or method
  Remove deprecations in expressions (apache#14641)
  removing constructor with deprecated attribute 'onlyLongestMatch (apache#14356)
  Moving CHANGES entry for apache#14609 from 11.0 to 10.3 (apache#14638)
  Overrides rewrite in PointRangeQuery to optimize AllDocs/NoDocs cases (apache#14609)
  Adding benchmark for histogram collector over point range query (apache#14622)
  ...

# Conflicts:
#	lucene/CHANGES.txt
@github-actions github-actions bot added this to the 10.3.0 milestone May 16, 2025
@weizijun
Contributor Author

That is frankly not true for FloatArrayList. ArrayUtil.grow can easily cause the underlying buffer to exceed the intended maximum length.

I don't understand the argument against providing actual max size restricted float array list. Is there a reason to not do this?

@benwtrent Okay, I added MaxSizedFloatArrayList. Can you review the PR again?

* main: (32 commits)
  update os.makedirs with pathlib mkdir (apache#14710)
  Optimize AbstractKnnVectorQuery#createBitSet with intoBitset (apache#14674)
  Implement #docIDRunEnd() on PostingsEnum. (apache#14693)
  Speed up TermQuery (apache#14709)
  Refactor main top-n bulk scorers to evaluate hits in a more term-at-a-time fashion. (apache#14701)
  Fix WindowsFS test failure seen on Policeman Jenkins (apache#14706)
  Use a temporary repository location to download certain ecj versions ("drops") (apache#14703)
  Add assumption to ignore occasional test failures due to disconnected graphs (apache#14696)
  Return MatchNoDocsQuery when IndexOrDocValuesQuery::rewrite does not match (apache#14700)
  Minor access modifier adjustment to a couple of lucene90 backward compat types (apache#14695)
  Speed up exhaustive evaluation. (apache#14679)
  Specify and test that IOContext is immutable (apache#14686)
  deps(java): bump org.gradle.toolchains.foojay-resolver-convention (apache#14691)
  deps(java): bump org.eclipse.jgit:org.eclipse.jgit (apache#14692)
  Clean up how the test framework creates asserting scorables. (apache#14452)
  Make competitive iterators more robust. (apache#14532)
  Remove DISIDocIdStream. (apache#14550)
  Implement AssertingPostingsEnum#intoBitSet. (apache#14675)
  Fix patience knn queries to work with seeded knn queries (apache#14688)
  Added toString() method to BytesRefBuilder (apache#14676)
  ...
Member

@benwtrent benwtrent left a comment

This is looking great!

One small comment on updating the .gitignore. But once that is reverted to be the same as on main, then I can merge and backport.

Thank you for all the iterations and such!

.gitignore Outdated
Comment on lines +35 to +41

# Java class files
*.class

# Ignore bin directories
bin/
**/bin/
Member

If you think these need updated, could you do it in a separate PR? I would like to keep this change restricted to HNSW.

Contributor Author

Oh, sorry, that was a mistake, I'll delete it.

Member

@benwtrent benwtrent left a comment

great stuff! I will merge and backport.

@benwtrent benwtrent merged commit 84b357c into apache:main May 27, 2025
7 checks passed
benwtrent pushed a commit that referenced this pull request May 27, 2025
* use FloatArrayList\IntArrayList to replace float[]\int[]

* use getScores(int i) to replace scores()

* add more tests

* add change log and change the init value

* update OnHeapHnswGraph ramBytesUsed method

* improve

* add MaxSizedIntArrayList

* add MaxSizedFloatArrayList

* add MaxSizedFloatArrayList

* fixed tests

* revert
@weizijun
Contributor Author

Here are the statistics for an HNSW graph of 1,000,000 vectors, with m = 16 and ef = 100:
Level count = 5:
Level count = 5:

level: 0, node count: 1000000
level: 1, node count: 62835
level: 2, node count: 3926
level: 3, node count: 235
level: 4, node count: 12

Average number of neighbors per level:

level: 0, avg neighbor count: 8.026909
level: 1, avg neighbor count: 7.539985676772499
level: 2, avg neighbor count: 8.596535914416709
level: 3, avg neighbor count: 8.353191489361702
level: 4, avg neighbor count: 3.8333333333333335

The detailed neighbor-count distribution per level:
level: 0

level: 0, neighbor count: 1, node count: 141664
level: 0, neighbor count: 2, node count: 130484
level: 0, neighbor count: 3, node count: 111485
level: 0, neighbor count: 4, node count: 91141
level: 0, neighbor count: 5, node count: 72929
level: 0, neighbor count: 6, node count: 59030
level: 0, neighbor count: 7, node count: 47796
level: 0, neighbor count: 8, node count: 39864
level: 0, neighbor count: 9, node count: 33320
level: 0, neighbor count: 10, node count: 27923
level: 0, neighbor count: 11, node count: 23972
level: 0, neighbor count: 12, node count: 20777
level: 0, neighbor count: 13, node count: 17986
level: 0, neighbor count: 14, node count: 15510
level: 0, neighbor count: 15, node count: 13725
level: 0, neighbor count: 16, node count: 12296
level: 0, neighbor count: 17, node count: 10947
level: 0, neighbor count: 18, node count: 9826
level: 0, neighbor count: 19, node count: 8765
level: 0, neighbor count: 20, node count: 7947
level: 0, neighbor count: 21, node count: 7348
level: 0, neighbor count: 22, node count: 6639
level: 0, neighbor count: 23, node count: 6045
level: 0, neighbor count: 24, node count: 5413
level: 0, neighbor count: 25, node count: 5101
level: 0, neighbor count: 26, node count: 4569
level: 0, neighbor count: 27, node count: 4105
level: 0, neighbor count: 28, node count: 3965
level: 0, neighbor count: 29, node count: 3564
level: 0, neighbor count: 30, node count: 3330
level: 0, neighbor count: 31, node count: 3019
level: 0, neighbor count: 32, node count: 49515

level: 1

level: 1, neighbor count: 1, node count: 6760
level: 1, neighbor count: 2, node count: 6707
level: 1, neighbor count: 3, node count: 6127
level: 1, neighbor count: 4, node count: 5277
level: 1, neighbor count: 5, node count: 4420
level: 1, neighbor count: 6, node count: 3805
level: 1, neighbor count: 7, node count: 3321
level: 1, neighbor count: 8, node count: 2827
level: 1, neighbor count: 9, node count: 2502
level: 1, neighbor count: 10, node count: 2093
level: 1, neighbor count: 11, node count: 1849
level: 1, neighbor count: 12, node count: 1645
level: 1, neighbor count: 13, node count: 1521
level: 1, neighbor count: 14, node count: 1257
level: 1, neighbor count: 15, node count: 1163
level: 1, neighbor count: 16, node count: 11561

level: 2

level: 2, neighbor count: 1, node count: 298
level: 2, neighbor count: 2, node count: 302
level: 2, neighbor count: 3, node count: 309
level: 2, neighbor count: 4, node count: 278
level: 2, neighbor count: 5, node count: 267
level: 2, neighbor count: 6, node count: 251
level: 2, neighbor count: 7, node count: 196
level: 2, neighbor count: 8, node count: 209
level: 2, neighbor count: 9, node count: 178
level: 2, neighbor count: 10, node count: 159
level: 2, neighbor count: 11, node count: 153
level: 2, neighbor count: 12, node count: 134
level: 2, neighbor count: 13, node count: 125
level: 2, neighbor count: 14, node count: 75
level: 2, neighbor count: 15, node count: 106
level: 2, neighbor count: 16, node count: 886

level: 3

level: 3, neighbor count: 1, node count: 18
level: 3, neighbor count: 2, node count: 14
level: 3, neighbor count: 3, node count: 11
level: 3, neighbor count: 4, node count: 14
level: 3, neighbor count: 5, node count: 17
level: 3, neighbor count: 6, node count: 20
level: 3, neighbor count: 7, node count: 19
level: 3, neighbor count: 8, node count: 11
level: 3, neighbor count: 9, node count: 23
level: 3, neighbor count: 10, node count: 12
level: 3, neighbor count: 11, node count: 12
level: 3, neighbor count: 12, node count: 9
level: 3, neighbor count: 13, node count: 7
level: 3, neighbor count: 14, node count: 10
level: 3, neighbor count: 15, node count: 4
level: 3, neighbor count: 16, node count: 34

level: 4

level: 4, neighbor count: 1, node count: 1
level: 4, neighbor count: 2, node count: 2
level: 4, neighbor count: 3, node count: 5
level: 4, neighbor count: 5, node count: 2
level: 4, neighbor count: 8, node count: 2

@weizijun weizijun deleted the reduce-NeighborArray-memory branch May 28, 2025 02:43
@jainankitk
Contributor

Thanks @benwtrent and @weizijun for seeing this through!

@ChrisHegarty ChrisHegarty modified the milestones: 10.3.0, 9.12.2 Jun 11, 2025
benwtrent pushed a commit that referenced this pull request Jun 12, 2025
benwtrent pushed a commit that referenced this pull request Jun 12, 2025