Commit 467f70e
Consolidate GPU IVF query tile calculation + special handling for large query memory requirements
Summary:
The GPU IVF code (Flat, SQ and PQ) needs temporary memory to store the unfiltered (or partially filtered) vector distances computed during list scanning; separate kernels then k-select these distances.
A batch of queries may be presented to an IVF index, but the temporary memory needed to hold all of these unfiltered distances before filtering can be very large, depending on IVF characteristics (such as the maximum number of vectors encoded in any one IVF list). In that case we cannot process the entire batch at once and must instead tile over the batch of queries, reusing the temporary memory that we make available for these distances.
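Tiling over the batch can be pictured with a minimal sketch. The helper name `makeQueryTiles` and its return type are illustrative assumptions, not taken from the Faiss source; the sketch only shows how a batch is split into consecutive ranges that each reuse the same temporary buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not a Faiss function): split the query range
// [0, numQueries) into consecutive tiles of at most tileSize queries.
// Each pair is (index of first query in the tile, number of queries in it).
std::vector<std::pair<std::size_t, std::size_t>> makeQueryTiles(
        std::size_t numQueries,
        std::size_t tileSize) {
    std::vector<std::pair<std::size_t, std::size_t>> tiles;
    if (tileSize == 0) {
        return tiles; // nothing sensible to do with an empty tile
    }
    for (std::size_t start = 0; start < numQueries; start += tileSize) {
        // The last tile may be shorter than tileSize.
        tiles.emplace_back(start, std::min(tileSize, numQueries - start));
    }
    return tiles;
}
```

Under this sketch, a caller would run the list scan and k-selection once per tile, writing that tile's distances into the same temporary allocation each time.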
The old code duplicated this roughly equivalent logic in 3 different places (the IVFFlat/SQ code, IVFPQ with precomputed codes, and IVFPQ without precomputed codes). Furthermore, when little or no temporary memory was available, or when the available temporary memory was (vastly) exceeded by the amount needed to handle a particular query, the old code still enforced a minimum of 8 queries processed at once. In certain cases (huge IVF list imbalance), the resulting memory request could exceed the amount of memory that can be safely allocated on a GPU.
This diff consolidates the 3 separate places where this calculation took place into 1 place in IVFUtils. The logic proceeds roughly as before, figuring out how many queries can be processed in the available temporary memory, except that we add a new heuristic for the case where the number of queries that can be processed concurrently falls below 8. This can happen either because little temporary memory is available or because the memory requirements are huge. In this case, we ignore the amount of temporary memory available and instead see how many queries' memory requirements would fit into a single 512 MiB memory allocation, which puts a reasonable cap on this amount. If even a single query cannot be satisfied within this allocation, we still proceed, executing 1 query at a time (note that this could still exhaust GPU memory, but that error is unavoidable).
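The heuristic described above can be sketched as follows. This is an illustration built only from the numbers in this message (8 queries, 512 MiB); the function name `chooseQueryTileSize`, the constant names, and the final clamp to the batch size are assumptions, and the actual code in IVFUtils may differ in its details.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical constants mirroring the values named in the commit message.
constexpr std::size_t kMinQueryTile = 8;
constexpr std::size_t kFallbackAllocBytes = std::size_t(512) * 1024 * 1024;

// Hypothetical helper (not the actual Faiss function): pick how many
// queries to process at once, given the per-query temporary memory need.
std::size_t chooseQueryTileSize(
        std::size_t numQueries,
        std::size_t tempMemAvailableBytes,
        std::size_t bytesPerQuery) {
    assert(numQueries > 0 && bytesPerQuery > 0);

    // First attempt: how many queries fit in the available temporary memory?
    std::size_t tile = tempMemAvailableBytes / bytesPerQuery;

    if (tile < kMinQueryTile) {
        // Too few: ignore the temporary memory and size the tile against a
        // single 512 MiB allocation, never going below 1 query at a time.
        tile = std::max<std::size_t>(kFallbackAllocBytes / bytesPerQuery, 1);
    }

    // Assumption: never tile wider than the batch itself.
    return std::min(tile, numQueries);
}
```

For example, with 4 MiB of temporary memory and 1 MiB needed per query, only 4 queries fit, so the sketch falls back to the 512 MiB budget and returns 512 for a batch of 1000. With 1 GiB needed per query, it returns 1.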
A different heuristic based on the amount of memory actually allocatable on the device could be used instead of this fixed 512 MiB amount. However, to my knowledge there is no guarantee that a single cudaMalloc up to that limit would succeed (e.g., the GPU reports 3 GiB available and you attempt to allocate all of it in a single allocation), so we just pick an amount that is a reasonable balance between efficiency (parallelism) and memory consumption. Note that if not enough temporary memory is available and a single 512 MiB allocation fails, there is likely too little memory to proceed efficiently under any scenario, as Faiss does require some headroom in terms of memory available for scratch spaces.
Reviewed By: mdouze
Differential Revision: D45574455
fbshipit-source-id: 08f5204e3e9656627c9134d7409b9b0960f07b2d
1 parent 411c172 commit 467f70e
5 files changed, +185 -88 lines changed (faiss/gpu/impl)