Skip to content

Azure AI Search: Add support for batching save operations#438

Merged
dluc merged 2 commits intomicrosoft:mainfrom
marcominerva:aisearch-batch
May 11, 2024
Merged

Azure AI Search: Add support for batching save operations#438
dluc merged 2 commits intomicrosoft:mainfrom
marcominerva:aisearch-batch

Conversation

@marcominerva
Copy link
Copy Markdown
Contributor

@marcominerva marcominerva commented Apr 26, 2024

Motivation and Context (Why the change? What's the scenario?)

Add support for batch support in Azure AI Search and SaveRecordsHandler.

High level description (Approach, Design)

  • New KernelMemory:DataIngestion:MemoryDbUpsertBatchSize setting (default: 1) to manage memory records batch size.
  • Implement the new optional IMemoryDbBatchUpsert interface in AzureAISearchMemory.
  • Fix old bug in AzureAISearchMemory occurring when EmbeddingGenerationEnabled is false.

@marcominerva marcominerva requested a review from dluc as a code owner April 26, 2024 09:26
@dluc dluc changed the title Add support for batching save operations Azure AI Search: Add support for batching save operations Apr 26, 2024
Copy link
Copy Markdown
Collaborator

@dluc dluc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few asks:

  • We need to ship the Abstractions nuget with the new interface, before we can use it in other places. We'll need a separate PR focused only on the "Abstractions" project changes, merge it, release the nuget, etc.

  • I would prefer splitting the two interfaces, not using inheritance, e.g. class AzureAISearchMemory : IMemoryDb, IMemoryDbBatchUpsert

  • 'UpsertAsync' can be refactored to use BatchUpsertAsync, avoiding duplicate code

  • I would rename the new interface to IMemoryDbBatchUpsert and the method to BatchUpsertAsync

  • I believe some code is missing in the handler, see comment

@dluc dluc added the waiting for author Waiting for author to reply or address comments label Apr 26, 2024
@dluc
Copy link
Copy Markdown
Collaborator

dluc commented Apr 27, 2024

I added and released the new interface, you should be able to move forward with this PR without creating another one

@marcominerva marcominerva requested a review from dluc April 29, 2024 08:36
@marcominerva marcominerva requested a review from dluc April 30, 2024 07:57
@dluc
Copy link
Copy Markdown
Collaborator

dluc commented May 1, 2024

next steps:

  • rebase on latest main (I tried and there's a lot of conflicts, so I'll need more time)
  • check exceptions handling
  • manual tests

@dluc dluc added ready for review and removed waiting for author Waiting for author to reply or address comments labels May 1, 2024
commit c0a79a5
Merge: 152e827 0a046ed
Author: Devis Lucato <[email protected]>
Date:   Thu May 2 20:45:03 2024 -0700

    Merge branch 'main' into aisearch-batch

commit 152e827
Author: Marco Minerva <[email protected]>
Date:   Tue Apr 30 09:56:50 2024 +0200

    Refactor UpsertAsync method

commit 4ec58e7
Author: Devis Lucato <[email protected]>
Date:   Mon Apr 29 11:23:47 2024 -0700

    Update service/Core/Handlers/SaveRecordsHandler.cs

commit e8c763d
Author: Marco Minerva <[email protected]>
Date:   Mon Apr 29 10:34:17 2024 +0200

    Fix

commit 1ead2ea
Merge: 506df3f 128209e
Author: Marco Minerva <[email protected]>
Date:   Mon Apr 29 10:15:00 2024 +0200

    Merge commit

commit 506df3f
Author: Marco Minerva <[email protected]>
Date:   Mon Apr 29 10:12:51 2024 +0200

    Fix

commit 87bf9f4
Author: Marco Minerva <[email protected]>
Date:   Mon Apr 29 10:10:56 2024 +0200

    Refactoring and complete implementation

commit e9cb917
Merge: 34e0ad2 a5adf33
Author: Marco Minerva <[email protected]>
Date:   Mon Apr 29 09:33:02 2024 +0200

    Merge branch 'main' into aisearch-batch

commit a5adf33
Merge: d74217a b6a2be5
Author: Marco Minerva <[email protected]>
Date:   Mon Apr 29 07:29:45 2024 +0000

    Merge branch 'microsoft:main' into main

commit 128209e
Author: Marco Minerva <[email protected]>
Date:   Fri Apr 26 11:32:19 2024 +0200

    Make AzureAISearchMemory implementing IBatchMemoryDb

commit 1477969
Author: Marco Minerva <[email protected]>
Date:   Fri Apr 26 11:16:58 2024 +0200

    Add support for batching save operations

commit 180c890
Author: Marco Minerva <[email protected]>
Date:   Wed Apr 24 09:36:59 2024 +0200

    Use document batching

commit d74217a
Author: Marco Minerva <[email protected]>
Date:   Fri Apr 26 11:40:31 2024 +0200

    Remove reduntant method calling

commit 34e0ad2
Author: Marco Minerva <[email protected]>
Date:   Fri Apr 26 11:32:19 2024 +0200

    Make AzureAISearchMemory implementing IBatchMemoryDb

commit 8a5f2db
Author: Marco Minerva <[email protected]>
Date:   Fri Apr 26 11:16:58 2024 +0200

    Add support for batching save operations

commit c96f26e
Merge: e3488e7 0d82c62
Author: Marco Minerva <[email protected]>
Date:   Fri Apr 26 09:44:56 2024 +0200

    Merge branch 'main' into aisearch-batch

commit e3488e7
Author: Marco Minerva <[email protected]>
Date:   Wed Apr 24 09:36:59 2024 +0200

    Use document batching
@dluc dluc force-pushed the aisearch-batch branch from c0a79a5 to d8d8185 Compare May 10, 2024 05:39
* reduce IO in case of transient errors
* remove code duplication/complexity
* fix old bug with EmbeddingGenerationEnabled=false
@dluc dluc merged commit bde65d7 into microsoft:main May 11, 2024
@marcominerva marcominerva deleted the aisearch-batch branch May 13, 2024 07:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants