Memory leak still present in VW 9.11 C# bindings #4900

@michael-celani

Description

Describe the bug

It looks like VW still leaks memory in its C# bindings on the latest release.

My batch training method looks like this:

private async Task TrainBatchAsync(
        SlimDeckModel[] batch, 
        Dictionary<ColorIdentity, Dictionary<Guid, ColorIdentityCardCounts>> popularities,
        CancellationToken stoppingToken)
    {
        // Force a full, compacting collection at the start of each batch so
        // managed-heap growth can be ruled out when reading the memory graph.
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Aggressive);
        GC.WaitForPendingFinalizers();

        Stopwatch stopwatch = new();

        // Always write the model back to the same path with -f; load it with
        // -i only if a previous save exists.
        var exists = File.Exists(Options.Value.VowpalWabbitPath);
        var args = Options.Value.VowpalWabbitArgs + $" -f {Options.Value.VowpalWabbitPath}";
        if (exists)
        {
            Logger.LogInformation("Existing model found, loading.");
            args += $" -i {Options.Value.VowpalWabbitPath}";
        }
        else
        {
            Logger.LogInformation("No existing model found, starting fresh.");
        }

        // A fresh native VW instance per batch, disposed when this method returns.
        using var model = new VowpalWabbit(args);

        foreach (var deck in batch)
        {
            var ciCounts = popularities[deck.ColorIdentity];

            if (deck.LastUpdated > UpdatedTime) UpdatedTime = deck.LastUpdated;

            stopwatch.Restart();
            var context = new RecommenderContext(model, deck, ciCounts);
            context.Learn(model, ciCounts);
            stopwatch.Stop();

            stoppingToken.ThrowIfCancellationRequested();

            TotalMs += stopwatch.ElapsedMilliseconds;
            Count++;

            if (Count % 1000 == 0 || Count == Decks.Count)
                LogProgressUpdate(Count, AvgMs, TotalDecks,
                    TimeSpan.FromMilliseconds(AvgMs * (TotalDecks - Count)));
        }

        Logger.LogInformation("Saving model after pass.");
        model.EndOfPass();
    }

For additional context, this is the relevant code of RecommenderContext:

using Celani.Magic.Model;
using Celani.Magic.Tools.Common.Extensions;
using MathNet.Numerics;
using MathNet.Numerics.Interpolation;
using System.Collections.Frozen;
using VW;
using VW.Labels;

namespace Celani.Magic.Tools.Learning;

public class RecommenderContext
{
    public void Learn(VowpalWabbit vw, Dictionary<Guid, ColorIdentityCardCounts> colorIdentityCounts)
    {
        HashSet<Guid> usedCardIds = [];

        // Learn from the deck:
        foreach (var (id, hash, value, popularity) in CardHashes)
        {
            usedCardIds.Add(id);

            var weight = Math.Pow(1.0 - popularity, 0.5);

            var label = new SimpleLabel
            {
                Label = 1,
                Weight = (float) Math.Clamp(weight, 0.05, 1.0)
            };

            using var example = BuildExample(vw, id, label);
            vw.Learn(example);
        }

        // Take popular cards from the same color identity that haven't been used yet:
        var popularCi = colorIdentityCounts
            .Take(1000)
            .Where(cc => !usedCardIds.Contains(cc.Key))
            .Shuffle(cc => Math.Pow(cc.Value.Count, 0.75))
            .Select(cc => cc.Key)
            .Take(CardHashes.Length)
            .ToList();

        foreach (var cardId in popularCi)
        {
            usedCardIds.Add(cardId);

            var label = new SimpleLabel
            {
                Label = -1.0f,
                Weight = 1.0f
            };

            using var example = BuildExample(vw, cardId, label);
            vw.Learn(example);
        }

        var randomCi = colorIdentityCounts
            .Where(cc => !usedCardIds.Contains(cc.Key))
            .Shuffle(cc => 1.0)
            .Select(cc => cc.Key)
            .Take(CardHashes.Length / 4)
            .ToList();

        foreach (var cardId in randomCi)
        {
            usedCardIds.Add(cardId);

            var label = new SimpleLabel
            {
                Label = -1.0f,
                Weight = 1.0f
            };

            using var example = BuildExample(vw, cardId, label);
            vw.Learn(example);
        }
    }

    private VowpalWabbitExample BuildExample(VowpalWabbit vw, Guid candidate, ILabel? label = null)
    {
        using var exampleBuilder = new VowpalWabbitExampleBuilder(vw);

        using (var ns = exampleBuilder.AddNamespace(VWHashes.AllCardsNamespace))
        {
            // Deck cards:
            foreach (var (id, hash, value, _) in CardHashes)
            {
                if (id == candidate) continue;
                ns.AddFeature(hash, value);
            }

            var commanderWeight = CardHashes.Length > 90 ? 0.1f : 
                (float) Interpolation.Interpolate(CardHashes.Length);

            // Commander cards:
            foreach (var (id, hash) in CommanderHashes)
            {
                if (id == candidate) continue;
                ns.AddFeature(hash, commanderWeight);
            }
        }

        // Card to predict:
        using (var ns = exampleBuilder.AddNamespace(VWHashes.CardNamespace))
        {
            ns.AddFeature(vw.HashFeature($"c_{candidate}", VWHashes.CardNamespaceHash), 1);
        }

        if (label is not null)
        {
            exampleBuilder.ApplyLabel(label);
        }

        return exampleBuilder.CreateExample();
    }
}
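
Shuffle above is a custom weighted-shuffle extension from Celani.Magic.Tools.Common.Extensions. Its exact implementation shouldn't matter for this issue, but for completeness, a minimal sketch of the behavior it provides (weighted ordering via Efraimidis–Spirakis keys; not the actual implementation) would be:

using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    // Sketch only: items with larger weights tend to sort earlier, so a
    // subsequent Take() acts as a weighted sample without replacement.
    public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source, Func<T, double> weight)
        => source.OrderByDescending(item => Math.Pow(Random.Shared.NextDouble(), 1.0 / weight(item)));
}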

Here is the memory usage pattern of this code. Each write is followed by a garbage collection, and VW is disposed and reopened:

[Memory usage graph]

VW is disposed every batch in an effort to force the release of the memory taken up by examples. I've confirmed that only one example is actually in the pool, since it is reused and this training is not parallel, so I don't know what the actual source of the problem is.
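
To make the lifecycle explicit, the outer loop amounts to the following (simplified sketch; GetBatches stands in for the real batching code, which is elided):

// Hypothetical driver loop; popularities and stoppingToken are the same
// objects passed to TrainBatchAsync above.
foreach (var batch in GetBatches())
{
    // Each call forces a GC, opens a fresh VowpalWabbit instance, learns
    // the batch, writes the model via EndOfPass(), and disposes the
    // instance when its using scope ends.
    await TrainBatchAsync(batch, popularities, stoppingToken);
}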

At the last memory usage spike in the graph, I restarted the program entirely. It still loaded the model I had been building, but its memory usage reset to the baseline of the original pass.

If it's true that the amount of memory used by a VW model is bounded, I would expect its memory usage to reset to a baseline over time, especially since I'm forcing an aggressive garbage collection every batch. That doesn't seem to be the case: usage scales up roughly linearly over time. Notably, there's a large jump in memory usage the first time the file is written and then reopened.

The model does get bigger with each batch, but not by enough to account for this much memory usage.

How to reproduce

Train a model with the arguments below using the C# bindings, save across passes with EndOfPass(), then dispose and reopen the same model:

--link logistic --loss_function logistic --interactions ac -b 28 --progress 1000 --holdout_off -i file.vw -f file.vw
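
For concreteness, here is a minimal sketch of that cycle using the same binding calls as above (the 'a' namespace, synthetic feature names, loop counts, and the HashSpace call are placeholders, not code I have run as-is):

using System;
using System.IO;
using VW;
using VW.Labels;

const string path = "file.vw";
const string baseArgs = "--link logistic --loss_function logistic --interactions ac -b 28 --progress 1000 --holdout_off";

for (var pass = 0; pass < 100; pass++)
{
    // Load the previous save (if any) and write back to the same file.
    var args = $"{baseArgs} -f {path}" + (File.Exists(path) ? $" -i {path}" : "");
    using var vw = new VowpalWabbit(args);

    for (var i = 0; i < 10_000; i++)
    {
        using var builder = new VowpalWabbitExampleBuilder(vw);
        using (var ns = builder.AddNamespace('a'))
        {
            ns.AddFeature(vw.HashFeature($"f_{i % 500}", vw.HashSpace("a")), 1);
        }
        builder.ApplyLabel(new SimpleLabel { Label = 1, Weight = 1f });

        using var example = builder.CreateExample();
        vw.Learn(example);
    }

    vw.EndOfPass(); // writes the model to file.vw
    // Resident memory keeps growing across iterations even though vw is
    // disposed and reopened each time.
}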

Version

9.11

OS

Linux

Language

C#

Additional context

No response
