VowpalWabbitThreadedLearning is almost impossible to use #4911

@michael-celani

Describe the bug

I'd like to use threaded learning to speed up my model training, but it seems clunky and unintuitive. I can only pass string examples (no ExampleBuilder examples), and no matter what I do, I can't seem to get the dang thing to save.

var settings = new VowpalWabbitSettings(args)
{
    EnableStringExampleGeneration = true,
    ParallelOptions = new ParallelOptions
    {
        MaxDegreeOfParallelism = 8
    },
    ExampleCountPerRun = 2000,
    ExampleDistribution = VowpalWabbitExampleDistribution.RoundRobin
};

using var model = new VowpalWabbitThreadedLearning(settings);

var examples = batch.Select(ToExample);
foreach (var example in examples)
{
    if (!string.IsNullOrWhiteSpace(example))
    {
        model.Learn(example);
    }
}

Logger.LogInformation("Saving model after pass.");

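// The save task is started here but not awaited before Complete().
var saveModel = model.SaveModel(Options.Value.VowpalWabbitClassifierPath);
await model.Complete();

// Here the save is awaited before Complete().
await model.SaveModel(Options.Value.VowpalWabbitClassifierPath);
await model.Complete();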

I feel like this should work, but I have to buy an engagement ring for the thing. The example count per run has to match up exactly, and I have no idea how to force it to synchronize. It's completely unintuitive, and this doesn't seem far off from the examples in the documentation. What am I doing wrong here?

How to reproduce

Train with the C# bindings using these options:
--oaa 30 --probabilities --loss_function logistic --progress 1000 --holdout_off

Version

9.11.2

OS

Linux

Language

C#

Additional context

No response

Labels: Bug (Bug in learning semantics, critical by default)
