Use tmtoolkit to fit multiple LDA models in parallel. by SeppeDeWinter · Pull Request #18 · aertslab/TF-MINDI

SeppeDeWinter · 2025-10-08T06:23:08Z

This change makes it easier for a user to explore multiple parameters for LDA modeling.

The function run_topic_modeling now accepts a list of values for:

n_topics
alpha
eta

When a list of values is given for one or more of these parameters, multiple models are fit in parallel so the user can explore which hyperparameters work best. This is done under the hood using tmtoolkit. The function now returns a list of topic models along with quality metrics.
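As a rough illustration of the idea, scalar-or-list arguments can be expanded into one parameter set per combination, and each set fit independently. This is a minimal standard-library sketch, not the code in this PR: as_list, expand_grid, fit_placeholder and fit_all are hypothetical names, the fit step is a stand-in that trains nothing, and threads are used only to keep the example short (this says nothing about how tmtoolkit parallelizes).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def as_list(value):
    """Wrap a scalar in a list; pass lists and tuples through as lists."""
    return list(value) if isinstance(value, (list, tuple)) else [value]


def expand_grid(n_topics, alpha, eta):
    """Return one parameter dict per combination of the given values."""
    keys = ("n_topics", "alpha", "eta")
    values = (as_list(n_topics), as_list(alpha), as_list(eta))
    return [dict(zip(keys, combo)) for combo in product(*values)]


def fit_placeholder(params):
    """Hypothetical stand-in for fitting one LDA model; no model is trained."""
    return {"params": params, "model": None}


def fit_all(grid, max_workers=4):
    """Run the placeholder fit once per parameter set, preserving grid order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fit_placeholder, grid))


grid = expand_grid(n_topics=[10, 20, 40], alpha=0.1, eta=[0.01, 0.1])
results = fit_all(grid)
print(len(results))  # 6: 3 n_topics values x 1 alpha value x 2 eta values
```

Passing a scalar for every parameter yields a grid of length one, so the single-model case falls out of the same code path.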

After a model has been selected, it can be added to the AnnData object using the new function add_topic_modeling_result.
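Selecting a model from the returned list could look roughly like the sketch below. The record layout (dicts with "params" and "metrics" keys), the metric name "score", and the helper select_best are all assumptions made for illustration; the actual return type of run_topic_modeling is not shown in this thread. The numbers are toy values, not measured results.

```python
def select_best(results, metric, higher_is_better=True):
    """Pick the record with the best value for one metric (hypothetical layout)."""
    pick = max if higher_is_better else min
    return pick(results, key=lambda record: record["metrics"][metric])


# Toy records standing in for the list of models and quality metrics.
toy_results = [
    {"params": {"n_topics": 10}, "metrics": {"score": 0.2}},
    {"params": {"n_topics": 20}, "metrics": {"score": 0.5}},
    {"params": {"n_topics": 40}, "metrics": {"score": 0.3}},
]

best = select_best(toy_results, metric="score")
print(best["params"])
```

Whether a single metric is enough to choose a model is an open question in this PR, so a helper like this is best treated as a starting point for manual inspection.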

tmtoolkit provides several functions for evaluating topic models; see https://tmtoolkit.readthedocs.io/en/latest/topic_modeling.html#Evaluation-of-topic-models.

For this reason the loglikelihood function is removed, given that it is already implemented in tmtoolkit.

This change does introduce a new dependency. We could consider making the topic modeling dependencies optional, given that topic modeling is a more advanced use case, for instance installed via:

pip install tfmindi[topic]
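If the dependency were made optional, the topic modeling code would need to fail with a clear message when the extra is not installed. A minimal sketch of such a guard, assuming an extra named topic as suggested above; require_optional is a hypothetical helper, not part of TF-MINDI:

```python
import importlib


def require_optional(module_name, extra):
    """Import an optional dependency, or raise an ImportError naming the extra."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install tfmindi[{extra}]"
        ) from err


# Intended use inside the topic modeling function (not executed here):
#     tmtoolkit = require_optional("tmtoolkit", "topic")
```

Importing lazily inside the function, rather than at module import time, keeps the rest of the package usable without the extra.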

LukasMahieu · 2025-10-08T07:49:14Z

Okay, looks interesting, will have to test this.
We already have an "evaluate_topic_models", which does something similar but unoptimized and only for a range over n_topics. Maybe this functionality fits better there?

SeppeDeWinter · 2025-10-08T08:15:33Z

> Okay, looks interesting, will have to test this. We already have an "evaluate_topic_models", which does something similar but unoptimized and only for a range over n_topics. Maybe this functionality fits better there?

True! That way we don't have to change the API.
I am not sure how easy it is to automatically detect the optimal parameters, though. I also have not fully explored how strongly the results differ between parameters.

Use tmtoolkit to fit multiple LDA models in parallel.

6ae348d

SeppeDeWinter requested a review from LukasMahieu October 8, 2025 06:24

SeppeDeWinter wants to merge 1 commit into main from update-topic_modeling