Releases: illuin-tech/colpali
v0.3.13: ModernVBert
[0.3.13] - 2025-11-15
Added
- Add ModernVBERT to the list of supported models
Fixed
- Fix training with multiple hard negatives
- Fix multi-dataset sampling so that each dataset's probability of being picked is weighted by its size
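The size-weighted sampling fix above can be sketched as follows. This is an illustrative sketch, not colpali's actual sampler API; the function names are hypothetical.

```python
import random

def dataset_weights(dataset_sizes):
    """Probability of each dataset being picked, proportional to its size."""
    total = sum(dataset_sizes)
    return [size / total for size in dataset_sizes]

def pick_dataset(dataset_sizes, rng=random):
    """Sample one dataset index; larger datasets are picked more often."""
    weights = dataset_weights(dataset_sizes)
    return rng.choices(range(len(dataset_sizes)), weights=weights, k=1)[0]
```

With sizes `[100, 300]`, the second dataset is sampled three times as often as the first.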
Changed
- Bump supported versions of transformers, torch, and peft
v0.3.12
[0.3.12] - 2025-07-16
Added
- Video processing for ColQwen-Omni
Fixed
- Fixed loading of PaliGemma and ColPali checkpoints (bug introduced in transformers 4.52)
- Fixed loading of SmolVLM (Idefics3) processors that didn't transmit image_seq_len (bug introduced in transformers 4.52)
v0.3.11
[0.3.11] - 2025-07-04
Added
- Added BiIdefics3 modeling and processor.
- [Breaking] (minor) Remove support for context-augmented queries and images
- Made processor docstrings uniform
- Update the collator to align with the new function signatures
- Add a `process_text` method to replace the `process_query` one. We keep support for the latter for the moment, but we'll deprecate it later.
- Introduce the `ColPaliEngineDataset` and `Corpus` classes, which delegate all data loading to a standard format before training. The concept is for users to override the dataset class if needed for their specific use cases.
- Added smooth_max option to loss functions
- Added weighted in_batch terms for losses with hard negatives
- Added an option to filter out (presumably) false negatives during online training
- Added a training script in pure torch without the HF trainer
- Added a sampler to train with multiple datasets at once, with each batch coming from the same source. (experimental, might still need testing on multi-GPU)
- Added score normalization to late-interaction (LI) models (dividing by token length) for better performance with the CE loss
- Add experimental PLAID support
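The score normalization mentioned above can be sketched as follows. This is a minimal illustration of late-interaction (MaxSim) scoring, not colpali's actual implementation; dividing by the query token count is one reading of "dividing by token length", and the function names are illustrative.

```python
import torch

def li_score(query_emb: torch.Tensor, doc_emb: torch.Tensor) -> torch.Tensor:
    """Late-interaction (MaxSim) score: for each query token, take the best
    matching document token similarity, then sum over query tokens."""
    sim = query_emb @ doc_emb.T          # (n_query_tokens, n_doc_tokens)
    return sim.max(dim=1).values.sum()

def normalized_li_score(query_emb: torch.Tensor, doc_emb: torch.Tensor) -> torch.Tensor:
    """Divide by the query token count so scores are comparable across
    queries of different lengths."""
    return li_score(query_emb, doc_emb) / query_emb.shape[0]
```

Without the normalization, longer queries systematically accumulate larger scores, which skews a cross-entropy loss computed over a batch of mixed-length queries.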
Changed
- Stops pooling queries between GPUs and instead pools only documents, enabling training with much larger batch sizes. We now recommend training with `accelerate launch`.
- Updated loss functions for better abstraction and coherence between the various loss functions. Small speedups and lower memory requirements.
v0.3.10: minor updates & dependency bumps
[0.3.10] - 2025-04-18
Added
- Add `LambdaTokenPooler` to allow for custom token pooling functions.
- Added training losses with negatives to InfoNCE-type losses
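A custom pooling function of the kind `LambdaTokenPooler` is meant to accept could look like the mean-pooling sketch below. The exact callable signature the pooler expects is an assumption here; this only illustrates what a user-supplied pooling function does.

```python
import torch

def mean_pool_tokens(embeddings: torch.Tensor) -> torch.Tensor:
    """Collapse a (n_tokens, dim) multi-vector embedding into a single
    vector, trading retrieval precision for storage and speed."""
    return embeddings.mean(dim=0, keepdim=True)
```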
Changed
- Fix similarity map helpers for ColQwen2 and ColQwen2.5.
- [Breaking] (minor) Remove support for Idefics2-based models.
- Disable multithreading in `HierarchicalTokenPooler` if `num_workers` is not provided or is 1.
- [Breaking] (minor) Make `pool_factor` an argument of `pool_embeddings` instead of a `HierarchicalTokenPooler` class attribute
- Bump dependencies for transformers, torch, peft, pillow, accelerate, etc.
v0.3.9
Added
- Allow user to pass custom textual context for passage inference
- Add ColQwen2.5 support and BiQwen2.5 support
- Add support for token pooling with `HierarchicalTokenPooler`.
- Allow user to specify the maximum number of image tokens in the resized images in `ColQwen2Processor` and `ColQwen2_5_Processor`.
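The idea behind token pooling is to compress a multi-vector embedding by a `pool_factor` before indexing. The sketch below uses naive consecutive-window averaging to show the compression; the real `HierarchicalTokenPooler` clusters tokens hierarchically by similarity, which this does not reproduce.

```python
import torch

def pool_tokens(embeddings: torch.Tensor, pool_factor: int) -> torch.Tensor:
    """Reduce a (n_tokens, dim) embedding to roughly n_tokens // pool_factor
    vectors by mean-pooling consecutive groups of tokens (illustrative only)."""
    n_tokens = embeddings.shape[0]
    n_out = max(1, n_tokens // pool_factor)
    chunks = torch.chunk(embeddings, n_out, dim=0)
    return torch.stack([chunk.mean(dim=0) for chunk in chunks])
```

With `pool_factor=2`, an index holding 1030-token page embeddings shrinks to roughly half the vectors per page.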
Changed
- Warn about evaluation being different from Vidore, and do not store results to prevent confusion.
- Remove duplicate resize code in `ColQwen2Processor` and `ColQwen2_5_Processor`.
- Simplify sequence padding for pixel values in `ColQwen2Processor` and `ColQwen2_5_Processor`.
- Remove deprecated evaluation (`CustomRetrievalEvaluator`) from trainer
- Refactor the collator classes
- Make `processor` input compulsory in `ColModelTrainingConfig`
- Make `BaseVisualRetrieverProcessor` inherit from `ProcessorMixin`
- Remove unused `tokenizer` field from `ColModelTrainingConfig`
- Bump transformers to `4.50.0` and torch to `2.6.0` to keep up with the latest versions. Note that this leads to errors on MPS until transformers 4.50.4 is released.
v0.3.8
v0.3.7
v0.3.6
Description
Loosen default dependencies, but keep stricter dep ranges for the train dependency group.
Features
Added
- Add expected scores in ColPali E2E test
Changed
- Loosen package dependencies
Full Changelog: v0.3.5...v0.3.6
v0.3.5: SmolVLM
[0.3.5] - 2024-12-13
Added
- Added support for Idefics3 (and SmolVLM)
Fixed
- Fix typing for `processor.score_multi_vector` (allow for both list and tensor inputs). This does not change how the scores are computed.
- Fix `tear_down_torch` when used on a non-MPS machine
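Accepting both list and tensor inputs typically comes down to normalizing the input up front. The helper below is a hedged sketch of that pattern, not the library's actual code; the function name is hypothetical.

```python
import torch

def as_list_of_tensors(embeddings):
    """Normalize multi-vector input: accept either a stacked 3-D tensor
    (batch, n_tokens, dim) or a list of per-item (n_tokens, dim) tensors,
    and return a list of 2-D tensors either way."""
    if isinstance(embeddings, torch.Tensor):
        return list(embeddings.unbind(dim=0))
    return [torch.as_tensor(e) for e in embeddings]
```

The list form matters because document embeddings can have different token counts per item, which a single stacked tensor cannot represent without padding.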
v0.3.4
[0.3.4] - 2024-11-07
Added
- General `CorpusQueryCollator` for BEIR-style dataset training or hard-negative training. This deprecates `HardNegCollator`, but all changes to the training loop are made for a seamless update.
Changed
- Updates BiPali config files
- Removed query augmentation tokens from BiQwen2Processor
- Modified XQwen2Processor to place the `<|endoftext|>` token at the end of the document prompt (non-breaking for ColQwen but helps BiQwen).
- Removed `add_suffix` in the VisualRetrieverCollator and let the `suffix` be added in the individual processors.
- Changed the incorrect `<pad>` token to `<|endoftext|>` for query augmentation in `ColQwen2Processor`. Note that previous models were trained with `<|endoftext|>`, so this is simply a non-breaking inference upgrade patch.