Commit b00544b

Oops, forgot FlexiViT's readme, sorry! (#25)
1 parent 26c7bc8 commit b00544b

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
# FlexiViT: One Model for All Patch Sizes
*by Lucas Beyer, Pavel Izmailov, Alexander Kolesnikov, Mathilde Caron, Simon Kornblith, Xiaohua Zhai, Matthias Minderer, Michael Tschannen, Ibrahim Alabdulmohsin, Filip Pavetic*

## Introduction
We publish all pre-trained FlexiViT models, and configurations for training
those, as well as training logs for one run.

Please read the main [big_vision README](/README.md) to learn how to run
configs, and remember that each config file contains an example invocation in
the top-level comment.

## Pre-trained paper models

Here are the models that we used as backbones in the paper. See the tables in
the appendix of the paper for expected scores at various patch sizes and on
various datasets.

First, the recommended models we used for all experiments.
Remember that the input is 240px, not 224px:

| Dataset | Model | Download link | Notes |
| :--- | :---: | :---: | :---: |
| ImageNet-1k | FlexiViT-L | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_l_i1k.npz) | 1200ep version |
| ImageNet-1k | FlexiViT-B | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_b_i1k.npz) | 1200ep version |
| ImageNet-1k | FlexiViT-S | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_s_i1k.npz) | 1200ep version |
| ImageNet-21k | FlexiViT-B | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_b_i21k_300ep.npz) | 300ep version. 1000ep version below is better but was not used in the paper for fair comparison to baselines. |
| ImageNet-21k | ViT-B/16 | [link](https://storage.googleapis.com/big_vision/flexivit/vit_b16_i21k_300ep.npz) | Apples-to-apples non-flexi baseline used throughout the paper. |
| ImageNet-21k | ViT-B/30 | [link](https://storage.googleapis.com/big_vision/flexivit/vit_b30_i21k_300ep.npz) | Apples-to-apples non-flexi baseline used throughout the paper. |

These models can be used directly in our codebase by specifying
`model_name = "proj.flexi.vit"` and `model_init = "FlexiViT-L i1k"` for example.
See the file `models/proj/flexi/vit.py` for more names.
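
As a rough illustration of where these two settings would sit, here is a minimal sketch of a config function. The `get_config` wrapper and the use of `types.SimpleNamespace` are assumptions made so that the snippet runs without dependencies; only the two field names and their values come from the text above, and a real config file in the codebase will contain many more fields.

```python
from types import SimpleNamespace


def get_config():
    """Return only the model-related fields needed to start from a released checkpoint."""
    # SimpleNamespace is a dependency-free stand-in (an assumption of this sketch);
    # an actual config file would build its own config object instead.
    config = SimpleNamespace()
    config.model_name = "proj.flexi.vit"  # Model module named in the text above.
    config.model_init = "FlexiViT-L i1k"  # Example checkpoint name from the text above.
    return config


config = get_config()
print(config.model_name, config.model_init)
```

Other checkpoint names can be substituted for `model_init`; the list lives in `models/proj/flexi/vit.py`.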

*Important detail:* When further re-using these models with a flexible patch
size, it is recommended to keep the patch-embedding parameter buffer at its
original size, and change patch-size on the fly using pi-resize, as opposed to
changing the parameter buffer's size at load-time.
For re-using the models with a fixed patch size, either way is fine.
(The reason is that it is impossible to chain multiple resizes without loss,
e.g. doing 32->8->32 does not result in the original weights.)
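
The loss from chained resizes can be seen in a small, self-contained experiment. The sketch below is not pi-resize: it uses plain 1-D linear interpolation as a stand-in, and the 32-value "weights" are made up for illustration. It only shows that a 32->8->32 round trip does not return the starting values, because the 8-sample intermediate cannot hold all the detail of the original.

```python
def resize_linear(values, new_len):
    """Resize a 1-D list to new_len samples by linear interpolation (endpoints aligned)."""
    old_len = len(values)
    if new_len == 1:
        return [values[0]]
    out = []
    for i in range(new_len):
        # Position of output sample i expressed in input coordinates.
        pos = i * (old_len - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, old_len - 1)
        frac = pos - lo
        out.append((1 - frac) * values[lo] + frac * values[hi])
    return out


# Made-up 32-value row standing in for one row of patch-embedding weights.
original = [(-1.0) ** k * (k % 5) for k in range(32)]

# Chain two resizes: down to 8 samples, then back up to 32.
round_trip = resize_linear(resize_linear(original, 8), 32)

max_error = max(abs(a - b) for a, b in zip(original, round_trip))
print(f"max abs error after 32->8->32: {max_error:.3f}")
```

Keeping the buffer at its original size and resizing from it each time avoids accumulating this kind of error.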

Second, the list of all released models for completeness:

| Dataset | Model | Download link | Notes |
| :--- | :---: | :---: | :---: |
| ImageNet-21k | FlexiViT-B | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_b_i21k_1000ep.npz) | 1000ep version. Should be the best available -B model. |
| ImageNet-21k | FlexiViT-B | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_b_i21k_90ep.npz) | 90ep version |
| ImageNet-1k | FlexiViT-L | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_l_i1k_600ep.npz) | 600ep version |
| ImageNet-1k | FlexiViT-L | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_l_i1k_300ep.npz) | 300ep version |
| ImageNet-1k | FlexiViT-L | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_l_i1k_90ep.npz) | 90ep version |
| ImageNet-1k | FlexiViT-B | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_b_i1k_600ep.npz) | 600ep version |
| ImageNet-1k | FlexiViT-B | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_b_i1k_300ep.npz) | 300ep version |
| ImageNet-1k | FlexiViT-B | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_b_i1k_90ep.npz) | 90ep version |
| ImageNet-1k | FlexiViT-S | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_s_i1k_600ep.npz) | 600ep version |
| ImageNet-1k | FlexiViT-S | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_s_i1k_300ep.npz) | 300ep version |
| ImageNet-1k | FlexiViT-S | [link](https://storage.googleapis.com/big_vision/flexivit/flexivit_s_i1k_90ep.npz) | 90ep version |

## Results

We provide full training logs for a run with this public code on Cloud that
reproduces the FlexiViT-S 90ep on i1k results:
- [metrics](https://storage.googleapis.com/big_vision/flexivit/deit3_i1k_s_90ep_12-15_2254/big_vision_metrics.txt)
- [config](https://storage.googleapis.com/big_vision/flexivit/deit3_i1k_s_90ep_12-15_2254/config.json)
- or `gs://big_vision/flexivit/deit3_i1k_s_90ep_12-15_2254`.
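
For inspecting a downloaded metrics file, something like the following sketch may help. It assumes the file holds one JSON object per line, each with a `step` key; that layout is an assumption here, not something stated above, so check it against the actual file first. The helper name `read_metric`, the metric names, and the sample records are all made up for illustration.

```python
import json


def read_metric(lines, name):
    """Collect (step, value) pairs for `name` from JSON-lines metric records."""
    points = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # Skip blank lines.
        record = json.loads(line)
        if name in record and "step" in record:
            points.append((record["step"], record[name]))
    return points


# Made-up records for illustration only; not taken from the released log.
sample = [
    '{"step": 100, "train_loss": 6.5}',
    '{"step": 200, "train_loss": 6.1, "val_acc": 0.02}',
    '',
    '{"step": 300, "val_acc": 0.05}',
]
print(read_metric(sample, "val_acc"))
```

If the assumed layout holds, the same function can be given an open file handle for the downloaded `big_vision_metrics.txt` instead of the sample list.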
