Jiahua Dong*, Hui Yin*, Wenqi Liang, Hanbin Zhao, Henghui Ding, Nicu Sebe, Salman Khan, Fahad Shahbaz Khan (*equal contribution)
[arXiv]
Video instance segmentation (VIS) has gained significant attention for its capability in segmenting and tracking object instances across video frames. However, most existing VIS methods unrealistically assume that the categories of object instances remain fixed over time. Moreover, they experience catastrophic forgetting of old classes when required to continuously learn object instances belonging to new classes. To address the above challenges, we develop a novel Hierarchical Visual Prompt Learning (HVPL) model, which alleviates catastrophic forgetting of old classes from both frame-level and video-level perspectives. Specifically, to mitigate forgetting at the frame level, we devise a task-specific frame prompt and an orthogonal gradient correction (OGC) module. The OGC module helps the frame prompt encode task-specific global instance information for new classes in each individual frame by projecting its gradients onto the orthogonal feature space of old classes. Furthermore, to address forgetting at the video level, we design a task-specific video prompt and a video context decoder. This decoder first embeds structural inter-class relationships across frames into the frame prompt features, and then propagates task-specific global video contexts from the frame prompt features to the video prompt. The effectiveness of our HVPL model is demonstrated through extensive experiments, in which it outperforms baseline methods.
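For intuition, the frame-level OGC step can be sketched as standard gradient projection: remove the component of the prompt gradient that lies in the subspace spanned by old-class features, so updates for new classes interfere less with old ones. The snippet below is a minimal PyTorch sketch of that general technique, not the released HVPL code; the function names, the SVD-based basis construction, and the rank `k` are all assumptions.

```python
import torch

def build_old_basis(old_feats: torch.Tensor, k: int) -> torch.Tensor:
    """Orthonormal basis (d, k) for the old-class feature subspace,
    taken here as the top-k right singular vectors of stored features.

    old_feats: (m, d) matrix of features collected from old classes.
    """
    _, _, vh = torch.linalg.svd(old_feats, full_matrices=False)
    return vh[:k].T  # columns span the old-class feature subspace

def ogc_project(grad: torch.Tensor, old_basis: torch.Tensor) -> torch.Tensor:
    """Project a gradient onto the orthogonal complement of the
    old-class subspace.

    grad: (..., d) gradient of the frame prompt parameters.
    """
    inside = grad @ old_basis @ old_basis.T  # component inside the old-class subspace
    return grad - inside                     # keep only the orthogonal component

# Hypothetical use inside a training step, after loss.backward():
# frame_prompt.grad = ogc_project(frame_prompt.grad, old_basis)
```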
- Aug 21, 2025: 🎉🎉🎉 Code and pretrained weights are now available! Thanks for your patience :)
- Jun 26, 2025: 🎉🎉🎉 HVPL is accepted to ICCV 2025!
See installation instructions.
We provide a script, train_net_hvpl.py, which trains all the configs provided in HVPL.
To train a model with "train_net_hvpl.py" on VIS, first set up the corresponding datasets following Preparing Datasets for HVPL.
- Step t=0: Train the model on the base classes (you can skip this step if you use pre-trained weights).
- Step t≥1: Train the model on the novel classes with HVPL, using the scripts below.
| Scenario | Script |
|---|---|
| YouTubeVIS 2019 20-2 | `bash youtube_2019_20_2.sh` |
| YouTubeVIS 2019 20-5 | `bash youtube_2019_20_5.sh` |
| YouTubeVIS 2021 20-4 | `bash yvis_2021_20_4.sh` |
| YouTubeVIS 2021 30-10 | `bash yvis_2021_30_10.sh` |
| OVIS 15-5 | `bash OVIS_15_5.sh` |
| OVIS 15-10 | `bash OVIS_15_10.sh` |
Note that evaluation is performed during training, generating a corresponding result.json file as well as a .txt file that stores the evaluation metrics (AP, AP50, AP75, AR1).
To compute the forgetting (F) metrics (FAP, FAR1), please run dataset_eval.py.
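The authoritative computation is the one in dataset_eval.py. For reference, forgetting metrics in continual learning are commonly defined as the drop from a class's best earlier score to its final score; the minimal sketch below assumes that common definition, and its input format is hypothetical.

```python
def forgetting(score_history: dict[str, list[float]]) -> float:
    """Average forgetting over old classes (common definition; see
    dataset_eval.py for the authoritative FAP/FAR1 computation).

    score_history: class name -> per-step scores (e.g., AP or AR1),
                   one entry per step after the class was learned.
    """
    drops = []
    for scores in score_history.values():
        if len(scores) < 2:
            continue  # learned only at the final step: nothing to forget yet
        drops.append(max(scores[:-1]) - scores[-1])  # best earlier - final
    return sum(drops) / len(drops) if drops else 0.0

# e.g. forgetting({"person": [52.1, 48.3], "dog": [40.0, 37.5]}) -> 3.15
```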
If you use HVPL in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.
@InProceedings{Dong2025HVPL,
title={Hierarchical Visual Prompt Learning for Continual Video Instance Segmentation},
author={Jiahua Dong and Hui Yin and Wenqi Liang and Hanbin Zhao and Henghui Ding and Nicu Sebe and Salman Khan and Fahad Shahbaz Khan},
year={2025},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month={October},
}

Our code is largely based on VITA, ECLIPSE, Detectron2, Mask2Former, and Deformable DETR. We are truly grateful for their excellent work.


