[GSoC 2025] Apply quantization to reduce model footprint #17

@singjc

Background

Our diffusion model for DIA-MS data deconvolution currently has a large memory footprint, which limits our ability to process larger datasets and deploy the model efficiently. Quantization techniques, which have been successfully applied to vision models, offer a promising approach to reduce model size while maintaining performance.

Task Objectives

  • Implement post-training quantization for our existing diffusion model
  • Evaluate multiple quantization strategies (e.g., INT8, FP16) for their impact on:
    • Memory usage
    • Inference speed
    • Model accuracy/performance
  • Integrate the most effective quantization strategy into our training and inference pipelines
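
As a rough illustration of the trade-off these objectives ask us to measure, the sketch below compares INT8 and FP16 on a toy weight list using only the Python standard library. It is not the project's implementation: the function names and the toy weights are hypothetical, symmetric per-tensor scaling is just one assumed INT8 scheme, and a real evaluation would run on the PyTorch model's tensors.

```python
import struct

def quantize_int8(values):
    """Symmetric per-tensor INT8: map [-max_abs, max_abs] onto [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q, scale):
    """Map INT8 codes back to approximate real values."""
    return [qi * scale for qi in q]

def roundtrip_fp16(values):
    """Round each value to IEEE 754 half precision and back.

    struct raises OverflowError for magnitudes beyond the FP16 range.
    """
    return [struct.unpack("e", struct.pack("e", v))[0] for v in values]

def max_abs_error(reference, approx):
    """Largest element-wise deviation from the reference values."""
    return max(abs(r - a) for r, a in zip(reference, approx))

# Hypothetical toy weights; a real run would use the model's parameters.
weights = [0.5, -1.0, 0.25, 0.003, -0.72, 1.27]

q, scale = quantize_int8(weights)
int8_err = max_abs_error(weights, dequantize_int8(q, scale))
fp16_err = max_abs_error(weights, roundtrip_fp16(weights))

# Bytes per value relative to FP32 storage (4 bytes).
report = {
    "fp32": {"bytes_per_value": 4, "max_abs_error": 0.0},
    "fp16": {"bytes_per_value": 2, "max_abs_error": fp16_err},
    "int8": {"bytes_per_value": 1, "max_abs_error": int8_err},
}

for name, row in report.items():
    print(name, row)
```

With a single scale per tensor, small weights sitting next to large ones lose the most precision under INT8, which is one reason per-channel scales are often worth evaluating as well.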

Technical Details

  • The model is implemented in PyTorch with a U-Net-based architecture
  • We need to balance memory efficiency with the precision required for accurate MS signal deconvolution
  • Begin with post-training static quantization of pre-trained models, then explore quantization-aware training if results are promising
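
Static quantization fixes the scale and zero point ahead of time from calibration data rather than computing them per input. The sketch below shows that calibration step in plain Python, assuming an unsigned 8-bit affine scheme with a simple min/max range tracker. `RangeObserver`, the helper functions, and the calibration batches are all hypothetical names and data for illustration; they are not the model's code or PyTorch's API.

```python
class RangeObserver:
    """Tracks the min/max seen during calibration (hypothetical helper)."""

    def __init__(self):
        # Start at zero so the range always contains 0.0, which keeps
        # zero exactly representable after quantization.
        self.lo = 0.0
        self.hi = 0.0

    def observe(self, batch):
        self.lo = min(self.lo, min(batch))
        self.hi = max(self.hi, max(batch))

    def qparams(self, qmin=0, qmax=255):
        """Derive a fixed scale and zero point from the observed range."""
        span = self.hi - self.lo
        scale = span / (qmax - qmin) if span > 0 else 1.0
        zero_point = round(qmin - self.lo / scale)
        zero_point = max(qmin, min(qmax, zero_point))
        return scale, zero_point


def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Affine quantization with clamping to the integer range."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(q, scale, zero_point):
    """Map integer codes back to approximate real values."""
    return [(qi - zero_point) * scale for qi in q]


# Hypothetical calibration batches standing in for representative inputs.
calibration_batches = [[-1.0, 0.0, 0.5], [0.25, 3.0, -0.5]]

observer = RangeObserver()
for batch in calibration_batches:
    observer.observe(batch)

scale, zero_point = observer.qparams()

# Inputs outside the calibrated range [-1.0, 3.0] saturate at the ends.
new_input = [-2.0, 0.0, 1.0, 5.0]
codes = quantize(new_input, scale, zero_point)
restored = dequantize(codes, scale, zero_point)
print(codes, restored)
```

Values outside the calibrated range are clamped, so the calibration set should cover the dynamic range expected at inference time. For signals with a wide intensity range, that coverage is worth checking explicitly before judging accuracy.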

Deliverables

  • Implementation of quantization methods in the model code
  • Comprehensive benchmarks comparing the original and quantized models, including:
    • Memory usage measurements
    • Inference time comparisons
    • Quality metrics on test datasets
  • Documentation of the quantization process and best practices
  • Pull request with the optimized model implementation
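
One possible shape for the benchmark comparison is sketched below. It is model-agnostic and uses only the standard library: `reference_model` and `quantized_model` are hypothetical stand-ins for the real networks, the byte counts are placeholder values, and mean absolute error is only an assumed quality metric. For the actual PyTorch model, the callables would wrap the forward pass and the sizes would come from the serialized weights.

```python
import statistics
import time


def time_inference(fn, inputs, warmup=2, repeats=5):
    """Median wall-clock time for one pass over all inputs."""
    for _ in range(warmup):
        for x in inputs:
            fn(x)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for x in inputs:
            fn(x)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def mean_abs_error(reference_fn, candidate_fn, inputs):
    """Average element-wise deviation of the candidate from the reference."""
    errors = [
        abs(r - c)
        for x in inputs
        for r, c in zip(reference_fn(x), candidate_fn(x))
    ]
    return sum(errors) / len(errors)


def compare(reference_fn, candidate_fn, inputs, ref_bytes, cand_bytes):
    """Collect size, speed, and quality numbers into one record."""
    return {
        "size_ratio": cand_bytes / ref_bytes,
        "reference_seconds": time_inference(reference_fn, inputs),
        "candidate_seconds": time_inference(candidate_fn, inputs),
        "mean_abs_error": mean_abs_error(reference_fn, candidate_fn, inputs),
    }


# Hypothetical stand-ins: the "quantized" model snaps outputs to a 1/16 grid.
def reference_model(x):
    return [2.0 * v for v in x]


def quantized_model(x):
    return [round(2.0 * v * 16) / 16 for v in x]


inputs = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

# Placeholder sizes in bytes, chosen only to illustrate the ratio.
result = compare(reference_model, quantized_model, inputs,
                 ref_bytes=4_000_000, cand_bytes=1_000_000)
print(result)
```

Taking the median over several repeats after a warm-up pass reduces the influence of one-off delays such as caching or lazy initialization on the timing numbers.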

Resources

Difficulty

Intermediate. Requires an understanding of both the model architecture and quantization techniques.

Labels

GSoC 2025, enhancement, help wanted