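MobileNetV2 is a lightweight convolutional neural network built from depthwise separable convolutions and inverted residual blocks, which keeps its parameter count and compute cost low enough for mobile and edge devices. It serves as the feature extractor in our architecture.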
The encoder-decoder model consists of two components:
- Encoder: Takes the feature vectors produced by MobileNetV2 and transforms them into a representation the decoder can consume.
- Decoder: Generates a human-readable description from the encoded features.
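
The data flow through the two components can be sketched in plain Python. This is only a toy illustration, not the project's implementation: the sizes, the vocabulary, the random untrained weights, the simple recurrent update, and the function names `encode` and `decode` are all assumptions made for this example, and a random list of vectors stands in for the real MobileNetV2 output. Because the weights are untrained, the generated tokens are meaningless; the point is the shape of the pipeline.

```python
import math
import random

# Toy sizes and vocabulary, all assumed for illustration only.
NUM_REGIONS = 4      # spatial positions in the (stand-in) feature map
FEATURE_DIM = 8      # values per position
HIDDEN_DIM = 6       # width shared by encoder output and decoder state
VOCAB = ["<start>", "<end>", "a", "dog", "on", "grass"]
MAX_LEN = 5          # maximum number of decoding steps

rng = random.Random(0)


def random_matrix(rows, cols):
    """Return a rows x cols matrix of small random (untrained) weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


W_enc = random_matrix(HIDDEN_DIM, FEATURE_DIM)   # encoder projection
W_state = random_matrix(HIDDEN_DIM, HIDDEN_DIM)  # decoder recurrence
W_embed = random_matrix(HIDDEN_DIM, len(VOCAB))  # one embedding column per token
W_out = random_matrix(len(VOCAB), HIDDEN_DIM)    # decoder state -> token scores


def encode(features):
    """Project each region's feature vector to HIDDEN_DIM and apply a ReLU."""
    return [[max(0.0, v) for v in matvec(W_enc, region)] for region in features]


def decode(encoded):
    """Greedy decoding: start from the mean encoded vector, pick one token per step."""
    state = [sum(column) / len(encoded) for column in zip(*encoded)]
    token = VOCAB.index("<start>")
    caption = []
    for _ in range(MAX_LEN):
        embedding = [row[token] for row in W_embed]
        recurrent = matvec(W_state, state)
        state = [math.tanh(r + e) for r, e in zip(recurrent, embedding)]
        scores = matvec(W_out, state)
        token = max(range(len(VOCAB)), key=scores.__getitem__)
        if VOCAB[token] == "<end>":
            break
        caption.append(VOCAB[token])
    return caption


# Stand-in for the MobileNetV2 output: NUM_REGIONS vectors of FEATURE_DIM values.
features = [[rng.random() for _ in range(FEATURE_DIM)] for _ in range(NUM_REGIONS)]
encoded = encode(features)
caption = decode(encoded)
print(len(encoded), len(encoded[0]), caption)
```

In a real model the projection, recurrence, and output weights would be learned, and the decoder would typically be an LSTM, GRU, or Transformer rather than this hand-rolled recurrence.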