Mask2Former: Semantic Segmentation
Mask2Former, proposed by Meta AI in 2022, is a unified framework for image segmentation covering instance, semantic, and panoptic tasks. It uses a Transformer decoder with a fixed set of learnable mask queries that directly predict segmentation masks, removing the need for anchors or region proposals. A multi-scale pixel decoder combines high-resolution detail with deep semantic features, and masked cross-attention restricts each query to the region of its current mask prediction, sharpening query-feature interaction and speeding convergence. Mask2Former achieves state-of-the-art results on COCO and ADE20K and handles complex scenes and small objects well. Its end-to-end architecture supports flexible deployment in autonomous driving, medical imaging, and remote sensing, advancing unified, high-performance segmentation.
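For reference, the sketch below runs the original PyTorch Mask2Former through the Hugging Face transformers implementation. The checkpoint name and input image are illustrative assumptions and are not the deployable Model Farm artifact, whose input size and output layout may differ.

```python
# Minimal reference-inference sketch using Hugging Face transformers.
# Checkpoint and image path are assumptions for illustration only.
import torch
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

ckpt = "facebook/mask2former-swin-tiny-ade-semantic"  # assumed checkpoint
processor = AutoImageProcessor.from_pretrained(ckpt)
model = Mask2FormerForUniversalSegmentation.from_pretrained(ckpt)

image = Image.open("example.jpg").convert("RGB")  # hypothetical input image
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Fuse per-query class and mask predictions into a dense (H, W) class-index map.
semantic_map = processor.post_process_semantic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
```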
Source model
- Input shape: 1x3x384x384
- Number of parameters: 42.01M
- Model size: 201 MB
- Output shape: [1x100x134], [1x100x96x96] (see the post-processing sketch at the end of this section)
The source model can be found here
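The two output tensors listed above match the standard Mask2Former head layout: per-query class logits (with the last channel as the "no object" class) and per-query mask logits at 1/4 of the input resolution. A minimal sketch of fusing them into a dense semantic map, assuming that layout, is shown below.

```python
# Sketch: combine raw class logits (1x100x134) and mask logits (1x100x96x96)
# into a dense semantic map, assuming the standard Mask2Former output layout.
import torch
import torch.nn.functional as F

def raw_outputs_to_semantic_map(class_logits, mask_logits, out_size=(384, 384)):
    # Drop the trailing "no object" class, keep per-query class probabilities.
    class_probs = class_logits.softmax(dim=-1)[..., :-1]   # (1, 100, 133)
    mask_probs = mask_logits.sigmoid()                      # (1, 100, 96, 96)
    # Weight each query's mask by its class probabilities and sum over queries.
    seg = torch.einsum("bqc,bqhw->bchw", class_probs, mask_probs)
    # Upsample to the input resolution and take the most likely class per pixel.
    seg = F.interpolate(seg, size=out_size, mode="bilinear", align_corners=False)
    return seg.argmax(dim=1)  # (1, H, W) map of class indices

# Example with random tensors standing in for real model outputs.
semantic_map = raw_outputs_to_semantic_map(torch.randn(1, 100, 134),
                                           torch.randn(1, 100, 96, 96))
print(semantic_map.shape)  # torch.Size([1, 384, 384])
```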
Performance Reference
Please search for the model by name in Model Farm
Inference & Model Conversion
Please search for the model by name in Model Farm
License
Source Model: APACHE-2.0
Deployable Model: APLUX-MODEL-FARM-LICENSE