# z-retina
Multi-label fundus (retinal) image classification: MetaFormer CAFormer-B36 backbone + ML-Decoder head, trained with asymmetric loss and patient-level splits (see the code repository for full training and Table S1 evaluation).
Source code: [github.com/ATMZR/z-retina](https://github.com/ATMZR/z-retina)
## Files in this repo
| File | Description |
|---|---|
| `best.pth` | PyTorch checkpoint (best validation metric during training). |
| `config.json` | `num_labels`, `class_names`, `id2label`, `label2id`; logit index `i` matches `class_names[i]`. |
## Label semantics
There are 15 pathology logits. Normal fundus is represented in the training codebase as an all-zero target vector (no positive class). The class names in `config.json` must stay in this order for metrics and thresholding to match the reference implementation.
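As an illustration, a minimal decoding sketch; it assumes `config.json` sits in the working directory, and the 0.5 threshold is a placeholder rather than the thresholds used in the reference evaluation:

```python
import json

import torch

# Label metadata shipped with the checkpoint; logit index i corresponds
# to class_names[i], so the order must not be changed.
with open("config.json") as f:
    class_names = json.load(f)["class_names"]

def decode(logits: torch.Tensor, threshold: float = 0.5) -> list[str]:
    """Map a (15,) logit vector to predicted pathology names.

    An empty list corresponds to 'normal fundus' (the all-zero target).
    The 0.5 threshold is illustrative only.
    """
    probs = torch.sigmoid(logits)
    return [class_names[i] for i, p in enumerate(probs) if p.item() >= threshold]
```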
## How to load weights (with the z-retina codebase)
Clone and install from GitHub, then load the checkpoint using the project's utilities (see `src/z_retina/model.py`: `load_caformer_checkpoint` expects either a raw `state_dict` or a dict containing a `"model"` key).
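For orientation, a sketch of the loading pattern this implies; the authoritative helper is `load_caformer_checkpoint` in `src/z_retina/model.py`, and the unwrapping logic below is an assumption based on the description above:

```python
import torch

def load_state_dict(path: str) -> dict:
    """Load a z-retina checkpoint into a plain state_dict.

    Mirrors the described behaviour: the file is either a raw state_dict
    or a dict that wraps it under a "model" key.
    """
    ckpt = torch.load(path, map_location="cpu")
    return ckpt["model"] if "model" in ckpt else ckpt

state_dict = load_state_dict("best.pth")
# Pass state_dict to the model built by the z-retina codebase, e.g.
# model.load_state_dict(state_dict)
```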
Example evaluation entrypoint from the README:
```bash
poetry run z-retina-eval --config configs/default.yaml --checkpoint path/to/best.pth --table_s1
```
Point `--checkpoint` at a local copy of `best.pth` downloaded from this Hub repo.
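If you prefer fetching the file programmatically, `huggingface_hub` can download it; the repo id below is a placeholder, substitute this model's actual Hub id:

```python
from huggingface_hub import hf_hub_download

# "<user>/z-retina" is a placeholder repo id; use the real one.
checkpoint_path = hf_hub_download(repo_id="<user>/z-retina", filename="best.pth")
```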
## Model I/O
- Input: RGB, 224×224, ImageNet normalization (see `configs/default.yaml` and `EyeDataset` in the code repo).
- Output: 15 logits (multi-label); apply sigmoid per logit for probabilities.
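Putting the I/O contract together, a minimal inference sketch; the transform values are the standard ImageNet statistics and are assumed to match `configs/default.yaml`, `fundus.jpg` is a placeholder path, and `model` stands for the CAFormer-B36 + ML-Decoder module built via the z-retina codebase (construction omitted):

```python
import torch
from PIL import Image
from torchvision import transforms

# Standard ImageNet preprocessing at 224x224; verify against
# configs/default.yaml and EyeDataset in the code repo.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open("fundus.jpg").convert("RGB")   # placeholder image path
batch = preprocess(image).unsqueeze(0)            # (1, 3, 224, 224)

model.eval()  # `model`: built via the z-retina codebase (see above)
with torch.no_grad():
    logits = model(batch)                         # (1, 15) multi-label logits
    probs = torch.sigmoid(logits)                 # per-label probabilities
```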
## Citation
If you use this model, please cite or link the z-retina repository and respect third-party licences (e.g. the vendored ML-Decoder under `third_party/ml_decoder` in that repo).