CovisPose Model
This model estimates relative poses between panoramic images using the CovisPose framework.
Model Details
- Architecture: CovisPose with resnet50 backbone
- Transformer Layers: 6
- FFN Dimension: 2048
- Input Size: [512, 1024]
- Parameters: 121,890,467 (estimated)
Training Information
Configuration
- Epochs: N/A
- Batch Size: N/A
- Learning Rate: N/A
- Backbone: N/A
Performance Metrics
- Final Training Loss: N/A
- Training Rotation Error: N/A
- Final Validation Loss: N/A
- Validation Rotation Error: N/A
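The card does not state how rotation error is defined. One common choice for upright panoramas is the absolute difference between predicted and ground-truth yaw, wrapped so that it never exceeds 180 degrees. The sketch below assumes that definition; the function names `yaw_error_deg` and `mean_yaw_error_deg` are illustrative and are not part of the CovisPose code.

```python
def yaw_error_deg(pred_deg: float, gt_deg: float) -> float:
    """Absolute angular difference in degrees, wrapped to [0, 180]."""
    diff = (pred_deg - gt_deg) % 360.0  # always in [0, 360)
    return min(diff, 360.0 - diff)


def mean_yaw_error_deg(preds, gts) -> float:
    """Mean wrapped yaw error over paired predictions and ground truths."""
    errors = [yaw_error_deg(p, g) for p, g in zip(preds, gts)]
    return sum(errors) / len(errors)
```

The wrap matters because a prediction of 350 degrees against a ground truth of 10 degrees is off by 20 degrees, not 340.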
Usage
import torch
import json
from huggingface_hub import hf_hub_download

# Download model files
model_path = hf_hub_download(
    repo_id="SGEthan/covis_toy",
    filename="pytorch_model.bin"
)
config_path = hf_hub_download(
    repo_id="SGEthan/covis_toy",
    filename="config.json"
)

# Load configuration
with open(config_path, 'r') as f:
    config = json.load(f)

# Initialize model (you'll need the COVIS class)
from models.covispose_model import COVIS
model = COVIS(
    backbone=config['backbone'],
    num_transformer_layers=config['num_transformer_layers'],
    transformer_ffn_dim=config['transformer_ffn_dim']
)

# Load weights
checkpoint = torch.load(model_path, map_location='cpu')
if "model_state_dict" in checkpoint:
    model.load_state_dict(checkpoint["model_state_dict"])
else:
    model.load_state_dict(checkpoint)
model.eval()

# Use for inference
with torch.no_grad():
    # Your inference code here
    # outputs1, outputs2 = model(pano1_tensor, pano2_tensor)
    pass
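The usage snippet reads three keys from `config.json`. If the file is missing one of them, the failure surfaces as a bare `KeyError` in the middle of model construction. A minimal sketch of a guard, assuming the config contains exactly the keys used above (the helper name `load_covis_config` is hypothetical):

```python
import json

# Keys the usage snippet reads from config.json.
REQUIRED_KEYS = ("backbone", "num_transformer_layers", "transformer_ffn_dim")


def load_covis_config(path):
    """Read config.json and return only the keys needed to build the model."""
    with open(path, "r") as f:
        config = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config is missing required keys: {missing}")
    return {key: config[key] for key in REQUIRED_KEYS}
```

Calling `load_covis_config(config_path)` in place of the plain `json.load` step would report every missing key at once, before the model is built.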
Model Architecture
The CovisPose model consists of:
- Backbone Network: resnet50 for feature extraction
- Transformer Encoder: 6 layers for processing image features
- Prediction Heads:
  - Covisibility mask prediction
  - Relative pose estimation
  - Boundary detection
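To get a feel for the size of the sequence the transformer processes, the following sketch computes the spatial size of a backbone feature map. It assumes a standard ResNet-50 with a total stride of 32 and assumes the final feature map is flattened into one token per location. The actual CovisPose implementation may pool or reshape the features differently, so treat the numbers as an estimate only.

```python
def backbone_feature_shape(height, width, stride=32):
    """Spatial size of a feature map after downsampling by `stride`.

    Ceiling division matches the padded stride-2 stages of a standard ResNet.
    """
    return (-(-height // stride), -(-width // stride))


def token_count(height, width, stride=32):
    """Number of tokens if every feature-map location becomes one token."""
    feat_h, feat_w = backbone_feature_shape(height, width, stride)
    return feat_h * feat_w
```

Under these assumptions, the 512 x 1024 input listed in the model details gives a 16 x 32 feature map, or 512 tokens per panorama.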
Task Description
CovisPose estimates the relative pose between two panoramic images by:
- Covisibility Estimation: Predicting which parts of the images overlap
- Pose Regression: Estimating relative rotation and translation
- Boundary Detection: Finding floor-wall boundaries for scale estimation
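The link between floor-wall boundaries and scale can be illustrated with basic panorama geometry. The sketch below assumes an upright equirectangular panorama and a known camera height; neither assumption is confirmed by this card, and the function names are illustrative. A boundary pixel below the horizon has a negative elevation angle, and the horizontal distance to the wall base follows from the camera height divided by the tangent of that angle's magnitude.

```python
import math


def row_to_elevation(row, height=512):
    """Elevation angle (radians) of a pixel row's center; positive above the horizon."""
    return (0.5 - (row + 0.5) / height) * math.pi


def floor_distance(row, camera_height=1.0, height=512):
    """Horizontal distance to a floor-wall boundary seen at the given row."""
    phi = row_to_elevation(row, height)
    if phi >= 0:
        raise ValueError("a floor boundary must lie below the horizon")
    return camera_height / math.tan(-phi)
```

With `camera_height=1.0` the result is expressed in units of camera height, which is why a boundary prediction can fix the scale of an otherwise scale-ambiguous translation.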
Training Data
This model was trained on panoramic image pairs with:
- Relative pose annotations
- Covisibility masks
- Floor-wall boundary labels
Limitations
- Designed specifically for indoor panoramic images
- Requires significant visual overlap between image pairs for reliable pose estimation
- Performance may degrade on outdoor scenes or images with minimal overlap
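One way to act on the overlap requirement is to skip pairs whose predicted covisibility is too low before trusting the pose. The sketch below assumes the covisibility output can be reduced to one score in [0, 1] per image column. That format, the function names, and both thresholds are hypothetical and would need tuning against the real model outputs.

```python
def covisible_fraction(column_scores, score_threshold=0.5):
    """Fraction of columns whose covisibility score reaches the threshold."""
    if not column_scores:
        raise ValueError("column_scores must not be empty")
    covisible = sum(1 for score in column_scores if score >= score_threshold)
    return covisible / len(column_scores)


def has_enough_overlap(column_scores, min_fraction=0.1):
    """True if enough of the panorama is predicted to be covisible."""
    return covisible_fraction(column_scores) >= min_fraction
```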
Citation
If you use this model, please cite the CovisPose work:
@article{covispose2024,
  title={CovisPose: Co-visibility Pose Estimation for Panoramic Images},
  author={Your Authors},
  journal={Conference/Journal},
  year={2024}
}
License
This model is released under the MIT License.
Repository
- Training Code: Available in the original repository
- Model Upload: Generated automatically from local checkpoint
Model uploaded on 2025-06-27T09:05:36.254713 using upload_model.py