license: apache-2.0
tags:
  - vision
  - tracking

TAPNet

This repository contains checkpoints for several point tracking models developed by DeepMind.

🔗 Code: https://github.com/google-deepmind/tapnet

Included Models

TAPIR – A fast and accurate point tracker for continuous point trajectories in space-time, presented in the paper TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement.

BootsTAPIR – A bootstrapped variant of TAPIR that improves robustness and stability across long videos via self-supervised refinement, presented in the paper BootsTAP: Bootstrapped Training for Tracking-Any-Point.

TAPNext – A new generative approach that frames point tracking as next-token prediction, enabling semi-dense, accurate, and temporally coherent tracking across challenging videos, presented in the paper TAPNext: Tracking Any Point (TAP) as Next Token Prediction.

These models provide state-of-the-art performance for tracking arbitrary points in videos and support research and applications in robotics, perception, and video generation.
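
To make the task concrete, the sketch below shows one way to represent the inputs and outputs of point tracking: each query names a frame and a pixel location, and each track holds a position and an occlusion flag for every frame. The names `QueryPoint`, `TrackStep`, and `track_stationary` are hypothetical and chosen for illustration; they are not the tapnet API, and the stand-in tracker simply keeps every point where it was queried. Refer to the code repository for the actual model interfaces and checkpoint loading.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class QueryPoint:
    """A point to track: the frame it is defined on and its pixel location."""
    frame: int
    x: float
    y: float


@dataclass(frozen=True)
class TrackStep:
    """Predicted state of one query point in one frame."""
    x: float
    y: float
    occluded: bool


def track_stationary(queries: List[QueryPoint], num_frames: int) -> List[List[TrackStep]]:
    """Hypothetical stand-in for a tracker: every point stays where it was queried.

    A real model would replace this function. Only the output layout
    (one list of per-frame steps for each query) is the point of the sketch.
    """
    return [
        [TrackStep(q.x, q.y, occluded=False) for _ in range(num_frames)]
        for q in queries
    ]


queries = [QueryPoint(frame=0, x=12.0, y=34.0), QueryPoint(frame=3, x=50.0, y=8.0)]
tracks = track_stationary(queries, num_frames=5)
print(len(tracks), len(tracks[0]))  # one track per query, one step per frame
```

A learned tracker would fill in the same structure with positions that follow the underlying surface and with occlusion flags that change whenever the point is hidden.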