metadata

license: apache-2.0
tags:
  - vision
  - tracking

TAPNet

This repository contains checkpoints for several point-tracking models developed by DeepMind.

🔗 Code: https://github.com/google-deepmind/tapnet

Included Models

TAPIR – A fast and accurate point tracker for continuous point trajectories in space-time, presented in the paper TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement.

BootsTAPIR – A bootstrapped variant of TAPIR that improves robustness and stability across long videos via self-supervised refinement, presented in the paper BootsTAP: Bootstrapped Training for Tracking-Any-Point.

TAPNext – A new generative approach that frames point tracking as next-token prediction, enabling semi-dense, accurate, and temporally coherent tracking across challenging videos, presented in the paper TAPNext: Tracking Any Point (TAP) as Next Token Prediction.

Each of these models reported state-of-the-art performance for tracking arbitrary points in videos at the time of its publication. They support research and applications in robotics, perception, and video generation.
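To make "tracking arbitrary points" concrete, the sketch below shows what the inputs and outputs of a point tracker typically look like. It does not load or run any of the checkpoints in this repository. The function `stationary_baseline` is a hypothetical placeholder that keeps every point where it was queried. The array layouts (queries as `(t, y, x)` rows, tracks as `(x, y)` per frame, plus a boolean occlusion mask) are assumptions made for illustration, so check the code repository for the exact conventions of each model.

```python
import numpy as np


def stationary_baseline(video, query_points):
    """Hypothetical placeholder tracker: each point stays where it was queried.

    video: [num_frames, height, width, 3] array.
    query_points: [num_points, 3] array of (t, y, x) rows (assumed layout).
    Returns:
      tracks: [num_points, num_frames, 2] array of (x, y) positions.
      occluded: [num_points, num_frames] boolean mask (all False here).
    """
    num_frames = video.shape[0]
    # Reorder the (y, x) part of each query into (x, y).
    xy = query_points[:, [2, 1]].astype(np.float32)
    # Copy the query position into every frame.
    tracks = np.repeat(xy[:, None, :], num_frames, axis=1)
    occluded = np.zeros((query_points.shape[0], num_frames), dtype=bool)
    return tracks, occluded


# A dummy 8-frame, 64x64 RGB video and two query points.
video = np.zeros((8, 64, 64, 3), dtype=np.uint8)
query_points = np.array([[0, 10.0, 20.0], [3, 32.0, 5.0]])

tracks, occluded = stationary_baseline(video, query_points)
print(tracks.shape, occluded.shape)
```

A real tracker would replace the placeholder with a model that predicts a different position and occlusion flag for every frame, but the shapes of the inputs and outputs would stay the same under the assumptions above.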