---
license: apache-2.0
tags:
- vision
- tracking
---

# TAPNet

This repository contains checkpoints for several point tracking models developed by DeepMind.

**Code**: [https://github.com/google-deepmind/tapnet](https://github.com/google-deepmind/tapnet)

## Included Models

[**TAPIR**](https://deepmind-tapir.github.io/): A fast and accurate point tracker for continuous point trajectories in space-time, presented in the paper [**TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement**](https://huggingface.co/papers/2306.08637).

[**BootsTAPIR**](https://bootstap.github.io/): A bootstrapped variant of TAPIR that improves robustness and stability across long videos via self-supervised refinement, presented in the paper [**BootsTAP: Bootstrapped Training for Tracking-Any-Point**](https://huggingface.co/papers/2402.00847).

[**TAPNext**](https://tap-next.github.io/): A new generative approach that frames point tracking as next-token prediction, enabling semi-dense, accurate, and temporally coherent tracking across challenging videos, presented in the paper [**TAPNext: Tracking Any Point (TAP) as Next Token Prediction**](https://huggingface.co/papers/2504.05579).

These models provide state-of-the-art performance for tracking arbitrary points in videos and support research and applications in robotics, perception, and video generation.