---
license: apache-2.0
tags:
- vision
- tracking
---

# TAPNet

This repository contains checkpoints for several point tracking models developed by DeepMind.

🔗 **Code**: [https://github.com/google-deepmind/tapnet](https://github.com/google-deepmind/tapnet)

## Included Models

[**TAPIR**](https://deepmind-tapir.github.io/) – A fast and accurate point tracker for continuous point trajectories in space-time, presented in the paper [**TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement**](https://huggingface.co/papers/2306.08637).

[**BootsTAPIR**](https://bootstap.github.io/) – A bootstrapped variant of TAPIR that improves robustness and stability across long videos via self-supervised refinement, presented in the paper [**BootsTAP: Bootstrapped Training for Tracking-Any-Point**](https://huggingface.co/papers/2402.00847).
  
[**TAPNext**](https://tap-next.github.io/) – A generative approach that frames point tracking as next-token prediction, enabling semi-dense, accurate, and temporally coherent tracking across challenging videos, presented in the paper [**TAPNext: Tracking Any Point (TAP) as Next Token Prediction**](https://huggingface.co/papers/2504.05579).

These models provide state-of-the-art performance for tracking arbitrary points in videos and support research and applications in robotics, perception, and video generation.
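
## Downloading a Checkpoint

As a minimal sketch, the checkpoint files in this repository can be fetched with the `huggingface_hub` client. The `repo_id` and `filename` below are placeholders, not the actual values; substitute the repository id of this model card and a checkpoint filename from its file listing. Loading the weights and running inference is handled by the code in the GitHub repository linked above.

```python
# Sketch: download a checkpoint file from this repository via the Hub client.
# repo_id and filename are hypothetical placeholders; replace them with the
# actual repository id and the checkpoint filename you want.
from huggingface_hub import hf_hub_download

checkpoint_path = hf_hub_download(
    repo_id="<org>/tapnet",       # placeholder repository id
    filename="<checkpoint-file>",  # placeholder checkpoint filename
)
print(f"Checkpoint saved to: {checkpoint_path}")
```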