@@ -6,17 +6,19 @@ base_model:
 # GUI-Actor-7B with Qwen2-VL-7B as backbone VLM
-- [GUI-Actor-7B-Qwen2-VL](https://huggingface.co/microsoft/GUI-Actor-7B-Qwen2-VL)
-- [GUI-Actor-2B-Qwen2-VL](https://huggingface.co/microsoft/GUI-Actor-2B-Qwen2-VL)
-- [GUI-Actor-7B-Qwen2.5-VL (coming soon)](https://huggingface.co/microsoft/GUI-Actor-7B-Qwen2.5-VL)
-- [GUI-Actor-3B-Qwen2.5-VL (coming soon)](https://huggingface.co/microsoft/GUI-Actor-3B-Qwen2.5-VL)
-- [GUI-Actor-Verifier-2B](https://huggingface.co/microsoft/GUI-Actor-Verifier-2B)
 This model was introduced in the paper [**GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents**](https://aka.ms/GUI-Actor).
 It is developed based on [Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct), augmented with an attention-based action head and fine-tuned to perform GUI grounding using the dataset [here (coming soon)]().
 For more details on model design and evaluation, please check: [🏠 Project Page](https://aka.ms/GUI-Actor) | [💻 GitHub Repo](https://github.com/microsoft/GUI-Actor) | [📑 Paper]().
+| Model Name                                  | Hugging Face Link                         |
+|--------------------------------------------|--------------------------------------------|
+| **GUI-Actor-7B-Qwen2-VL**                   | [🤗 Hugging Face](https://huggingface.co/microsoft/GUI-Actor-7B-Qwen2-VL)         |
+| **GUI-Actor-2B-Qwen2-VL**                   | [🤗 Hugging Face](https://huggingface.co/microsoft/GUI-Actor-2B-Qwen2-VL)         |
+| **GUI-Actor-7B-Qwen2.5-VL (coming soon)**   | [🤗 Hugging Face](https://huggingface.co/microsoft/GUI-Actor-7B-Qwen2.5-VL)       |
+| **GUI-Actor-3B-Qwen2.5-VL (coming soon)**   | [🤗 Hugging Face](https://huggingface.co/microsoft/GUI-Actor-3B-Qwen2.5-VL)       |
+| **GUI-Actor-Verifier-2B**                   | [🤗 Hugging Face](https://huggingface.co/microsoft/GUI-Actor-Verifier-2B)        |
 ## 📊 Performance Comparison on GUI Grounding Benchmarks
 Table 1. Main results on ScreenSpot-Pro, ScreenSpot, and ScreenSpot-v2 with **Qwen2-VL** as the backbone. † indicates scores obtained from our own evaluation of the official models on Hugging Face.
 | Method           | Backbone VLM | ScreenSpot-Pro | ScreenSpot | ScreenSpot-v2 |