qianhuiwu committed on
Commit 5cb0a88 · verified · 1 Parent(s): 609d191

update model card.

Files changed (1)
  1. README.md +8 -6
README.md CHANGED
@@ -6,17 +6,19 @@ base_model:

  # GUI-Actor-7B with Qwen2-VL-7B as backbone VLM

- - [GUI-Actor-7B-Qwen2-VL](https://huggingface.co/microsoft/GUI-Actor-7B-Qwen2-VL)
- - [GUI-Actor-2B-Qwen2-VL](https://huggingface.co/microsoft/GUI-Actor-2B-Qwen2-VL)
- - [GUI-Actor-7B-Qwen2.5-VL (coming soon)](https://huggingface.co/microsoft/GUI-Actor-7B-Qwen2.5-VL)
- - [GUI-Actor-3B-Qwen2.5-VL (coming soon)](https://huggingface.co/microsoft/GUI-Actor-3B-Qwen2.5-VL)
- - [GUI-Actor-Verifier-2B](https://huggingface.co/microsoft/GUI-Actor-Verifier-2B)
-
  This model was introduced in the paper [**GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents**](https://aka.ms/GUI-Actor).
  It is developed based on [Qwen2-VL-7B-Instruct ](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct), augmented by an attention-based action head and finetuned to perform GUI grounding using the dataset [here (coming soon)]().

  For more details on model design and evaluation, please check: [🏠 Project Page](https://aka.ms/GUI-Actor) | [💻 Github Repo](https://github.com/microsoft/GUI-Actor) | [📑 Paper]().

+ | Model Name | Hugging Face Link |
+ |--------------------------------------------|--------------------------------------------|
+ | **GUI-Actor-7B-Qwen2-VL** | [🤗 Hugging Face](https://huggingface.co/microsoft/GUI-Actor-7B-Qwen2-VL) |
+ | **GUI-Actor-2B-Qwen2-VL** | [🤗 Hugging Face](https://huggingface.co/microsoft/GUI-Actor-2B-Qwen2-VL) |
+ | **GUI-Actor-7B-Qwen2.5-VL (coming soon)** | [🤗 Hugging Face](https://huggingface.co/microsoft/GUI-Actor-7B-Qwen2.5-VL) |
+ | **GUI-Actor-3B-Qwen2.5-VL (coming soon)** | [🤗 Hugging Face](https://huggingface.co/microsoft/GUI-Actor-3B-Qwen2.5-VL) |
+ | **GUI-Actor-Verifier-2B** | [🤗 Hugging Face](https://huggingface.co/microsoft/GUI-Actor-Verifier-2B) |
+
  ## 📊 Performance Comparison on GUI Grounding Benchmarks
  Table 1. Main results on ScreenSpot-Pro, ScreenSpot, and ScreenSpot-v2 with **Qwen2-VL** as the backbone. † indicates scores obtained from our own evaluation of the official models on Huggingface.
  | Method | Backbone VLM | ScreenSpot-Pro | ScreenSpot | ScreenSpot-v2 |