Update README.md
README.md
Most details about this model can be found in our paper, NNetNav.

- [Model Card for Llama8b-NNetNav-WA](#model-card-for--model_id-)
- [Table of Contents](#table-of-contents)
- [Model Details](#model-details)
- [Results on Web-Agent Benchmarks](#results-on-benchmarks)
- [Bias, Risks, and Limitations](#bias-risks-and-limitations)
- [Training Details](#training-details)
- [Training Data](#training-data)
- [Training Procedure](#training-procedure)
- [Environmental Impact](#environmental-impact)
- [Technical Specifications](#technical-specifications)
- [Hardware](#hardware)
- [Software](#software)
- [Model Card Authors [optional]](#model-card-authors-optional)
- [Model Card Contact](#model-card-contact)
- [How to Get Started with the Model](#how-to-get-started-with-the-model)

## Model Details

This model is intended to be used as a **web-agent**: given an instruction such as _Upvote the post by user smurty123 on subreddit r/LocalLLaMA_ and a web URL such as _reddit.com_, the model performs the task by executing a sequence of actions.

### Action Space

<!-- Provide a longer summary of what this model is/does. -->
The action space of the model is as follows:

```plaintext
```

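Web agents in this family typically emit their actions as plain-text strings, e.g. `click [1234]` or `type [164] [restaurants near CMU] [1]`. A minimal, unofficial sketch of turning such output into a structured action follows; the syntax assumed here is WebArena-style, and the exact grammar is defined by the action-space listing above:

```python
import re
from dataclasses import dataclass, field


@dataclass
class Action:
    name: str                                   # e.g. "click", "type", "stop"
    args: list = field(default_factory=list)    # bracketed arguments, in order


# Matches an action name followed by zero or more [bracketed] arguments,
# e.g. "type [164] [restaurants near CMU] [1]".
_ACTION_RE = re.compile(r"^(?P<name>[a-z_]+)(?P<args>(\s*\[[^\]]*\])*)\s*$")


def parse_action(text: str) -> Action:
    """Parse a single WebArena-style action string (assumed syntax)."""
    m = _ACTION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognized action: {text!r}")
    args = re.findall(r"\[([^\]]*)\]", m.group("args"))
    return Action(m.group("name"), args)
```

A real agent loop would generate one such string per step from the model and execute it in the browser environment.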
### Training Data

This model was trained with SFT on the [NNetnav-WA](https://huggingface.co/datasets/stanfordnlp/nnetnav-wa) dataset, which consists entirely of synthetic demonstrations from self-hosted websites.
### Training Procedure
This model was trained for 2 epochs (roughly 4k gradient steps) with a batch size of 128 and a maximum sequence length of 20,000 tokens.
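As a back-of-the-envelope check on these numbers (assuming each gradient step consumes exactly one batch of 128 demonstrations, with no other accumulation):

```python
epochs = 2
grad_steps = 4_000   # "roughly 4k gradient steps"
batch_size = 128

total_examples = grad_steps * batch_size        # demonstrations processed overall
examples_per_epoch = total_examples // epochs   # implied size of the training set
print(examples_per_epoch)                       # 256000
```

Under these assumptions the training split is on the order of a quarter-million demonstrations.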

## Environmental Impact

- **Hardware Type:** 4 H100 GPUs (80GB)
- **Hours used:** roughly 2 days
- **Cloud Provider:** Stanford compute
- **Compute Region:** Stanford energy grid

## Technical Specifications

### Compute Infrastructure
This model was trained on a Slurm cluster.
### Hardware

This model was trained on 4 H100s.

### Software

This model was fine-tuned with [Open-Instruct](https://github.com/allenai/open-instruct/tree/main).
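For readers who want to reproduce a comparable SFT run, a hypothetical launch command is sketched below. The script path and flag names follow open-instruct's `finetune.py` but may differ between versions, and the base model, dataset file, and learning rate are placeholders, not this card's documented setup:

```shell
# Illustrative sketch only: verify script path and flags against your
# open-instruct checkout. With 4 GPUs, per-device batch 1 and gradient
# accumulation 32 give the stated effective batch size of 128.
accelerate launch open_instruct/finetune.py \
  --model_name_or_path meta-llama/Llama-3.1-8B \
  --train_file nnetnav_wa_sft.jsonl \
  --max_seq_length 20000 \
  --per_device_train_batch_size 1 \
  --gradient_accumulation_steps 32 \
  --num_train_epochs 2 \
  --output_dir output/llama8b-nnetnav-wa
```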
## Model Card Authors [optional]