---
library_name: transformers
tags:
- text-to-sql
- sql
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-Coder-7B-Instruct
datasets:
- AlioLeuchtmann/BIRD_SPIDER_Qwen2.5-Coder-32B
---
BIRD-bench: https://bird-bench.github.io/
## 57% on the BIRD private test set (the Qwen2.5-Coder models might have been trained on BIRD dev, which could explain the performance drop)
## 60% on BIRD dev (one-shot)
## 70% on BIRD dev (4-shot)
Official results on the BIRD test set:
| Metric | Simple | Moderate | Challenging | Total |
| ----------- | -----: | -------: | ----------: | ----: |
| **Count** | 949 | 555 | 285 | 1789 |
| **EX** | 65.33 | 53.33 | 39.65 | 57.52 |
| **Soft F1** | 66.31 | 55.51 | 41.48 | 59.01 |
| **R-VES** | 61.15 | 47.73 | 36.71 | 53.09 |
## Model Details
Supervised fine-tune (SFT) of Qwen2.5-Coder-7B-Instruct. <br>
The dataset is a combination of BIRD and Spider; answers were created through knowledge distillation with CoT prompting from Qwen2.5-Coder-32B and filtered for correctness. <br>
<br>
We increased BIRD dev set performance from 51.1% to 60%. <br>
Our model beats much larger general-purpose models such as GPT-4o and Gemini 1.5 Pro, demonstrating the effectiveness of black-box knowledge distillation for constrained domains. <br>
After training, this checkpoint even exceeds the performance of its teacher, Qwen2.5-Coder-32B, which is possible because the teacher's outputs were filtered for quality. <br>
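The correctness filter can be sketched roughly as follows: execute the teacher's generated SQL and the reference SQL against the source database and keep a sample only if the result sets match. This is a minimal illustration of the idea, not the exact pipeline; the function names and the order-insensitive set comparison are assumptions.

```python
import sqlite3

def execution_match(db_path, predicted_sql, gold_sql):
    """Return True if predicted_sql and gold_sql yield the same rows
    (order-insensitive) on the SQLite database at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        pred = set(conn.execute(predicted_sql).fetchall())
        gold = set(conn.execute(gold_sql).fetchall())
    except sqlite3.Error:
        return False  # unexecutable teacher output is discarded
    finally:
        conn.close()
    return pred == gold

def filter_samples(db_path, samples):
    # Keep only distilled (question, CoT, SQL) samples whose SQL executes
    # to the same result as the reference query.
    return [s for s in samples if execution_match(db_path, s["sql"], s["gold_sql"])]
```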

## Prompt:
A single prompt template was used throughout the dataset, so the model works best when prompted the same way. Rendered, a full prompt looks like this:
```text
CREATE TABLE geographic
(
 city TEXT not null
 primary key,
 county TEXT null,
 region TEXT null
)
 /*
 2 example rows:
 SELECT * FROM geographic LIMIT 2;
 city county region
 alameda alameda county bay area
 alamo contra costa county bay area
 */

CREATE TABLE generalinfo
(
 id_restaurant INTEGER not null
 primary key,
 label TEXT null,
 food_type TEXT null,
 city TEXT null,
 review REAL null,
 foreign key (city) references geographic(city)
 on update cascade on delete cascade
)
 /*
 2 example rows:
 SELECT * FROM generalinfo LIMIT 2;
 id_restaurant label food_type city review
 1 sparky's diner 24 hour diner san francisco 2.3
 2 kabul afghan cuisine afghani san carlos 3.8
 */

CREATE TABLE location
(
 id_restaurant INTEGER not null
 primary key,
 street_num INTEGER null,
 street_name TEXT null,
 city TEXT null,
 foreign key (city) references geographic (city)
 on update cascade on delete cascade,
 foreign key (id_restaurant) references generalinfo (id_restaurant)
 on update cascade on delete cascade
)
 /*
 2 example rows:
 SELECT * FROM location LIMIT 2;
 id_restaurant street_num street_name city
 1 242 church st san francisco
 2 135 el camino real san carlos
 */

-- External Knowledge: Atlantic Ave refers to street_name = 'atlantic ave'; rating refers to review
-- Using valid SQLite and understanding External Knowledge, answer the following question for the tables provided above.
-- What is the rating of each restaurant reviews on Atlantic Ave?
Generate the SQL after thinking step by step:
```
```python
def bird_gpt_template_no_format(question, commonsense, schema):
    return f"""{schema}
-- External Knowledge: {commonsense}
-- Using valid SQLite and understanding External Knowledge, answer the following question for the tables provided above.
-- {question}
Generate the SQL after thinking step by step:
"""
```
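As a quick sanity check, this is what the template produces; the toy one-table schema string here is only an illustration:

```python
def bird_gpt_template_no_format(question, commonsense, schema):
    return f"""{schema}
-- External Knowledge: {commonsense}
-- Using valid SQLite and understanding External Knowledge, answer the following question for the tables provided above.
-- {question}
Generate the SQL after thinking step by step:
"""

prompt = bird_gpt_template_no_format(
    question="What is the rating of each restaurant reviews on Atlantic Ave?",
    commonsense="Atlantic Ave refers to street_name = 'atlantic ave'; rating refers to review",
    schema="CREATE TABLE location (street_name TEXT, review REAL)",  # toy schema, not from BIRD
)
print(prompt)
```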
### Generate Schema:
```python
import sqlite3

def nice_look_table(column_names, values):
    # Minimal stand-in for the row-formatting helper from the BIRD
    # evaluation code: column names and example rows as plain lines.
    rows = [" ".join(str(v) for v in row) for row in values]
    return "\n".join([" ".join(column_names)] + rows)

def generate_schema_prompt(db_path, num_rows=None):
    """Extract the CREATE TABLE DDL for every table in the SQLite
    database at db_path, optionally appending num_rows example rows
    per table as a comment block."""
    full_schema_prompt_list = []
    conn = sqlite3.connect(db_path)
    # Create a cursor object
    cursor = conn.cursor()
    cursor.execute("SELECT name FROM sqlite_master WHERE type='table'")
    tables = cursor.fetchall()
    schemas = {}
    for table in tables:
        if table[0] == 'sqlite_sequence':  # internal bookkeeping table
            continue
        cursor.execute("SELECT sql FROM sqlite_master WHERE type='table' AND name='{}';".format(table[0]))
        create_prompt = cursor.fetchone()[0]
        schemas[table[0]] = create_prompt
        if num_rows:
            cur_table = table[0]
            # Quote table names that are SQL keywords or contain spaces/hyphens.
            if cur_table in ['order', 'by', 'group', 'transaction'] or ' ' in str(cur_table).strip() or '-' in str(cur_table).strip():
                cur_table = '"{}"'.format(cur_table)
            cursor.execute("SELECT * FROM {} LIMIT {}".format(cur_table, num_rows))
            column_names = [description[0] for description in cursor.description]
            values = cursor.fetchall()
            rows_prompt = nice_look_table(column_names=column_names, values=values)
            verbose_prompt = "/* \n {} example rows: \n SELECT * FROM {} LIMIT {}; \n {} \n */".format(
                num_rows, cur_table, num_rows, rows_prompt)
            schemas[table[0]] = "{} \n {}".format(create_prompt, verbose_prompt)
    for k, v in schemas.items():
        full_schema_prompt_list.append(v)
    schema_prompt = "\n\n".join(full_schema_prompt_list)
    return schema_prompt
```
### System Prompt:
Default Qwen2.5 System Prompt
```python
def preprocess_prompt(prompt):
    return f'''<|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
'''
```
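Putting the pieces together, the final string fed to the model wraps the SQL prompt in the default Qwen chat markers. A short sketch (the `user_prompt` value here is an abbreviated stand-in for the full templated prompt):

```python
def preprocess_prompt(prompt):
    return f'''<|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
'''

# In practice this string comes from bird_gpt_template_no_format(...).
user_prompt = "-- What is the rating of each restaurant reviews on Atlantic Ave?\nGenerate the SQL after thinking step by step:"
full_prompt = preprocess_prompt(user_prompt)
```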
### Generation Config:
- Greedy decoding (no sampling) gives the best performance
### Restrictions:
- Only trained on the SQLite dialect
- Only trained on English
- No effort was made to preserve skills other than text-to-SQL