---
library_name: transformers
tags:
- text-to-sql
- sql
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-Coder-7B-Instruct
datasets:
- AlioLeuchtmann/BIRD_SPIDER_Qwen2.5-Coder-32B
---

BIRD-bench: https://bird-bench.github.io/

## 57% on BIRD private test set (Qwen 2.5 Coder models might have been trained on BIRD dev, which could explain the performance drop)
## 60% on BIRD dev, one-shot
## 70% on BIRD dev, 4-shot

Official results on the BIRD test set:

| Metric      | Simple | Moderate | Challenging | Total |
| ----------- | -----: | -------: | ----------: | ----: |
| **Count**   |    949 |      555 |         285 |  1789 |
| **EX**      |  65.33 |    53.33 |       39.65 | 57.52 |
| **Soft F1** |  66.31 |    55.51 |       41.48 | 59.01 |
| **R-VES**   |  61.15 |    47.73 |       36.71 | 53.09 |



## Model Details

SFT of Qwen2.5-Coder-7B-Instruct. <br>
The dataset is a combination of BIRD and Spider; answers were created through knowledge distillation with CoT prompting from Qwen2.5-Coder-32B and filtered for correctness. <br>
<br>
We increased the BIRD dev set performance from 51.1% to 60%. <br>
Our model beats much larger general-purpose models like GPT-4o and Gemini 1.5 Pro, demonstrating the effectiveness of black-box knowledge distillation for constrained domains. <br>
After training, this checkpoint even exceeds the performance of its teacher model, Qwen2.5-Coder-32B; this is possible because the teacher's outputs were filtered for quality before training. <br>
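
The correctness filter behind the distillation data is conceptually simple: a teacher (Qwen2.5-Coder-32B) answer is kept only if it executes and returns the same result as the gold query. The sketch below illustrates that idea; the function name, the sample structure, and the exact comparison (set equality of result rows) are assumptions, not the exact training pipeline.

```python
import sqlite3

def execution_match(db_path, predicted_sql, gold_sql):
    """Return True if the predicted SQL yields the same result set as the gold SQL.

    Assumed filtering criterion; the real pipeline may compare results differently.
    """
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    try:
        cursor.execute(predicted_sql)
        predicted_rows = set(cursor.fetchall())
        cursor.execute(gold_sql)
        gold_rows = set(cursor.fetchall())
        return predicted_rows == gold_rows
    except sqlite3.Error:
        # Teacher outputs that fail to execute are discarded.
        return False
    finally:
        conn.close()

# Hypothetical usage: `samples` holds dicts with db_path, predicted_sql and gold_sql.
# kept = [s for s in samples if execution_match(s["db_path"], s["predicted_sql"], s["gold_sql"])]
```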



![image/png](https://cdn-uploads.huggingface.co/production/uploads/65e01c279c88ffe889d64a73/pPwqvkFLB4VRKm6XBcZcn.png)


## Prompt:

A single prompt format was used throughout the dataset, so the model is best prompted as follows (the `\n` sequences denote literal newlines in the actual prompt):

```sql
CREATE TABLE geographic\n(\n    city   TEXT not null\n        primary key,\n    county TEXT null,\n    region TEXT null\n) \n /* \n 2 example rows: \n SELECT * FROM geographic LIMIT 2; \n    city              county   region \nalameda      alameda county bay area \n  alamo contra costa county bay area  \n */\n\n
CREATE TABLE generalinfo\n(\n    id_restaurant INTEGER         not null\n        primary key,\n    label         TEXT  null,\n    food_type     TEXT  null,\n    city          TEXT  null,\n    review        REAL null,\n    foreign key (city) references geographic(city)\n            on update cascade on delete cascade\n) \n /* \n 2 example rows: \n SELECT * FROM generalinfo LIMIT 2; \n id_restaurant                label     food_type          city review \n            1       sparky's diner 24 hour diner san francisco    2.3 \n            2 kabul afghan cuisine       afghani    san carlos    3.8  \n */\n\nCREATE TABLE location\n(\n    id_restaurant INTEGER        not null\n        primary key,\n    street_num    INTEGER          null,\n    street_name   TEXT null,\n    city          TEXT null,\n    foreign key (city) references geographic (city)\n            on update cascade on delete cascade,\n    foreign key (id_restaurant) references generalinfo (id_restaurant)\n            on update cascade on delete cascade\n) \n /* \n 2 example rows: \n SELECT * FROM location LIMIT 2; \n id_restaurant street_num    street_name          city \n            1        242      church st san francisco \n            2        135 el camino real    san carlos  \n */\n\n
-- External Knowledge: Atlantic Ave refers to street_name = 'atlantic ave'; rating refers to review\n
-- Using valid SQLite and understanding External Knowledge, answer the following question for the tables provided above.\n
-- What is the rating of each restaurant reviews on Atlantic Ave?\n
Generate the SQL after thinking step by step:\n
```

```python
def bird_gpt_template_no_format(question, commonsense, schema):
    return f"""{schema}

-- External Knowledge: {commonsense}
-- Using valid SQLite and understanding External Knowledge, answer the following question for the tables provided above.
-- {question}
Generate the SQL after thinking step by step:
"""

```

### Generate Schema:

```python
import sqlite3

def generate_schema_prompt(db_path, num_rows=None):
    """Extract the CREATE TABLE DDLs (and optionally example rows) from a SQLite database.

    :param db_path: path to the SQLite database file
    :param num_rows: if set, append `num_rows` example rows per table to the prompt
    :return: schema prompt string
    """
    full_schema_prompt_list = []
    conn = sqlite3.connect(db_path)
    # Create a cursor object
    cursor = conn.cursor()
    cursor.execute("SELECT name FROM sqlite_master WHERE type='table'")
    tables = cursor.fetchall()
    schemas = {}
    for table in tables:
        if table[0] == 'sqlite_sequence':
            continue
        cursor.execute("SELECT sql FROM sqlite_master WHERE type='table' AND name='{}';".format(table[0]))
        create_prompt = cursor.fetchone()[0]
        schemas[table[0]] = create_prompt
        if num_rows:
            cur_table = table[0]
            # Quote table names that are SQL keywords or contain spaces/hyphens
            if cur_table in ['order', 'by', 'group', 'transaction'] or ' ' in str(cur_table).strip() or '-' in str(cur_table).strip():
                cur_table = '"{}"'.format(cur_table)

            cursor.execute("SELECT * FROM {} LIMIT {}".format(cur_table, num_rows))
            column_names = [description[0] for description in cursor.description]
            values = cursor.fetchall()
            # nice_look_table renders the example rows as an aligned text table (helper shown below)
            rows_prompt = nice_look_table(column_names=column_names, values=values)
            verbose_prompt = "/* \n {} example rows: \n SELECT * FROM {} LIMIT {}; \n {} \n */".format(num_rows,
                                                                                                       cur_table,
                                                                                                       num_rows,
                                                                                                       rows_prompt)
            schemas[table[0]] = "{} \n {}".format(create_prompt, verbose_prompt)

    for k, v in schemas.items():
        full_schema_prompt_list.append(v)

    schema_prompt = "\n\n".join(full_schema_prompt_list)

    return schema_prompt
```
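
`nice_look_table` is not defined in this card; it is a small helper from the BIRD baseline code that renders the example rows as an aligned text table. A minimal sketch with equivalent behaviour is given below (the padding details are an assumption and may differ from the original helper):

```python
def nice_look_table(column_names, values):
    """Render column names and example rows as a right-aligned, space-separated text table."""
    widths = []
    for i, name in enumerate(column_names):
        # Column width = longest cell (header or any value) in that column
        cells = [str(name)] + [str(row[i]) for row in values]
        widths.append(max(len(cell) for cell in cells))
    lines = [' '.join(str(name).rjust(w) for name, w in zip(column_names, widths))]
    for row in values:
        lines.append(' '.join(str(v).rjust(w) for v, w in zip(row, widths)))
    return '\n'.join(lines)
```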


### System Prompt:

The default Qwen2.5 system prompt is used:

```python
def preprocess_prompt(prompt):
    return f'''<|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
'''
```

### Generation Config:
- No sampling (greedy decoding) for best performance; see the inference sketch below
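
Putting the pieces together, a minimal inference sketch with `transformers` might look like the following. The model id is a placeholder (replace it with this repository's id), the database path and question are illustrative, and `generate_schema_prompt`, `bird_gpt_template_no_format`, and `preprocess_prompt` are the functions defined above; `do_sample=False` implements the no-sampling recommendation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<this-model-repo-id>"  # placeholder: fill in the repo id of this checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Build the prompt exactly as during training: schema + external knowledge + question
schema = generate_schema_prompt("restaurant.sqlite", num_rows=2)  # illustrative database path
knowledge = "Atlantic Ave refers to street_name = 'atlantic ave'; rating refers to review"
question = "What is the rating of each restaurant reviews on Atlantic Ave?"

user_prompt = bird_gpt_template_no_format(question, knowledge, schema)
full_prompt = preprocess_prompt(user_prompt)  # wrap in the default Qwen2.5 chat format

inputs = tokenizer(full_prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512, do_sample=False)  # greedy decoding, no sampling
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```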



### Restrictions:
- Only trained on the SQLite dialect
- Only trained on English data
- No effort was made to preserve skills other than text-to-SQL