Improve model card: Add pipeline tag, library name, GitHub link, and usage example
#1 by nielsr (HF Staff) - opened

README.md CHANGED
@@ -1,8 +1,109 @@
---
license: mit
pipeline_tag: text-classification
library_name: transformers
tags:
- text-to-sql
- router
- complexity-aware
- qwen2.5
---

# 🐘EllieSQL: Cost-Efficient Text-to-SQL with Complexity-Aware Routing

Checkpoints for the trained routers in the paper *"EllieSQL: Cost-Efficient Text-to-SQL with Complexity-Aware Routing"*.

[](https://elliesql.github.io/)[](https://arxiv.org/abs/2503.22402)[](https://github.com/derrickzhu/EllieSQL)

## Overview

Text-to-SQL automatically translates natural language queries into SQL, allowing non-technical users to retrieve data from databases without specialized SQL knowledge. Despite the success of advanced LLM-based Text-to-SQL approaches on leaderboards, their unsustainable computational costs, though often overlooked, stand as the "elephant in the room" in current leaderboard-driven research, limiting their economic practicality for real-world deployment and widespread adoption.

To tackle this, we propose EllieSQL, a complexity-aware routing framework that assigns queries to suitable SQL generation pipelines based on their estimated complexity. We investigate multiple routers that direct simple queries to efficient approaches while reserving computationally intensive methods for complex cases. Drawing from economics, we introduce the Token Elasticity of Performance (TEP) metric, which captures cost-efficiency by quantifying the responsiveness of performance gains relative to token investment in SQL generation. Experiments show that, compared to always using the most advanced methods in our study, EllieSQL with the Qwen2.5-0.5B-DPO router reduces token use by over 40% without compromising performance on the BIRD development set, achieving more than a 2x boost in TEP over non-routing approaches. This not only advances the pursuit of cost-efficient Text-to-SQL but also invites the community to weigh resource efficiency alongside performance, contributing to progress toward sustainable Text-to-SQL.
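
The paper gives the precise definition of TEP; purely as an illustration of the elasticity idea behind it (the formula and numbers below are a hedged sketch, not the paper's exact metric), an elasticity-style ratio relates the relative performance gain to the relative increase in token spend:

```python
def elasticity(perf_base: float, perf_new: float, tokens_base: float, tokens_new: float) -> float:
    """Illustrative elasticity-style ratio: relative performance change divided by
    relative token-cost change. A sketch of the concept, not the exact TEP formula
    from the EllieSQL paper."""
    relative_gain = (perf_new - perf_base) / perf_base
    relative_cost = (tokens_new - tokens_base) / tokens_base
    return relative_gain / relative_cost

# Hypothetical numbers: spending 3x the tokens of a baseline for a 10% relative
# accuracy gain yields a low ratio, i.e., poor cost-efficiency.
print(elasticity(perf_base=0.50, perf_new=0.55, tokens_base=1000, tokens_new=3000))  # 0.05
```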

## Usage

This repository contains checkpoints for the EllieSQL routers, which are designed to classify the complexity of natural language queries for Text-to-SQL. These models are PEFT (LoRA) adapters, meaning they are fine-tuned on top of a base model. To use them, you will first need to load the corresponding base model and then apply the adapter.

First, ensure you have the necessary libraries installed:

```bash
pip install transformers peft torch
```

Here's an example of how to load and use one of the router checkpoints (e.g., a Qwen2.5-0.5B based router):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel, PeftConfig

# The model_id refers to the path on the Hugging Face Hub where the PEFT adapter is located.
peft_model_id = "derrickzhu/EllieSQL_Router_Checkpoints"

# Load the PEFT configuration. This helps identify the base model if it's not explicitly named.
config = PeftConfig.from_pretrained(peft_model_id)

# The base model for these routers is typically Qwen2.5-0.5B, as indicated in the paper and adapter configs.
# Make sure to use the correct Hugging Face Hub ID for the base model.
base_model_name_or_path = "Qwen/Qwen2.5-0.5B"

# Load the tokenizer for the base model.
tokenizer = AutoTokenizer.from_pretrained(base_model_name_or_path)

# The router performs sequence classification (e.g., simple, medium, complex queries).
# The `config.json` inside the repository indicates `num_labels=3` for this classification task.
model = AutoModelForSequenceClassification.from_pretrained(
    base_model_name_or_path,
    num_labels=3,
    torch_dtype=torch.bfloat16,  # Adjust dtype based on your hardware and desired precision
    low_cpu_mem_usage=True,      # Optimize memory usage for large models
)

# Load the PEFT adapter weights onto the base model.
model = PeftModel.from_pretrained(model, peft_model_id)
model.eval()  # Set the model to evaluation mode
model.to("cuda" if torch.cuda.is_available() else "cpu")  # Use a GPU if available for faster inference

# Example natural language query to classify
text_query = "Find the names of all students who scored above 90 in Math and live in New York."

# Tokenize the input query.
inputs = tokenizer(text_query, return_tensors="pt").to(model.device)

# Perform inference to get the classification logits.
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits

# Get the predicted class ID (0, 1, or 2).
predicted_class_id = torch.argmax(logits, dim=-1).item()

# Map the predicted ID back to a human-readable label.
# The `id2label` mapping is typically found in the model's `config.json`.
# For this model, labels are "LABEL_0", "LABEL_1", "LABEL_2".
# Refer to the EllieSQL paper or project documentation for the exact mapping of these labels
# to complexity categories (e.g., LABEL_0 -> "simple", LABEL_1 -> "medium", LABEL_2 -> "complex").
id_to_label = model.config.id2label
predicted_label = id_to_label[predicted_class_id]

print(f"Text query: '{text_query}'")
print(f"Predicted raw label: {predicted_label}")
# If the exact mapping to complexity is known, e.g.:
# complexity_map = {"LABEL_0": "simple", "LABEL_1": "medium", "LABEL_2": "complex"}
# print(f"Predicted complexity: {complexity_map.get(predicted_label, 'unknown')}")
```
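
In the EllieSQL framework, the predicted complexity class is used to route the query to an SQL generation pipeline of matching cost. As a purely illustrative sketch that continues the example above (the pipeline functions and the label-to-complexity mapping are hypothetical placeholders; check the EllieSQL repository for the actual pipelines and mapping), the router output could drive a dispatch like this:

```python
# Hypothetical label-to-complexity mapping; verify against the EllieSQL repository.
complexity_map = {"LABEL_0": "simple", "LABEL_1": "medium", "LABEL_2": "complex"}

# Placeholder generators standing in for cheap, intermediate, and advanced
# Text-to-SQL pipelines; each would take the question (and schema) and return SQL.
def basic_pipeline(question):
    return "SELECT ..."  # e.g., a single call to a small, cheap model

def intermediate_pipeline(question):
    return "SELECT ..."  # e.g., a mid-sized model with schema linking

def advanced_pipeline(question):
    return "SELECT ..."  # e.g., a large model with multi-step refinement

routing_table = {
    "simple": basic_pipeline,        # cheap method for easy queries
    "medium": intermediate_pipeline,
    "complex": advanced_pipeline,    # reserve expensive methods for hard cases
}

complexity = complexity_map[predicted_label]
sql = routing_table[complexity](text_query)
print(f"Routed to the {complexity} pipeline.")
```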

## Citation

If you find our work useful or inspiring, please kindly cite:

```bibtex
@misc{zhu2025elliesql,
      title={EllieSQL: Cost-Efficient Text-to-SQL with Complexity-Aware Routing},
      author={Yizhang Zhu and Runzhi Jiang and Boyan Li and Nan Tang and Yuyu Luo},
      year={2025},
      eprint={2503.22402},
      archivePrefix={arXiv},
      primaryClass={cs.DB},
      url={https://arxiv.org/abs/2503.22402},
}
```