dolphinium
committed on
Commit
·
fb3b328
1
Parent(s):
4ee8b9e
dimension and measure rules updated
llm_prompts.py +30 -0
llm_prompts.py
CHANGED
@@ -51,7 +51,10 @@ You are an expert data analyst and Solr query engineer. Your task is to convert
 2. **Field Usage**: You MUST use the fields described in the 'Field Definitions'. Pay close attention to the definitions to select the correct field, especially the `_s` fields for searching. Do not use fields ending with `_s` in `group.field` or facet `field` unless necessary for the analysis.
 3. **Dimension vs. Measure**:
     * `analysis_dimension`: The primary categorical field the user wants to group by (e.g., `company_name`, `route_branch`). This is the `group by` field.
     * `analysis_measure`: The metric to aggregate (e.g., `sum(total_deal_value_in_million)`) or the method of counting (`count`).
     * `sort_field_for_examples`: The raw field used to find the "best" example. If `analysis_measure` is `sum(field)`, this should be `field`. If `analysis_measure` is `count`, this should be a relevant field like `date`.
 4. **Crucial Sorting Rules**:
     * For `group.sort`: If `analysis_measure` involves a function on a field (e.g., `sum(total_deal_value_in_million)`), you MUST use the full function: `group.sort: 'sum(total_deal_value_in_million) desc'`.
@@ -128,6 +131,33 @@ You are an expert data analyst and Solr query engineer. Your task is to convert
 }}
 }}
 ```
 ---
 ### YOUR TASK
 2. **Field Usage**: You MUST use the fields described in the 'Field Definitions'. Pay close attention to the definitions to select the correct field, especially the `_s` fields for searching. Do not use fields ending with `_s` in `group.field` or facet `field` unless necessary for the analysis.
 3. **Dimension vs. Measure**:
     * `analysis_dimension`: The primary categorical field the user wants to group by (e.g., `company_name`, `route_branch`). This is the `group by` field.
+      Use the sample list to understand the main categories in the data. If the user does not name a category to group by, infer one from the question: comparing cancer vs. infection points to the therapeutic categories, and oral vs. injection points to the drug delivery branches. If the user only asks for recent news, work out which field is most relevant, such as deal types.
+      DO NOT choose a field as the dimension if you already use it in `query_filter`.
     * `analysis_measure`: The metric to aggregate (e.g., `sum(total_deal_value_in_million)`) or the method of counting (`count`).
+      Look for what best differentiates the most relevant entries. If the user specifies a metric, concentrate on that; otherwise pick the most prominent or important-looking field, such as deal_value. Make sure it is filled in for most entries.
     * `sort_field_for_examples`: The raw field used to find the "best" example. If `analysis_measure` is `sum(field)`, this should be `field`. If `analysis_measure` is `count`, this should be a relevant field like `date`.
 4. **Crucial Sorting Rules**:
     * For `group.sort`: If `analysis_measure` involves a function on a field (e.g., `sum(total_deal_value_in_million)`), you MUST use the full function: `group.sort: 'sum(total_deal_value_in_million) desc'`.
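The dimension, measure, and sorting rules in this hunk can also be checked mechanically on whatever JSON the model returns. The following is a minimal sketch and not code from this commit: `check_plan` and `group_sort` are hypothetical helpers, and the filter parsing is a heuristic that assumes simple `field:value` clauses.

```python
import re

# Matches a function-style measure such as sum(total_deal_value_in_million).
FUNC_MEASURE = re.compile(r"^(\w+)\((\w+)\)$")


def check_plan(plan: dict) -> list[str]:
    """Return human-readable rule violations for a plan (hypothetical helper)."""
    problems = []
    dimension = plan["analysis_dimension"]
    measure = plan["analysis_measure"]
    sort_field = plan["sort_field_for_examples"]

    # Heuristic: treat every word directly followed by ':' as a filtered field.
    filter_fields = set(re.findall(r"(\w+):", plan.get("query_filter", "")))
    if dimension in filter_fields:
        problems.append(f"dimension '{dimension}' is already used in query_filter")

    match = FUNC_MEASURE.match(measure)
    if match:
        # sum(field) -> examples should be sorted by the raw field.
        if sort_field != match.group(2):
            problems.append(f"sort_field_for_examples should be '{match.group(2)}'")
    elif measure != "count":
        problems.append(f"unrecognised analysis_measure '{measure}'")
    return problems


def group_sort(measure: str):
    """Build group.sort for function measures; other cases are not covered here."""
    return f"{measure} desc" if FUNC_MEASURE.match(measure) else None
```

A check like this could run before the Solr request is sent, so that a plan which groups by a field it also filters on is rejected or retried instead of producing a single-bucket result.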
 }}
 }}
 ```
+
+**User Query 3:** "give me recent news on USA drug approvals"
+**Correct JSON Output 3:**
+```json
+{{
+    "analysis_dimension": "company_name",
+    "analysis_measure": "count",
+    "sort_field_for_examples": "date",
+    "query_filter": "territory_hq_s:\\"united states of america\\" AND news_type:\\"product approvals\\" AND date_year:{datetime.datetime.now().year}",
+    "quantitative_request": {{
+        "json.facet": {{
+            "news_by_company_name": {{
+                "type": "terms",
+                "field": "company_name",
+                "limit": 10,
+                "sort": "count desc"
+            }}
+        }}
+    }},
+    "qualitative_request": {{
+        "group": true,
+        "group.field": "company_name",
+        "group.limit": 1,
+        "sort": "date desc"
+    }}
+}}
+```
 ---
 ### YOUR TASK
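The `query_filter` in the third example nests double-quoted Solr phrases inside a JSON string, while the surrounding template doubles its braces so they survive f-string formatting. Below is a minimal sketch of one way to keep such an example parseable, assuming the prompt is built as a Python f-string. The variable names are illustrative and not taken from `llm_prompts.py`.

```python
import datetime
import json

year = datetime.datetime.now().year

# Solr phrase queries use double quotes; json.dumps escapes them for us.
query_filter = (
    'territory_hq_s:"united states of america" '
    'AND news_type:"product approvals" '
    f"AND date_year:{year}"
)

# Doubled braces ({{ and }}) come out of the f-string as literal braces.
example = f"""{{
    "analysis_dimension": "company_name",
    "analysis_measure": "count",
    "sort_field_for_examples": "date",
    "query_filter": {json.dumps(query_filter)}
}}"""

# Round-trip check: the rendered example is valid JSON.
parsed = json.loads(example)
```

Rendering the example once and passing it through `json.loads` is a cheap way to catch unescaped quotes before the text is shown to the model as a "correct" output.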