Commit 5771b1d (verified) by Timothy-Vinzent · 1 Parent(s): 33a1d1f

Update app.py

Files changed (1)
  1. app.py +26 -11
app.py CHANGED
@@ -224,10 +224,11 @@ def build_interface():
  Constructs the Gradio interface with a submission button and single-submission mechanism.
  """
  with gr.Blocks() as demo:
- gr.Markdown("# GPT-4o Mini System Prompt Submission")
+ gr.Markdown("# System Prompt Applicant Task")
+ gr.Markdown("## Document and Clause Level Classification")
  # General description
- gr.Markdown("""Classification Task: Document and Clause Level Identification
- Participants must create a system prompt for a language model that classifies user queries about legal documents into two specific categories:
+ gr.Markdown("""
+ Applicants must create a system prompt for a language model that classifies user queries about legal documents into two specific categories:
  1. **Document Level**: Determines whether the query refers to a single document or multiple documents.
  2. **Clause Level**: Identifies whether the query is focused on:
  - A single clause,
@@ -243,11 +244,11 @@ def build_interface():
  }
  ```

- The goal is to ensure that the model's output is concise, structured, and accurate. This task is designed to evaluate the robustness of the system prompt in handling classification tasks with short, precise outputs.
+ The goal is to ensure that the model's output adheres to the prescribed JSON structure and accurately classifies 7 test queries into the two respective categories. This task is designed to evaluate your prompting, by adhering to the required structure without any constrained decoding or "JSON mode" while providing correct responses at the same time.
  """)

  # Example Inputs and Outputs in an Accordion
- with gr.Accordion("Example Inputs and Expected Outputs", open=False):
+ with gr.Accordion("**Example Inputs and Expected Outputs**", open=False):
  gr.Markdown("""
  1. **User Message Example 1:**
  - *"Please provide the contract for the lease agreement."*
@@ -300,12 +301,14 @@ def build_interface():
  """)

  # Challenge instructions in another Accordion
- with gr.Accordion("Challenge Instructions", open=False):
+ with gr.Accordion("**Challenge Instructions**", open=False):
  gr.Markdown("""
- - Design a system prompt that ensures the AI generates outputs like those above when given similar user messages.
+ - Design a system prompt that ensures gpt4o-mini generates outputs like those above when given similar user messages.

  The system prompt should:
- 1. Specify formatting requirements (e.g., *"Output must be a valid JSON object"*). Note that we are not using constrained decoding or any sort of JSON mode; if not correctly prompted, the LLM will output plain text.
+ 1. Specify formatting requirements (e.g., *"Output must be a valid JSON object"*).
+ - Note that we are not using constrained decoding or any sort of JSON mode; if not correctly prompted, the LLM will output plain text.
+ - All LLM responses will be passed to json.loads(response); responses that fail JSON parsing are deemed incorrect (beware of triple backticks etc.)
  2. Emphasize strict adherence to classification definitions:
  - *Single Document:* Refers to one document.
  - *Multiple Documents:* Refers to more than one document.
@@ -313,12 +316,24 @@ def build_interface():
  - *Multiple Clauses:* Refers to more than one specific clause.
  - *General Information:* Refers to general content not tied to specific clauses.

- You can only submit once, so test your system prompt thoroughly before submission!
+ **You can only submit once, so test your system prompt thoroughly before submission!**
+
+ You will be scored according to the following criteria with respect to the outputs of 7 test user messages
+ - Response is valid JSON
+ - The response contains the keys: "document_level" and "clause_level"
+ - The values for each of the keys are correct
+
+ Good Luck!
+
  """)

  gr.Markdown(
- "Please enter your details and submit your system prompt below. "
- "You can only submit once, I suggest trying to test and build out the system prompt using the same LM being used here elsewhere before submitting."
+ """Please enter the same name and email as listed in your CV and submit your system prompt below.
+
+ You can only submit once, so try to test and build out your system prompt using gpt4o-mini with temp=1 before submitting your solution.
+
+ We look forward to your submission!
+ """
  )

  email_input = gr.Textbox(label="Email", placeholder="[email protected]")
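
The scoring criteria added to the challenge instructions (valid JSON, both keys present, correct values) can be sketched as a small check. This is a minimal illustration, not the scoring code in `app.py`, which this diff does not show. The helper name `score_response` and the label strings `"single"` and `"general"` are hypothetical placeholders; only `json.loads` and the key names `document_level` and `clause_level` come from the instructions.

```python
import json

# Key names taken from the challenge instructions.
REQUIRED_KEYS = ("document_level", "clause_level")


def score_response(response: str, expected: dict) -> bool:
    """Hypothetical scorer: True only if all three stated criteria hold."""
    # Criterion 1: the raw response must parse as JSON.
    try:
        parsed = json.loads(response)
    except json.JSONDecodeError:
        return False
    # Criterion 2: the parsed value must be an object with both keys.
    if not isinstance(parsed, dict):
        return False
    if any(key not in parsed for key in REQUIRED_KEYS):
        return False
    # Criterion 3: the value for each key must match the expected label.
    return all(parsed[key] == expected[key] for key in REQUIRED_KEYS)


# Placeholder labels for illustration only; the real label strings are
# defined in the part of app.py that this diff does not show.
expected = {"document_level": "single", "clause_level": "general"}
plain = '{"document_level": "single", "clause_level": "general"}'
fence = "`" * 3  # a Markdown code fence, built indirectly
fenced = f"{fence}json\n{plain}\n{fence}"

print(score_response(plain, expected))   # bare JSON object
print(score_response(fenced, expected))  # same object wrapped in a fence
```

A response wrapped in a Markdown code fence fails at the `json.loads` step. That is why the instructions warn about triple backticks: the system prompt has to make the model emit the bare JSON object and nothing else.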