Dhahlan2000 committed
Commit 02386cb · 1 Parent(s): 536ba94

Reduce max_new_tokens in model generation from 2048 to 512 in app.py to bound response length and improve the performance of the conversation prediction function.

Files changed (1): app.py (+1, -1)
app.py CHANGED
@@ -141,7 +141,7 @@ def conversation_predict(input_text: str, cv_sections: Dict[str, str]):
     # Generate a response with the model
     outputs = model.generate(
         input_ids,
-        max_new_tokens=2048,
+        max_new_tokens=512,
         temperature=0.7,
         top_p=0.95,
         do_sample=True
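
For context, a minimal, self-contained sketch of the call this hunk changes, assuming app.py drives a Hugging Face transformers causal LM; the model name and prompt below are placeholders, not the repo's actual values.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; app.py loads its own model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_ids = tokenizer("Summarize this CV section:", return_tensors="pt").input_ids

# max_new_tokens caps only the generated continuation, not the prompt, so
# lowering it from 2048 to 512 bounds both response length and generation time.
outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.95,
    do_sample=True,
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))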