Dhahlan2000 committed
Commit 02386cb · 1 Parent(s): 536ba94

Reduce max_new_tokens in model generation from 2048 to 512 in app.py to bound response length and improve the performance of the conversation prediction function.

Files changed (1): app.py (+1, -1)
app.py CHANGED
@@ -141,7 +141,7 @@ def conversation_predict(input_text: str, cv_sections: Dict[str, str]):
     # Generate a response with the model
     outputs = model.generate(
         input_ids,
-        max_new_tokens=2048,
+        max_new_tokens=512,
         temperature=0.7,
         top_p=0.95,
         do_sample=True
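
For context, a minimal, self-contained sketch of the call this hunk changes, assuming app.py drives a Hugging Face transformers causal LM; the model name and prompt below are placeholders, not the repo's actual values.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; app.py loads its own model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_ids = tokenizer("Summarize this CV section:", return_tensors="pt").input_ids

# max_new_tokens caps only the generated continuation, not the prompt, so
# lowering it from 2048 to 512 bounds both response length and generation time.
outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.95,
    do_sample=True,
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))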