Gustavo de Rosa committed on
Commit 0f18025 · 1 Parent(s): d8d3b44

chore(root): Adds top_k information even if 50 is already the default.

Files changed (2):
  1. README.md +2 -1
  2. generation_config.json +1 -0
README.md CHANGED
@@ -56,7 +56,7 @@ library_name: transformers
 ## Usage
 
 > [!IMPORTANT]
-> To fully take advantage of the model's capabilities, inference must use `temperature=0.8`, `top_p=0.95`, and `do_sample=True`. For more complex queries, set `max_new_tokens=32768` to allow for longer chain-of-thought (CoT).
+> To fully take advantage of the model's capabilities, inference must use `temperature=0.8`, `top_k=50`, `top_p=0.95`, and `do_sample=True`. For more complex queries, set `max_new_tokens=32768` to allow for longer chain-of-thought (CoT).
 
 *Phi-4-reasoning-plus has shown strong performance on reasoning-intensive tasks. In our experiments, we extended its maximum number of tokens to 64k, and it handled longer sequences with promising results, maintaining coherence and logical consistency over extended inputs. This makes it a compelling option to explore for tasks that require deep, multi-step reasoning or extensive context.*
 
@@ -90,6 +90,7 @@ outputs = model.generate(
     inputs.to(model.device),
     max_new_tokens=4096,
     temperature=0.8,
+    top_k=50,
     top_p=0.95,
     do_sample=True,
 )
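The decoding setup recommended above combines top-k truncation with nucleus (top-p) filtering. As a rough illustration of what those two parameters do to a next-token distribution, here is a toy sketch in plain Python — not the transformers implementation, just the filtering idea:

```python
import math

def top_k_top_p_filter(logits, top_k=50, top_p=0.95):
    """Toy sketch: apply top-k, then nucleus (top-p) filtering to a logit list.

    Returns a list of (token_index, probability) pairs that survive filtering,
    in descending probability order.
    """
    # Keep only the top_k highest-scoring tokens.
    indexed = sorted(enumerate(logits), key=lambda p: p[1], reverse=True)[:top_k]
    # Softmax over the survivors (max-subtracted for numerical stability).
    m = max(v for _, v in indexed)
    exps = [(i, math.exp(v - m)) for i, v in indexed]
    z = sum(e for _, e in exps)
    probs = [(i, e / z) for i, e in exps]
    # Nucleus step: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cum = [], 0.0
    for i, p in probs:
        kept.append((i, p))
        cum += p
        if cum >= top_p:
            break
    return kept
```

With `do_sample=True`, the next token would then be drawn from the kept set (after renormalizing); `temperature` would scale the logits before this filtering.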
generation_config.json CHANGED
@@ -5,6 +5,7 @@
   "eos_token_id": 100265,
   "pad_token_id": 100349,
   "temperature": 0.8,
+  "top_k": 50,
   "top_p": 0.95,
   "transformers_version": "4.51.1"
 }
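As the commit message notes, 50 is already the library default for `top_k`, so this change is about making the recommended sampling setup explicit in the shipped config rather than changing behavior. A quick sanity check of the updated file shape (the JSON below is reconstructed from the diff above):

```python
import json

# Reconstructed generation_config.json as of this commit.
config_text = """
{
  "eos_token_id": 100265,
  "pad_token_id": 100349,
  "temperature": 0.8,
  "top_k": 50,
  "top_p": 0.95,
  "transformers_version": "4.51.1"
}
"""
cfg = json.loads(config_text)
# top_k is now pinned explicitly, matching the transformers default of 50.
print(cfg["top_k"])  # 50
```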