Jan Reges

janreges3

AI & ML interests

None yet

Recent Activity

new activity about 17 hours ago

nvidia/Qwen3.6-27B-NVFP4: NVFP4 vs FP8 on an RTX PRO 6000 Blackwell (vLLM 0.24): sadly, FP4 is still not faster than FP8 - Qwen3.6‑27B

new activity 2 months ago

rico03/Qwen3.6-27B-Claude-Opus-Reasoning-Distilled: Chat-template fix: occasional </think> leak + body duplication on long-context tasks (patched template attached)

new activity 2 months ago

rico03/Qwen3.6-27B-Claude-Opus-Reasoning-Distilled: Thanks and request for FP8 version

Organizations

None yet

New activity in nvidia/Qwen3.6-27B-NVFP4 about 17 hours ago

NVFP4 vs FP8 on an RTX PRO 6000 Blackwell (vLLM 0.24): sadly, FP4 is still not faster than FP8 - Qwen3.6‑27B

#8 opened about 17 hours ago by

New activity in rico03/Qwen3.6-27B-Claude-Opus-Reasoning-Distilled 2 months ago

Chat-template fix: occasional </think> leak + body duplication on long-context tasks (patched template attached)

#3 opened 2 months ago by

Thanks and request for FP8 version

#2 opened 2 months ago by

New activity in Qwen/Qwen3.5-35B-A3B 4 months ago

vLLM - Looping prevention

#39 opened 4 months ago by

New activity in Qwen/Qwen3-Embedding-4B 8 months ago

How to set MRL (variable dimensions) in vLLM

#21 opened 8 months ago by

New activity in BCCard/Qwen3-235B-A22B-Thinking-2507-NVFP4A16 10 months ago

Request for NVFP4A16 model Qwen3-30B-A3B 2507 (Thinking & Instruct)

#1 opened 10 months ago by