DeepSeek-R1T-Chimera

Model merge of DeepSeek-R1 and DeepSeek-V3 (0324)

An open-weights model combining the intelligence of R1 with the token efficiency of V3.

Announcement on X | LinkedIn post | Try it on OpenRouter
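As a minimal sketch of querying the model through OpenRouter's OpenAI-compatible chat API (the endpoint path and the model slug `tngtech/deepseek-r1t-chimera` are assumptions; verify them on openrouter.ai before use):

```python
import json
import urllib.request

# Assumed OpenRouter endpoint and model slug -- check openrouter.ai
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL_ID = "tngtech/deepseek-r1t-chimera"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat-completion request for OpenRouter."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_request("Hello!", api_key="sk-...")  # placeholder key
# urllib.request.urlopen(req)  # uncomment to actually send the request
```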

Model Details

  • Architecture: DeepSeek-MoE Transformer-based language model
  • Combination Method: Merged model weights from DeepSeek-R1 and DeepSeek-V3 (0324)
  • Release Date: 2025-04-27

Use, Out-of-scope Use, Limitations, Risks, and Recommendations

For R1T Chimera, we ask you to follow the careful guidelines that Microsoft created for their DeepSeek-based model "MAI-DS-R1".

These guidelines are available on Hugging Face.

Contact

Citation

@misc{tng_technology_consulting_gmbh_2025,
    author    = {TNG Technology Consulting GmbH},
    title     = {DeepSeek-R1T-Chimera},
    year      = {2025},
    month     = {April},
    url       = {https://huggingface.co/tngtech/DeepSeek-R1T-Chimera},
    doi       = {10.57967/hf/5330},
    publisher = {Hugging Face}
}
Model size: 685B parameters (Safetensors). Tensor types: F32, BF16, F8_E4M3.
