# ChessLM Qwen3 - Neuron Traced (AWS Format Structure)
This is a Neuron-traced version of karanps/ChessLM_Qwen3 optimized for AWS Trainium2 (trn2) instances using vLLM.
This model follows the AWS Neuron repository structure with separate directories for compiled artifacts.
This model is intended to be used from within the Neuron Workshops repository (https://github.com/aws-neuron/neuron-workshops).
## Model Details
- Base Model: Qwen3-8B fine-tuned for chess
- Compilation: optimum-neuron[vllm]==0.3.0
- Compiler Version: neuronxcc 2.21.33363.0
- Target Hardware: AWS Trainium2 (trn2)
- Precision: BF16
- Tensor Parallelism: 2 cores
- Batch Size: 4 (continuous batching enabled)
- Max Sequence Length: 2048
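Because Neuron compiles graphs for fixed shapes, a request only runs if it fits the limits listed above (batch size 4, max sequence length 2048). A minimal sketch of that check (a hypothetical helper for illustration, not part of these artifacts):

```python
# Compiled-graph limits from the model details above.
MAX_BATCH_SIZE = 4
MAX_SEQ_LEN = 2048

def fits_compiled_graph(num_requests: int, prompt_tokens: int, max_new_tokens: int) -> bool:
    """Return True if a batch fits the static shapes this model was compiled for."""
    if num_requests > MAX_BATCH_SIZE:
        return False  # continuous batching still caps concurrent sequences at 4
    # Prompt plus generated tokens must stay within the compiled sequence length.
    return prompt_tokens + max_new_tokens <= MAX_SEQ_LEN

print(fits_compiled_graph(2, 1500, 500))  # True: 2000 <= 2048
print(fits_compiled_graph(2, 1800, 500))  # False: 2300 > 2048
```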
## Compilation instructions

```shell
optimum-cli export neuron \
  --model karanps/ChessLM_Qwen3 \
  --task text-generation \
  --sequence_length 2048 \
  --batch_size 4 \
  /home/ubuntu/environment/ml/qwen-chess/karanps/ChessLM_Qwen3_compiled
```
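The same export can also be done programmatically with optimum-neuron's `NeuronModelForCausalLM`. A hedged sketch (argument names follow the optimum-neuron export API; the output directory is an example, and running it requires a trn2 host with `optimum-neuron[vllm]` installed):

```python
def compile_chesslm(output_dir: str = "ChessLM_Qwen3_compiled") -> None:
    """Export karanps/ChessLM_Qwen3 to Neuron artifacts (requires a trn2 host)."""
    # Imported lazily so this sketch loads even without Neuron packages installed.
    from optimum.neuron import NeuronModelForCausalLM

    model = NeuronModelForCausalLM.from_pretrained(
        "karanps/ChessLM_Qwen3",
        export=True,            # trigger neuronx compilation
        batch_size=4,           # matches the settings above
        sequence_length=2048,
        num_cores=2,            # tensor parallelism across 2 Neuron cores
        auto_cast_type="bf16",
    )
    model.save_pretrained(output_dir)
```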
## Key Files
- context_encoding_model/: Compiled NEFF files for processing initial prompt sequences (up to 2048 tokens)
- token_generation_model/: Compiled NEFF files for autoregressive token generation
- layout_opt/: Layout optimization artifacts from compilation
- model.pt: Main model file containing compiled graphs and embedded weights (17GB)
- neuron_config.json: Neuron compilation configuration
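The split between `context_encoding_model/` and `token_generation_model/` mirrors the two phases of LLM inference: one graph prefills the whole prompt at once, the other generates one token per step. A toy dispatcher illustrating the distinction (purely illustrative, not the Neuron runtime API):

```python
def pick_graph(num_new_input_tokens: int) -> str:
    """Choose which compiled graph a step would run on (illustrative only)."""
    # Prefill: many prompt tokens arrive at once -> context encoding graph.
    # Decode: exactly one new token per step -> token generation graph.
    if num_new_input_tokens == 1:
        return "token_generation_model"
    return "context_encoding_model"

print(pick_graph(512))  # context_encoding_model (prompt prefill)
print(pick_graph(1))    # token_generation_model (autoregressive decode)
```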
## Model Files
| File | Purpose |
|---|---|
| model.pt | Main model with embedded weights (17GB) |
| config.json | Base model configuration |
| neuron_config.json | Neuron compilation settings |
| tokenizer* | Tokenizer files for text processing |
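The 17GB size of model.pt is consistent with an 8B-parameter model stored in BF16: roughly 8 billion parameters at 2 bytes each is about 16 GB of raw weights, with the remainder being compiled-graph overhead. A quick back-of-the-envelope check (the parameter count is approximate):

```python
params = 8.19e9            # Qwen3-8B parameter count (approximate)
bytes_per_param = 2        # BF16 stores each value in 2 bytes
weight_gb = params * bytes_per_param / 1e9
print(f"~{weight_gb:.1f} GB of raw BF16 weights")  # remainder of the 17GB is graph overhead
```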
## License
This model inherits the license from the base model karanps/ChessLM_Qwen3.