ChessLM Qwen3 - Neuron Traced (AWS Format Structure)

This is a Neuron-traced version of karanps/ChessLM_Qwen3, optimized for AWS Trainium2 (trn2) instances and served with vLLM.

This model follows the AWS Neuron repository structure with separate directories for compiled artifacts.

This model is intended for use within the Neuron Workshops (https://github.com/aws-neuron/neuron-workshops).

Model Details

  • Base Model: Qwen3-8B fine-tuned for chess
  • Compilation: optimum-neuron[vllm]==0.3.0
  • Compiler Version: neuronxcc 2.21.33363.0
  • Target Hardware: AWS Trainium2 (trn2)
  • Precision: BF16
  • Tensor Parallelism: 2 cores
  • Batch Size: 4 (continuous batching enabled)
  • Max Sequence Length: 2048
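The settings above map directly onto vLLM's engine arguments when serving the traced model. A minimal launch sketch, assuming the compiled artifacts live in a local directory `./ChessLM_Qwen3_compiled` (a hypothetical path) and that your vLLM installation includes the Neuron backend:

```shell
# Sketch: serve the traced model with vLLM's OpenAI-compatible server on a trn2 instance.
# Flag values mirror the compilation settings above (TP=2, batch size 4, seq len 2048);
# adjust the model path to wherever you downloaded or compiled the artifacts.
vllm serve ./ChessLM_Qwen3_compiled \
  --device neuron \
  --tensor-parallel-size 2 \
  --max-num-seqs 4 \
  --max-model-len 2048
```

Note that `--max-num-seqs` and `--max-model-len` must not exceed the batch size and sequence length the model was compiled with, since Neuron graphs are traced for fixed shapes.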

Compilation Instructions

optimum-cli export neuron \
  --model karanps/ChessLM_Qwen3 \
  --task text-generation \
  --sequence_length 2048 \
  --batch_size 4 \
  /home/ubuntu/environment/ml/qwen-chess/karanps/ChessLM_Qwen3_compiled

Key Files

  • context_encoding_model/: Compiled NEFF files for processing initial prompt sequences (up to 2048 tokens)
  • token_generation_model/: Compiled NEFF files for autoregressive token generation
  • layout_opt/: Layout optimization artifacts from compilation
  • model.pt: Main model file containing compiled graphs and embedded weights (17GB)
  • neuron_config.json: Neuron compilation configuration

Model Files

File                 Purpose
model.pt             Main model with embedded weights (17GB)
config.json          Base model configuration
neuron_config.json   Neuron compilation settings
tokenizer*           Tokenizer files for text processing
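Once the model is served (for example via vLLM's OpenAI-compatible server), it can be queried over plain HTTP. A minimal client sketch using only the standard library; the server URL, model path, and chess-move prompt format are assumptions, and `query` requires a running server while `build_request` is runnable as-is:

```python
import json
from urllib import request

def build_request(prompt: str, max_tokens: int = 64) -> dict:
    """Build a payload for vLLM's OpenAI-compatible /v1/completions endpoint."""
    return {
        "model": "./ChessLM_Qwen3_compiled",  # assumption: path passed to `vllm serve`
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.0,  # deterministic decoding for move selection
    }

def query(prompt: str, url: str = "http://localhost:8000/v1/completions") -> str:
    """POST the prompt to the server and return the generated completion text."""
    payload = json.dumps(build_request(prompt)).encode()
    req = request.Request(url, data=payload,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]

# Example (requires a running server):
# print(query("1. e4 e5 2."))
```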

License

This model inherits the license from the base model karanps/ChessLM_Qwen3.

Model repository: aws-neuron/ChessLM_Qwen3_Trainium_2_AWS_Format