ChessLM Qwen3 - Neuron Traced (AWS Format Structure)

This is a Neuron-traced version of karanps/ChessLM_Qwen3, optimized for AWS Trainium2 (trn2) instances and served with vLLM.

This model follows the AWS Neuron repository structure with separate directories for compiled artifacts.

This model is intended for use within the Neuron Workshops (https://github.com/aws-neuron/neuron-workshops).

Model Details

  • Base Model: Qwen3-8B fine-tuned for chess
  • Compilation: optimum-neuron[vllm]==0.3.0
  • Compiler Version: neuronxcc 2.21.33363.0
  • Target Hardware: AWS Trainium2 (trn2)
  • Precision: BF16
  • Tensor Parallelism: 2 cores
  • Batch Size: 4 (continuous batching enabled)
  • Max Sequence Length: 2048
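The settings above map directly onto vLLM's engine arguments when serving the traced model. A minimal launch sketch, assuming the compiled artifacts live in a local directory `./ChessLM_Qwen3_compiled` (a hypothetical path) and that your vLLM installation includes the Neuron backend:

```shell
# Sketch: serve the traced model with vLLM's OpenAI-compatible server on a trn2 instance.
# Flag values mirror the compilation settings above (TP=2, batch size 4, seq len 2048);
# adjust the model path to wherever you downloaded or compiled the artifacts.
vllm serve ./ChessLM_Qwen3_compiled \
  --device neuron \
  --tensor-parallel-size 2 \
  --max-num-seqs 4 \
  --max-model-len 2048
```

Note that `--max-num-seqs` and `--max-model-len` must not exceed the batch size and sequence length the model was compiled with, since Neuron graphs are traced for fixed shapes.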

Compilation Instructions

optimum-cli export neuron \
  --model karanps/ChessLM_Qwen3 \
  --task text-generation \
  --sequence_length 2048 \
  --batch_size 4 \
  /home/ubuntu/environment/ml/qwen-chess/karanps/ChessLM_Qwen3_compiled

Key Files

  • context_encoding_model/: Compiled NEFF files for processing initial prompt sequences (up to 2048 tokens)
  • token_generation_model/: Compiled NEFF files for autoregressive token generation
  • layout_opt/: Layout optimization artifacts from compilation
  • model.pt: Main model file containing compiled graphs and embedded weights (17GB)
  • neuron_config.json: Neuron compilation configuration

Model Files

File                 Purpose
model.pt             Main model with embedded weights (17GB)
config.json          Base model configuration
neuron_config.json   Neuron compilation settings
tokenizer*           Tokenizer files for text processing
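Once the model is served (for example via vLLM's OpenAI-compatible server), it can be queried over plain HTTP. A minimal client sketch using only the standard library; the server URL, model path, and chess-move prompt format are assumptions, and `query` requires a running server while `build_request` is runnable as-is:

```python
import json
from urllib import request

def build_request(prompt: str, max_tokens: int = 64) -> dict:
    """Build a payload for vLLM's OpenAI-compatible /v1/completions endpoint."""
    return {
        "model": "./ChessLM_Qwen3_compiled",  # assumption: path passed to `vllm serve`
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.0,  # deterministic decoding for move selection
    }

def query(prompt: str, url: str = "http://localhost:8000/v1/completions") -> str:
    """POST the prompt to the server and return the generated completion text."""
    payload = json.dumps(build_request(prompt)).encode()
    req = request.Request(url, data=payload,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]

# Example (requires a running server):
# print(query("1. e4 e5 2."))
```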

License

This model inherits the license from the base model karanps/ChessLM_Qwen3.

Model repository: aws-neuron/ChessLM_Qwen3_Trainium_2_AWS_Format