---
title: GPT Transformer Text Generator
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.12.0
app_file: app.py
pinned: false
---
# GPT Transformer Model
This repository contains a GPT-style transformer model built with PyTorch for natural language generation. The architecture follows GPT-2 and was trained on a custom dataset for text generation.
## Model Overview
The model is a multi-layer, transformer-based neural network consisting of the following components (sketched in code after this list):
- **Causal Self-Attention:** Masked multi-head self-attention, so each position can attend only to earlier positions in the sequence.
- **MLP (Feedforward Layer):** A position-wise feedforward network in each block that helps the model learn complex relationships.
- **Layer Normalization:** Applied before each attention and feedforward layer (pre-norm) to stabilize training.
- **Embedding Layers:** Token embeddings for the vocabulary and learned positional embeddings for the sequence.
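For concreteness, here is a minimal sketch of one such block in PyTorch. It assumes the standard GPT-2 pre-norm layout; the class names (`Block`, `CausalSelfAttention`, `MLP`) are illustrative and may differ from those in `app.py`:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, n_embd, n_head):
        super().__init__()
        self.n_head = n_head
        self.c_attn = nn.Linear(n_embd, 3 * n_embd)  # fused q, k, v projection
        self.c_proj = nn.Linear(n_embd, n_embd)      # output projection

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.c_attn(x).split(C, dim=2)
        # reshape to (B, n_head, T, head_dim) for multi-head attention
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        # is_causal=True masks out future positions
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.c_proj(y)

class MLP(nn.Module):
    def __init__(self, n_embd):
        super().__init__()
        self.c_fc = nn.Linear(n_embd, 4 * n_embd)    # expand by 4x
        self.c_proj = nn.Linear(4 * n_embd, n_embd)  # project back

    def forward(self, x):
        return self.c_proj(F.gelu(self.c_fc(x)))

class Block(nn.Module):
    """One pre-norm transformer block: LayerNorm -> attention, LayerNorm -> MLP."""
    def __init__(self, n_embd, n_head):
        super().__init__()
        self.ln_1 = nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention(n_embd, n_head)
        self.ln_2 = nn.LayerNorm(n_embd)
        self.mlp = MLP(n_embd)

    def forward(self, x):
        x = x + self.attn(self.ln_1(x))  # residual connection around attention
        x = x + self.mlp(self.ln_2(x))   # residual connection around feedforward
        return x
```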
### Architecture
- **Embedding Dimension (`n_embd`)**: 768
- **Number of Attention Heads (`n_head`)**: 12
- **Number of Layers (`n_layer`)**: 12
- **Vocabulary Size (`vocab_size`)**: 50,257
- **Max Sequence Length (`block_size`)**: 1024
The model is trained for text generation and can be fine-tuned on custom data.
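For reference, these hyperparameters map onto a configuration object along the following lines. This is a sketch: `GPTConfig` is a hypothetical name, and only the field values above come from the actual model:

```python
from dataclasses import dataclass
import torch.nn as nn

@dataclass
class GPTConfig:
    vocab_size: int = 50257  # GPT-2 BPE vocabulary
    block_size: int = 1024   # max sequence length
    n_layer: int = 12        # number of transformer blocks
    n_head: int = 12         # attention heads per block
    n_embd: int = 768        # embedding dimension

config = GPTConfig()
# token and positional embeddings sized from the config
wte = nn.Embedding(config.vocab_size, config.n_embd)  # token embeddings
wpe = nn.Embedding(config.block_size, config.n_embd)  # positional embeddings
```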
## Requirements
To run the model and perform inference, you will need the following dependencies:
- Python 3.7+
- PyTorch
- Gradio
- Transformers
- tiktoken (GPT-2 BPE tokenizer)
You can install the required libraries using:
```bash
pip install torch gradio transformers tiktoken
```
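Once the dependencies are installed, inference looks roughly like the sketch below. The `tiktoken` calls are real; `model` is a placeholder for the trained transformer that `app.py` loads, assumed to return logits of shape `(batch, seq_len, vocab_size)`:

```python
import torch
import tiktoken

enc = tiktoken.get_encoding("gpt2")  # GPT-2 BPE tokenizer (vocab_size = 50,257)

prompt = "Once upon a time"
idx = torch.tensor([enc.encode(prompt)])  # (1, T) batch of token ids

# `model` is hypothetical here: the trained transformer loaded in app.py
model.eval()
with torch.no_grad():
    for _ in range(50):                       # sample 50 new tokens
        logits = model(idx[:, -1024:])        # crop context to block_size
        probs = torch.softmax(logits[:, -1, :], dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)

print(enc.decode(idx[0].tolist()))
```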