---
title: "GPT Transformer Text Generator"
emoji: "🤖"
colorFrom: "blue"
colorTo: "green"
sdk: "gradio"
sdk_version: "3.0.0"
app_file: "app.py"
pinned: false
---
# GPT Transformer Model
This repository contains a GPT-like transformer model built with PyTorch for natural language generation. The model follows the architecture introduced in GPT-2 and was trained on a custom dataset for text generation.
## Model Overview
The model is a multi-layer, transformer-based neural network consisting of the following components (sketched in code after the list):
- **Causal Self-Attention:** The core transformer operation; each position attends over the input sequence under a causal mask, so a token can only see earlier tokens.
- **MLP (Feedforward Layer):** A position-wise feedforward network in each block that lets the model learn non-linear relationships.
- **Layer Normalization:** Applied before each attention and feedforward layer to stabilize training.
- **Embedding Layers:** Token embeddings for words and positional embeddings for the sequence.
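
The per-block structure described above (pre-norm attention followed by a pre-norm MLP, each with a residual connection) might be sketched in PyTorch as follows. This is an illustrative GPT-2-style block, not the repository's actual code; class and parameter names are assumptions:

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """One GPT-style block: LayerNorm -> causal attention -> residual,
    then LayerNorm -> MLP -> residual (pre-norm ordering)."""

    def __init__(self, n_embd: int = 768, n_head: int = 12):
        super().__init__()
        self.ln_1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln_2 = nn.LayerNorm(n_embd)
        # GPT-2 expands the hidden size 4x inside the MLP.
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.GELU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: True entries are disallowed (future positions).
        t = x.size(1)
        mask = torch.triu(torch.ones(t, t, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.ln_1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out
        x = x + self.mlp(self.ln_2(x))
        return x
```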
### Architecture
- **Embedding Dimension (`n_embd`)**: 768
- **Number of Attention Heads (`n_head`)**: 12
- **Number of Layers (`n_layer`)**: 12
- **Vocabulary Size (`vocab_size`)**: 50,257
- **Max Sequence Length (`block_size`)**: 1024
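
Taken together, these values match the GPT-2 (124M) configuration. Collecting them in a config object is a common convention; a minimal sketch (the `GPTConfig` name is an assumption, the field names follow the identifiers listed above):

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    n_embd: int = 768        # embedding dimension
    n_head: int = 12         # attention heads per block
    n_layer: int = 12        # number of transformer blocks
    vocab_size: int = 50257  # GPT-2 BPE vocabulary size
    block_size: int = 1024   # maximum sequence length
```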
The model is trained for text generation and can be fine-tuned with custom data.
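
Generation is autoregressive: the model repeatedly predicts a distribution over the next token, samples from it, and appends the result to the context. A minimal sampling loop, assuming a `model` that maps a `(B, T)` tensor of token IDs to `(B, T, vocab_size)` logits (the helper below is illustrative, not the repository's code):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, idx: torch.Tensor, max_new_tokens: int,
             block_size: int = 1024, temperature: float = 1.0) -> torch.Tensor:
    """Append max_new_tokens sampled tokens to the (B, T) index tensor idx."""
    for _ in range(max_new_tokens):
        # Crop the context to the model's maximum sequence length.
        idx_cond = idx[:, -block_size:]
        logits = model(idx_cond)                 # (B, T, vocab_size)
        logits = logits[:, -1, :] / temperature  # keep only the last step
        probs = F.softmax(logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)  # sample one token
        idx = torch.cat([idx, next_token], dim=1)
    return idx
```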
## Requirements
To run the model and perform inference, you will need the following dependencies:
- Python 3.7+
- PyTorch
- Gradio
- Transformers
- tiktoken (GPT-2 BPE tokenizer)
You can install the required libraries using:
```bash
pip install torch gradio transformers tiktoken
```
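
Since the Space declares `app_file: app.py` with the Gradio SDK, the entry point is presumably a small Gradio interface wrapping the model. A hypothetical sketch, where `generate_text` stands in for the repository's actual inference code:

```python
import gradio as gr

def generate_text(prompt: str) -> str:
    # Placeholder: the real app would tokenize the prompt, run the
    # transformer's sampling loop, and decode the generated tokens.
    return prompt + " ..."

demo = gr.Interface(
    fn=generate_text,
    inputs=gr.Textbox(label="Prompt"),
    outputs=gr.Textbox(label="Generated text"),
    title="GPT Transformer Text Generator",
)

if __name__ == "__main__":
    demo.launch()
```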