---
title: GPT Transformer Text Generator
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.12.0
app_file: app.py
pinned: false
---
# GPT Transformer Model
This repository contains a GPT-style transformer model built with PyTorch for natural language generation. The architecture follows GPT-2 and was trained on a custom dataset for text generation.
## Model Overview
The model is a multi-layer, transformer-based neural network consisting of the following components (sketched in code after this list):
- **Causal Self-Attention:** Masked multi-head self-attention, so each position can attend only to earlier positions in the sequence.
- **MLP (Feedforward Layer):** A position-wise feedforward network in each block that helps the model learn complex relationships.
- **Layer Normalization:** Applied before each attention and feedforward layer (pre-norm) to stabilize training.
- **Embedding Layers:** Token embeddings for the vocabulary and learned positional embeddings for the sequence.
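For concreteness, here is a minimal sketch of one such block in PyTorch. It assumes the standard GPT-2 pre-norm layout; the class names (`Block`, `CausalSelfAttention`, `MLP`) are illustrative and may differ from those in `app.py`:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, n_embd, n_head):
        super().__init__()
        self.n_head = n_head
        self.c_attn = nn.Linear(n_embd, 3 * n_embd)  # fused q, k, v projection
        self.c_proj = nn.Linear(n_embd, n_embd)      # output projection

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.c_attn(x).split(C, dim=2)
        # reshape to (B, n_head, T, head_dim) for multi-head attention
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        # is_causal=True masks out future positions
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.c_proj(y)

class MLP(nn.Module):
    def __init__(self, n_embd):
        super().__init__()
        self.c_fc = nn.Linear(n_embd, 4 * n_embd)    # expand by 4x
        self.c_proj = nn.Linear(4 * n_embd, n_embd)  # project back

    def forward(self, x):
        return self.c_proj(F.gelu(self.c_fc(x)))

class Block(nn.Module):
    """One pre-norm transformer block: LayerNorm -> attention, LayerNorm -> MLP."""
    def __init__(self, n_embd, n_head):
        super().__init__()
        self.ln_1 = nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention(n_embd, n_head)
        self.ln_2 = nn.LayerNorm(n_embd)
        self.mlp = MLP(n_embd)

    def forward(self, x):
        x = x + self.attn(self.ln_1(x))  # residual connection around attention
        x = x + self.mlp(self.ln_2(x))   # residual connection around feedforward
        return x
```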
### Architecture
- **Embedding Dimension (`n_embd`)**: 768
- **Number of Attention Heads (`n_head`)**: 12
- **Number of Layers (`n_layer`)**: 12
- **Vocabulary Size (`vocab_size`)**: 50,257
- **Max Sequence Length (`block_size`)**: 1024
The model is trained for text generation and can be fine-tuned on custom data.
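For reference, these hyperparameters map onto a configuration object along the following lines. This is a sketch: `GPTConfig` is a hypothetical name, and only the field values above come from the actual model:

```python
from dataclasses import dataclass
import torch.nn as nn

@dataclass
class GPTConfig:
    vocab_size: int = 50257  # GPT-2 BPE vocabulary
    block_size: int = 1024   # max sequence length
    n_layer: int = 12        # number of transformer blocks
    n_head: int = 12         # attention heads per block
    n_embd: int = 768        # embedding dimension

config = GPTConfig()
# token and positional embeddings sized from the config
wte = nn.Embedding(config.vocab_size, config.n_embd)  # token embeddings
wpe = nn.Embedding(config.block_size, config.n_embd)  # positional embeddings
```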
## Requirements
To run the model and perform inference, you will need the following dependencies:
- Python 3.7+
- PyTorch
- Gradio
- Transformers
- tiktoken (GPT-2 BPE tokenizer)
You can install the required libraries using:
```bash
pip install torch gradio transformers tiktoken
```
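Once the dependencies are installed, inference looks roughly like the sketch below. The `tiktoken` calls are real; `model` is a placeholder for the trained transformer that `app.py` loads, assumed to return logits of shape `(batch, seq_len, vocab_size)`:

```python
import torch
import tiktoken

enc = tiktoken.get_encoding("gpt2")  # GPT-2 BPE tokenizer (vocab_size = 50,257)

prompt = "Once upon a time"
idx = torch.tensor([enc.encode(prompt)])  # (1, T) batch of token ids

# `model` is hypothetical here: the trained transformer loaded in app.py
model.eval()
with torch.no_grad():
    for _ in range(50):                       # sample 50 new tokens
        logits = model(idx[:, -1024:])        # crop context to block_size
        probs = torch.softmax(logits[:, -1, :], dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)

print(enc.decode(idx[0].tolist()))
```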