abhinavv3 committed on
Commit 329492e · verified · 1 Parent(s): a6124fc

Update README.md

  1. README.md +39 -9
README.md CHANGED
@@ -31,16 +31,46 @@ This model is designed for scalable training, long-context understanding, and ef
  ## 📁 Project Structure

- memGPT/
- ├── configs/ → Training & model hyperparams
- ├── data/ → Tokenized and sharded datasets
- ├── model_core/ → Model + attention + dataloader logic
- ├── scripts/ → Training, evaluation, generation scripts
- ├── evaluation/ → HellaSwag benchmark evaluation
- ├── logs/ → Checkpoints and logs
- ├── requirements.txt → Python dependencies
- └── README.md → This model card
+ ```bash
+ MEM_TRANSFORMER/
+ ├── configs/
+ │   └── config.json  # Model + training hyperparameters
+ │
+ ├── data/
+ │   ├── edu_fineweb/  # Token-sharded training data
+ │   │   ├── train_000001.npy
+ │   │   ├── train_000002.npy
+ │   │   └── test_000001.npy
+ │   ├── hellaswag/
+ │   │   └── hellaswag_val.jsonl
+ │   └── fineweb.py  # Sharding logic with memory-aligned sequence control
+ │
+ ├── model_core/
+ │   ├── __init__.py
+ │   ├── attention.py  # Grouped Query Attention, KNN & XL attention logic. Rotary Positional Encoding implementation
+ │   ├── model.py  # Transformer model with memory and RoPE support
+ │   ├── dataloader.py  # Memory-aware DataLoader
+ │   └── training.py  # train_memgpt function
+ │
+ ├── scripts/
+ │   ├── train.py  # Training script (DDP-compatible)
+ │   ├── evaluate.py  # Evaluation on benchmarks
+ │   └── generate.py  # Text generation from trained model
+ │
+ ├── evaluation/
+ │   ├── __init__.py
+ │   ├── hellaswag.py  # HellaSwag data loader
+ │   └── val_hellaswag.py  # Evaluation logic with loss-based scoring
+ │
+ ├── logs/
+ │   ├── log.txt  # Training logs
+ │   └── model_*.pt  # Checkpoints
+ │
+ ├── .gitignore
+ ├── README.md
+ ├── requirements.txt

+ ```

  ---
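
The new tree describes `evaluation/val_hellaswag.py` as "Evaluation logic with loss-based scoring", but the file itself is not part of this diff. As a rough illustration of what that phrase usually means, here is a minimal sketch that assumes each HellaSwag candidate ending has already been run through the model to obtain per-token losses, plus a mask marking which tokens belong to the ending rather than the shared context. The function names `mean_completion_loss` and `pick_ending` and the sample numbers are hypothetical and are not taken from the repository.

```python
from typing import List, Sequence


def mean_completion_loss(token_losses: Sequence[float],
                         completion_mask: Sequence[int]) -> float:
    """Average the per-token loss over completion tokens only (mask == 1)."""
    if len(token_losses) != len(completion_mask):
        raise ValueError("token_losses and completion_mask must have equal length")
    n_completion = sum(completion_mask)
    if n_completion == 0:
        raise ValueError("completion_mask selects no tokens")
    total = sum(loss for loss, keep in zip(token_losses, completion_mask) if keep)
    return total / n_completion


def pick_ending(losses_per_ending: List[Sequence[float]],
                masks_per_ending: List[Sequence[int]]) -> int:
    """Return the index of the candidate ending with the lowest mean loss."""
    scores = [
        mean_completion_loss(losses, mask)
        for losses, mask in zip(losses_per_ending, masks_per_ending)
    ]
    return min(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Hypothetical losses for four endings; the first two tokens are shared
    # context (mask 0), the last two are the candidate ending (mask 1).
    losses = [
        [2.0, 2.0, 3.0, 3.0],
        [2.0, 2.0, 1.0, 2.0],
        [2.0, 2.0, 4.0, 4.0],
        [2.0, 2.0, 2.5, 2.5],
    ]
    masks = [[0, 0, 1, 1]] * 4
    print(pick_ending(losses, masks))  # prints 1
```

Averaging over the ending's tokens, instead of summing, keeps longer endings from being penalized purely for their length. Whether the repository normalizes this way is an assumption.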