<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Vortex Language Model (VLM) Documentation</title>
  <style>
    body {
      background-color: #121212;
      color: #e0e0e0;
      font-family: Arial, sans-serif;
      line-height: 1.6;
      padding: 2rem;
    }
    h1, h2, h3 {
      color: #ffffff;
    }
    code {
      background-color: #1e1e1e;
      padding: 2px 4px;
      border-radius: 4px;
      color: #c0caf5;
    }
    .section {
      margin-bottom: 2rem;
    }
    a {
      color: #82aaff;
    }
    .note {
      background-color: #2a2a2a;
      padding: 1rem;
      border-left: 4px solid #82aaff;
      margin: 1rem 0;
    }
  </style>
</head>
<body>
  <h1>Vortex Language Model (VLM) Documentation</h1>

 <div class="section">
    <h2>Overview</h2>
    <p><strong>VLM</strong> stands for <strong>Vortex Language Model</strong>, a series of transformer-based models developed by <strong>Vortex Intelligence</strong>. The models are designed for tasks such as text generation, reasoning, and instruction following. Each VLM version is trained in three stages of progressive refinement, described under Model Structure below.</p>
    
    <div class="note">
      <p><strong>Note on Hardware:</strong> Training is performed on different GPU configurations depending on availability: sometimes a single NVIDIA RTX A5000, sometimes a single A40, and occasionally a single RTX 5060 Ti with 16 GB of VRAM.</p>
    </div>
  </div>

  <div class="section">
    <h2>Model Structure</h2>
    <p>Each VLM version follows a three-stage pipeline:</p>
    <ul>
      <li><strong>K1</strong>: Trained from scratch (base model)</li>
      <li><strong>K2</strong>: Fine-tuned on broader/general-purpose data</li>
      <li><strong>K3</strong>: Fine-tuned for clarity and simplicity</li>
    </ul>
    <p>K stands for <em>Knowledge</em>, with higher numbers representing later training stages. <strong>A higher K number does not mean the model has more parameters.</strong> A minimal sketch of this pipeline is shown below.</p>
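    <p>The snippet below is an illustrative sketch of what the K1 → K2 → K3 pipeline could look like using the Hugging Face <code>transformers</code> and <code>datasets</code> libraries. The dataset names match the VLM 1 entry below; everything else (the GPT-2-style config, the <code>finetune</code> helper, the <code>text</code> column name, and the hyperparameters) is an assumption for illustration, not the actual VLM training code.</p>
    <pre><code>from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    GPT2Config,
    Trainer,
    TrainingArguments,
)

# GPT-2 tokenizer and a fresh ~124M-parameter GPT-2-style config
# (assumed here; the documentation only states the parameter count).
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token


def finetune(model, dataset_name, output_dir, text_column="text"):
    """Run one training stage on `dataset_name` and return the updated model."""
    raw = load_dataset(dataset_name, split="train")
    tokenized = raw.map(
        lambda batch: tokenizer(batch[text_column], truncation=True, max_length=512),
        batched=True,
        remove_columns=raw.column_names,
    )
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir, num_train_epochs=1),
        train_dataset=tokenized,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    return trainer.model


# K1: base model trained from scratch (random weights, not a pretrained checkpoint)
k1 = AutoModelForCausalLM.from_config(GPT2Config())
k1 = finetune(k1, "tatsu-lab/alpaca", "vlm-k1")

# K2: fine-tune K1 on broader/general-purpose data
k2 = finetune(k1, "Elriggs/openwebtext-100k", "vlm-k2")

# K3: fine-tune K2 for clarity and simplicity
k3 = finetune(k2, "rahular/simple-wikipedia", "vlm-k3")</code></pre>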
  </div>

  <div class="section">
    <h2>Versions and Training Details</h2>

    <h3>VLM 1</h3>
    <ul>
      <li>Parameters: <code>124M</code></li>
      <li>Training Time: ~4 hours per stage</li>
      <li>Final Loss (all stages): ~<code>3.0</code> (see the perplexity note after this list)</li>
      <li><strong>K1</strong>: Trained on <code>tatsu-lab/alpaca</code> and a small custom dataset</li>
      <li><strong>K2</strong>: Fine-tuned K1 on <code>Elriggs/openwebtext-100k</code></li>
      <li><strong>K3</strong>: Fine-tuned K2 on <code>rahular/simple-wikipedia</code></li>
    </ul>
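
    <p>For intuition, the reported loss values can be read as perplexities, assuming they are per-token cross-entropy losses in nats (an assumption; the documentation does not state the loss type). Under that assumption, the VLM 1 final loss of ~<code>3.0</code> corresponds to a perplexity of roughly 20, and the VLM 1.1 target of ~<code>1.0</code> below would correspond to roughly 2.7:</p>
    <pre><code>import math

# Perplexity = exp(cross-entropy loss), assuming per-token loss in nats.
print(math.exp(3.0))  # ~20.1 -> VLM 1 final loss ~3.0
print(math.exp(1.0))  # ~2.72 -> VLM 1.1 target loss ~1.0</code></pre>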

    <h3>VLM 1.1</h3>
    <ul>
      <li>Parameters: <code>355M</code></li>
      <li>Training Time: ~4 hours per stage</li>
      <li>Target Final Loss: ~<code>1.0</code></li>
      <li><strong>K1</strong>: Currently training on <code>------</code> and <code>------</code></li>
    </ul>
  </div>

  <div class="section">
    <h2>Contact & More</h2>
    <p>Developed and maintained by <strong>Vortex Intelligence</strong>.</p>
    <!-- <p>Website: <a href="https://pingvortex.xyz" target="_blank">pingvortex.xyz</a></p> -->
  </div>
</body>
</html>