Konthee committed on
Commit 4e789c0 · verified · 1 Parent(s): 30359ae

Update README.md

Files changed (1): README.md (+15 −0)
@@ -7,6 +7,18 @@ tags:
 ---
 <br>
 
+ ### Introduction
+ The foundational technology behind generative prompt models is language-image pretraining, such as CLIP (Contrastive Language-Image Pre-Training), which aligns the latent spaces of an image encoder and a text encoder. The resulting latent vectors can be used for zero-shot classification and image search. For a generative prompt model, we can train the generative model on a frozen image encoder and then, in the inference pipeline, replace the image encoder with the text encoder so that text serves as the prompt.
+
+ **Scope of work**
+
+ Given our limited computing resources, datasets, and engineering capacity, we propose to train a CLIP model in two stages:
+ - **Stage 1:** Language-encoder distillation training.
+ We train a Thai (or bilingual EN-TH) text encoder against the original CLIP text encoder, following Multilingual-CLIP, using EN-EN and EN-TH text pairs from machine-translation datasets.
+ - **Stage 2:** Continued CLIP pretraining with a frozen image encoder.
+ A distilled model may not understand every token, especially domain-specific words, so we continue CLIP (or LiT, or SigLiT) pretraining with a frozen image encoder to learn the details of those words.
+ Once we have our own CLIP model, we replace the text encoder of a CLIP application with our own, or fine-tune the application model to improve performance.
+
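The two-stage recipe above can be sketched as a pair of loss functions. This is an illustrative NumPy sketch under assumed shapes (a batch of 512-d embeddings), not the project's actual training code: `distillation_loss` stands in for Stage 1 (matching the frozen teacher's text embeddings) and `contrastive_loss` for Stage 2 (CLIP/LiT-style symmetric InfoNCE against a frozen image encoder).

```python
import numpy as np

def distillation_loss(student_emb, teacher_emb):
    # Stage 1: mean-squared error between the student (Thai) text encoder's
    # sentence embeddings and the frozen teacher CLIP text embeddings.
    return float(np.mean((student_emb - teacher_emb) ** 2))

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    # Stage 2: symmetric InfoNCE over a batch of (image, text) pairs, as in
    # CLIP/LiT; matching pairs sit on the diagonal of the logit matrix.
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature
    log_sm_rows = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_sm_cols = logits - np.log(np.exp(logits).sum(axis=0, keepdims=True))
    loss_i2t = -np.mean(np.diag(log_sm_rows))  # image -> text direction
    loss_t2i = -np.mean(np.diag(log_sm_cols))  # text -> image direction
    return float((loss_i2t + loss_t2i) / 2)

rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 512))                   # frozen teacher embeddings
student = teacher + 0.01 * rng.normal(size=(4, 512))  # nearly-converged student
print(distillation_loss(student, teacher))  # small: student tracks the teacher
print(contrastive_loss(teacher, teacher))   # low: diagonal pairs dominate
```

In real training, only the text encoder's parameters receive gradients in both stages; keeping the image encoder frozen throughout is what makes the LiT/SigLiT setup cheap relative to full CLIP pretraining.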
 ## How to use
 - #### Install python package
 ```python
@@ -148,3 +160,6 @@ recall_text_search = sum(1.0 if i in indices else 0.0
 
 ### Authors
 * Konthee Boonmeeprakob ([email protected])
+
+
+<br>