QJerry committed on
Commit 56cdba2 · verified · 1 Parent(s): 029dfa7

Update README.md

Files changed (1): README.md (+36 −23)
README.md CHANGED

Old version (lines removed by this commit are prefixed with "-"):

@@ -2,37 +2,51 @@
  library_name: transformers
  license: mit
  datasets:
- - X-ART/LeX-R1-60K
  base_model:
- - deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
  pipeline_tag: text-generation
  tags:
- - art
- - text-rendering
  ---
- **LeX-Enhancer** is a lightweight prompt enhancement model distilled from DeepSeek-R1. Specifically, we collect **60,856 prompt pairs** before and after DeepSeek-R1 enhancement, and fine-tune a Deepseek-R1-Distilled-Qwen-14B model using LoRA to replicate the detailed prompting capabilities of R1. This enables efficient, large-scale generation of high-quality, visually grounded prompts.
-
- > **Input ~ What you would like to paint in the text prompt:**
- A thank you card with the words very much, with the text on it: \"VERY\" in black, \"MUCH\" in yellow.
-
- > **Output ~ Enhancer text prompt:**
- The image features a rectangular thank you card with a muted, warm-toned background in a soft gradient of creamy beige and blush pink, providing a subtle textured finish that adds depth without overwhelming the central elements. Centered near the top of the card, the word "VERY" is rendered in thick, uppercase black lettering with a clean sans-serif font, slightly curved to follow the natural flow of the composition. Directly beneath it, the word "MUCH" appears in a vibrant, sunny-yellow hue, matching the same bold font style but slightly larger in size, positioned to align neatly with the lower edge of "VERY," creating a balanced hierarchy. Both texts are outlined with faint drop shadows, enhancing readability against the softly lit backdrop. The cards’ edges are crisp, with minimalistic borders in a lighter shade of the background tones, and the overall layout is symmetrical, exuding a polished, contemporary aesthetic. Gentle ambient lighting casts soft, diffused shadows around the card’s corners, suggesting a lightly textured surface underneath, while the absence of decorative embellishments keeps the focus on the typography. The color palette harmonizes warmth and neutrality, ensuring the text remains the focal point while maintaining a serene, approachable ambiance.
-
- Use this code for inference:
  ```python
  import torch, os
  from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

  SYSTEM_TEMPLATE = (
      "A conversation between User and Assistant. The user asks a question, and the Assistant solves it. "
      "The assistant first thinks about the reasoning process in the mind and then provides the user with the answer. "
      "The reasoning process and answer are enclosed within <think> </think> and <answer> </answer> tags, respectively, i.e., "
      "<think> reasoning process here </think> <answer> answer here </answer>."
  )

  model_path = 'X-ART/LeX-Enhancer'

- # Change to what you want to draw in next line
  simple_caption = "A thank you card with the words very much, with the text on it: \"VERY\" in black, \"MUCH\" in yellow."

  def create_chat_template(user_prompt):
@@ -43,7 +57,7 @@ def create_chat_template(user_prompt):
      ]

  def create_direct_template(user_prompt):
-     return user_prompt + "<think>"  # better

  def create_user_prompt(simple_caption):
      return (
@@ -58,20 +72,21 @@ def create_user_prompt(simple_caption):
          "6. Avoid using vague expressions such as \"may be\" or \"might be\"; the generated caption must be in a definitive, narrative tone. "
          "7. Do not use negative sentence structures, such as \"there is nothing in the image,\" etc. The entire caption should directly describe the content of the image. "
          "8. The entire output should be limited to 200 words.\n\n"
-         "SIMPLE CAPTION: {0}"
-     ).format(simple_caption)
-
  tokenizer = AutoTokenizer.from_pretrained(model_path)
  model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", torch_dtype=torch.bfloat16)

- # Tokenize the input prompt
- messages = create_direct_template(create_user_prompt(simple_caption))  # 3. for direct template
- input_ids = tokenizer.encode(messages, return_tensors="pt")

- # Generate text using the model
  streamer = TextStreamer(tokenizer, skip_special_tokens=True, clean_up_tokenization_spaces=True)
  output = model.generate(
-     input_ids.to(model.device),
      max_length=2048,
      num_return_sequences=1,
      do_sample=True,
@@ -80,7 +95,5 @@ output = model.generate(
      streamer=streamer
  )

- # Print the generated text
  print("*" * 80)
- # print(generated_text)
- ```
 
New version (lines added by this commit are prefixed with "+"):

  library_name: transformers
  license: mit
  datasets:
+ - X-ART/LeX-R1-60K
  base_model:
+ - deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
  pipeline_tag: text-generation
  tags:
+ - art
+ - text-rendering
  ---

+ # 🎨 LeX-Enhancer: Visual Prompt Generator

+ **LeX-Enhancer** is a lightweight **prompt enhancement model** distilled from [DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B).
+ We collected **60,856 caption pairs** before and after DeepSeek-R1 enhancement and fine-tuned a `DeepSeek-R1-Distill-Qwen-14B` model using **LoRA** to reproduce its high-quality, richly visualized prompt outputs.

+ This enables **efficient, large-scale generation of visually grounded prompts**, well suited to high-fidelity text-to-image generation.
+
+ ---
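
The LoRA fine-tuning mentioned above can be illustrated with a minimal sketch. This is not the authors' training code, and `LoRALinear`, the rank, and the layer dimensions are illustrative assumptions; it shows only the core idea, a frozen base weight plus a trainable low-rank update:

```python
import torch
import torch.nn as nn

# Minimal LoRA sketch (illustrative, not the actual training setup):
# freeze the pretrained weight W and learn a low-rank update B @ A on top.
class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init
        self.scale = alpha / r

    def forward(self, x):
        # Base projection plus the scaled low-rank correction
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(64, 64), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(trainable, total)  # only the small A/B factors are trainable
```

Because `B` is zero-initialized, the adapted layer initially reproduces the frozen base exactly; training then moves only the low-rank factors, which is what makes adapting a 14B model tractable.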
+ ## ✍️ Example: From Simple to Enhanced Caption
+
+ > **🧾 Input (Simple Caption):**
+ > A thank you card with the words very much, with the text on it: "VERY" in black, "MUCH" in yellow.
+
+ > **🪄 Output (Enhanced Caption):**
+ > The image features a rectangular thank you card with a muted, warm-toned background in a soft gradient of creamy beige and blush pink, providing a subtle textured finish that adds depth without overwhelming the central elements. Centered near the top of the card, the word "VERY" is rendered in thick, uppercase black lettering with a clean sans-serif font, slightly curved to follow the natural flow of the composition. Directly beneath it, the word "MUCH" appears in a vibrant, sunny-yellow hue, matching the same bold font style but slightly larger in size, positioned to align neatly with the lower edge of "VERY," creating a balanced hierarchy. Both texts are outlined with faint drop shadows, enhancing readability against the softly lit backdrop. The card’s edges are crisp, with minimalistic borders in a lighter shade of the background tones, and the overall layout is symmetrical, exuding a polished, contemporary aesthetic. Gentle ambient lighting casts soft, diffused shadows around the card’s corners, suggesting a lightly textured surface underneath, while the absence of decorative embellishments keeps the focus on the typography. The color palette harmonizes warmth and neutrality, ensuring the text remains the focal point while maintaining a serene, approachable ambiance.
+
+ ---
+
+ ## 🚀 Usage (Python Code)

  ```python
  import torch, os
  from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

+ # System instruction for reasoning + answering
  SYSTEM_TEMPLATE = (
      "A conversation between User and Assistant. The user asks a question, and the Assistant solves it. "
      "The assistant first thinks about the reasoning process in the mind and then provides the user with the answer. "
      "The reasoning process and answer are enclosed within <think> </think> and <answer> </answer> tags, respectively, i.e., "
      "<think> reasoning process here </think> <answer> answer here </answer>."
  )
+
  model_path = 'X-ART/LeX-Enhancer'

+ # Your simple caption goes here
  simple_caption = "A thank you card with the words very much, with the text on it: \"VERY\" in black, \"MUCH\" in yellow."

  def create_chat_template(user_prompt):
@@ -43,7 +57,7 @@ def create_chat_template(user_prompt):
      ]

  def create_direct_template(user_prompt):
+     return user_prompt + "<think>"

  def create_user_prompt(simple_caption):
      return (
@@ -58,20 +72,21 @@ def create_user_prompt(simple_caption):
          "6. Avoid using vague expressions such as \"may be\" or \"might be\"; the generated caption must be in a definitive, narrative tone. "
          "7. Do not use negative sentence structures, such as \"there is nothing in the image,\" etc. The entire caption should directly describe the content of the image. "
          "8. The entire output should be limited to 200 words.\n\n"
+         f"SIMPLE CAPTION: {simple_caption}"
+     )
+
+ # Load model and tokenizer
  tokenizer = AutoTokenizer.from_pretrained(model_path)
  model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", torch_dtype=torch.bfloat16)

+ # Prepare input prompt
+ messages = create_direct_template(create_user_prompt(simple_caption))
+ input_ids = tokenizer.encode(messages, return_tensors="pt").to(model.device)

+ # Stream output
  streamer = TextStreamer(tokenizer, skip_special_tokens=True, clean_up_tokenization_spaces=True)
  output = model.generate(
+     input_ids,
      max_length=2048,
      num_return_sequences=1,
      do_sample=True,
@@ -80,7 +95,5 @@ output = model.generate(
      streamer=streamer
  )

  print("*" * 80)
+ # Output will stream via TextStreamer
  ```
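
Because the model follows the `<think> … </think> <answer> … </answer>` template from `SYSTEM_TEMPLATE`, the generated text still contains the reasoning block. A small post-processing sketch can recover just the enhanced caption; `extract_enhanced_caption` is a hypothetical helper, not part of the repository:

```python
# Sketch: strip the <think> reasoning block from a generated string.
# `raw` stands in for tokenizer.decode(output[0], skip_special_tokens=True).
def extract_enhanced_caption(raw: str) -> str:
    # Keep only the text after the closing </think> tag, if present.
    if "</think>" in raw:
        raw = raw.split("</think>", 1)[1]
    # Drop the optional <answer> tags around the caption.
    return raw.replace("<answer>", "").replace("</answer>", "").strip()

demo = "prompt...<think> reasoning here </think> <answer>Enhanced caption.</answer>"
print(extract_enhanced_caption(demo))  # -> Enhanced caption.
```

Since `create_direct_template` already appends an opening `<think>` to the prompt, splitting on the closing tag is usually enough to separate the reasoning from the final caption.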