|
--- |
|
title: Neochar |
|
emoji: 🖼 |
|
colorFrom: purple |
|
colorTo: red |
|
sdk: gradio |
|
sdk_version: 5.25.0 |
|
app_file: app.py |
|
pinned: false |
|
license: openrail |
|
short_description: Unwritten Chinese Characters in Style |
|
--- |
|
|
|
# What is this? |
|
|
|
Generate new Chinese characters by combining components in creative ways, then render them in a controlled style. |
|
|
|
- Inspired by |
|
- Lin Yutang's [Ming-Kwai typewriter](https://en.wikipedia.org/wiki/Chinese_typewriter#MingKwai_design) |
|
- Wu Yue's [Glyffuser](https://yue-here.com/posts/glyffuser/) |
|
|
|
# Why |
|
|
|
- It is fun to generate valid but unseen characters (found neither in a dictionary nor in Unicode). |
|
- Implements Lin Yutang's ideas with generative AI/ML, without the mechanical marvel :-/ or limitations :-) |
|
- Extends a font to support new charsets, and beyond to non-existent chars. |
|
- Adds variation/diversity/personality to generated images. No boring duplicates from the same char. |
|
- Other [Creative Uses](#creative-uses) |
|
|
|
# How to use this app |
|
- Combine components or radicals in the following way |
|
- Specify the 'Structure' and 'Components' in [Polish notation](https://en.wikipedia.org/wiki/Polish_notation): the structure comes first, followed by its components. This works well for tree structures. |
|
- ⿰: 'LR' Left-Right |
|
- ⿱: 'TB' Top-Bottom |
|
- ⿸: 'TL' Top-Left |
|
- ⿹: 'TR' Top-Right |
|
- ⿺: 'BL' Bottom-Left |
|
- ⿴: 'OI' Outer-Inner |
|
- ⿻: 'OV' Overlap |
|
- ⿲: 'LMR' Left-Middle-Right |
|
- ⿳: 'TMB' Top-Middle-Bottom |
|
- ⿵: 'BT' Bottom Open Enclosure |
|
- ⿶: 'CT' Top Open Enclosure |
|
- ⿷: 'RT' Right Open Enclosure |
|
- Select a 'Style' by clicking the sample images |
|
- Hit the 'Generate' button |
|
- Repeat |
|
|
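The structure list above can be sketched in code. This is a minimal illustration only, not the app's actual implementation: the names `STRUCTURES`, `compose`, and `parse` are hypothetical, and whether the app accepts exactly these strings as input is an assumption. It shows how the prefix (Polish) notation puts the structure symbol first, and why that makes nested, tree-shaped descriptions easy to build and read back.

```python
# Hypothetical mapping from the structure codes listed above to their symbols.
STRUCTURES = {
    "LR": "⿰", "TB": "⿱", "TL": "⿸", "TR": "⿹", "BL": "⿺",
    "OI": "⿴", "OV": "⿻", "LMR": "⿲", "TMB": "⿳",
    "BT": "⿵", "CT": "⿶", "RT": "⿷",
}

# Three-part structures take three components; all others take two.
ARITY = {
    symbol: (3 if code in ("LMR", "TMB") else 2)
    for code, symbol in STRUCTURES.items()
}


def compose(code, *parts):
    """Build a prefix-notation string: structure symbol first, then components."""
    if code not in STRUCTURES:
        raise ValueError(f"unknown structure code: {code!r}")
    symbol = STRUCTURES[code]
    if len(parts) != ARITY[symbol]:
        raise ValueError(f"{code} needs {ARITY[symbol]} components, got {len(parts)}")
    return symbol + "".join(parts)


def parse(sequence):
    """Read a prefix-notation string back into a nested tuple (a tree)."""

    def walk(pos):
        if pos >= len(sequence):
            raise ValueError("sequence ended before all components were given")
        char = sequence[pos]
        pos += 1
        if char not in ARITY:
            return char, pos  # a plain component is a leaf
        children = []
        for _ in range(ARITY[char]):
            child, pos = walk(pos)
            children.append(child)
        return (char, *children), pos

    tree, end = walk(0)
    if end != len(sequence):
        raise ValueError("unexpected trailing characters")
    return tree


# A component can itself be a composed structure, which gives a tree.
nested = compose("LR", "釒", compose("TB", "口", "木"))
print(nested)
print(parse(nested))
```

Because every structure symbol has a fixed number of components, no brackets are needed: the string can be read left to right and each symbol simply consumes the next two or three sub-parts.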
|
# Usage Tips |
|
- Simple structures work best (⿰ ⿱ ⿴ etc.) |
|
- "Known radicals at seen positions" work best (釒 on the left works better than on the right, though the latter may also surprise you in a good way) |
|
- Noto font family (sans and serif) gives the best results, as there are many training examples |
|
- Cursive and handwritten styles usually give good results, as they are more forgiving of imperfections |
|
- Fonts that support fewer chars are challenging |
|
- The current model was trained on 300k samples for only 20 epochs |
|
- Training will continue if this app gets attention or likes |
|
|
|
- For chars that are already in a dictionary, [decompose](https://github.com/cburgmer/cjklib/blob/master/cjklib/data/characterdecomposition.csv) them first. |
|
- For a part that is hard to describe, or that you don't care about, use the wildcard '?' (a full-width question mark; it is unclear whether the width matters) |
|
|
|
- What to do when the results are not as expected |
|
- Pick a different 'Style' on which the model may have been trained better |
|
- Try again with a different random seed. This will change the overall structure in an unpredictable way |
|
- Try again with a different 'step' number. This will change the local details in a continuous way |
|
|
|
# Creative Uses |
|
## Turning a bug into a feature |
|
When you see a funny result you didn't expect (e.g. 5 or 3 dots where there should be 4), don't throw it away immediately. |
|
- Save the results to confuse/train OCR |
|
- 3vade 3vil c3nsorship |
|
- Share it in the discussion. The same input text, seed, and step will reliably reproduce the result. |
|
|
|
# Future Features |
|
- Typewriter keyboard for hard-to-input radicals, filtered by pinyin prefix |
|
- Direct generation from a single char via automatic decomposition |
|
|