- Cached Transformers: Improving Transformers with Differentiable Memory Cache
  Paper • 2312.12742 • Published • 14
- ProTIP: Progressive Tool Retrieval Improves Planning
  Paper • 2312.10332 • Published • 8
- Paloma: A Benchmark for Evaluating Language Model Fit
  Paper • 2312.10523 • Published • 13
- The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
  Paper • 2406.17557 • Published • 98
daje kang
daje
AI & ML interests
None yet
Recent Activity
upvoted a collection 15 days ago
Qwen3
published a dataset 25 days ago
daje/kaggle-image-datasets
updated a dataset about 1 month ago
daje/synthetic-ko-sql-hard-add-llm-result
Organizations
None yet
Collections
1
models
39
daje/Meta-Llama-3.1-8B-Instruct-de-identification • Updated • 5
daje/Qwen2.5-14B-Instruct-tools • Text Generation • Updated • 4
daje/model_0.0002_alpha-32_r-64 • Updated • 244
daje/model_0.0002_alpha-8_r-16 • Updated • 252
daje/model_5e-05_alpha-128_r-256 • Updated • 244
daje/model_2e-4_alpha-8_r-16 • Updated • 259
daje/model_Lora • Updated • 238
daje/model_2e-4 • Updated • 243
daje/model • Updated • 248
daje/Qwen2-7B-Instruct-harmful_detector_2000-H100_1 • Updated
datasets
17
daje/synthetic-ko-sql-hard-add-llm-result • Viewer • Updated • 1.68k • 33
daje/synthetic-ko-sql-hard • Viewer • Updated • 1.68k • 53 • 1
daje/kotext-to-sql-v1-hard • Viewer • Updated • 2k • 24
daje/kaggle-image-datasets • Viewer • Updated • 44.4k • 14
daje/de-identify-chat-ko • Viewer • Updated • 9.92k • 16
daje/ko-hatefulmemes_train_8500 • Viewer • Updated • 8.2k • 22
daje/ko-hatefulmemes_train_8500_kmhas • Viewer • Updated • 95.3k • 30
daje/ko-hatefulmemes_train_2000 • Viewer • Updated • 1.91k • 23
daje/Ko-SciecneQA • Viewer • Updated • 12.7k • 16
daje/keyword_summary • Viewer • Updated • 1k • 45