AI & ML interests

None defined yet.

Recent Activity

mkurman posted an update 22 days ago
🚀 Big news! NeuroBLAST, an exciting new architecture, has officially arrived on HF! After three intense months of training my 1.9-billion-parameter SLM on my trusty RTX 3090 Ti, I'm happy to announce the results. While it's not perfect just yet, I've dedicated countless hours to optimizing costs while crafting clever layer connections that mimic the brain's functional centers. Plus, I've introduced a new memory-like layer that's sure to turn heads! I can't wait to dive deep into this journey in my upcoming blog post. Stay tuned for the full scoop! 🔥

meditsolutions/NeuroBLAST-1.9B-Instruct-Early-Preview
mkurman posted an update 5 months ago
Just released NVAMP Loss!

✔️ A modification of the cross-entropy loss function designed specifically for training LLMs.
✔️ A twist on standard cross-entropy that emphasizes outlier prediction errors and dynamically normalizes token-level variance (sketched below).
✔️ More stable and efficient training, yielding models that generalize better.

Check it out, give it a spin, and let me know what you think!

Licensed under the Apache 2.0 license and ready to use. Happy training! 🔥🤖

https://github.com/mkurman/nvamp-loss
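
For intuition, here's a minimal sketch of an outlier-emphasizing, variance-normalized cross-entropy. The specific weighting scheme and the `outlier_weight` hyperparameter are illustrative placeholders, not the actual NVAMP formulation; see the repo for the real thing.

```python
import torch
import torch.nn.functional as F

def nvamp_like_loss(logits, targets, outlier_weight=2.0, eps=1e-6):
    # Per-token cross-entropy, kept unreduced so we can reweight it.
    ce = F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        targets.view(-1),
        reduction="none",
    )
    # Dynamically normalize by the token-level std (detached so the
    # scale factor itself receives no gradient).
    std = ce.detach().std().clamp_min(eps)
    normalized = ce / std
    # Emphasize outlier prediction errors: up-weight tokens whose
    # normalized loss exceeds the batch mean. The binary up-weighting
    # here is an illustrative placeholder, not the repo's scheme.
    mean = normalized.detach().mean()
    weights = torch.where(
        normalized.detach() > mean,
        torch.full_like(normalized, outlier_weight),
        torch.ones_like(normalized),
    )
    return (weights * normalized).mean()
```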
mkurman posted an update 6 months ago
Introducing a new architecture, MedIT One – a single-token transformer with LSTM-like recurrence.

It is extremely fast in training and inference, but we lack funding for large-scale training. Enjoy 🍓

https://github.com/MedITSolutionsKurman/medit-one
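
To give a rough feel for "single-token processing with LSTM-like recurrence", here's a toy block: one token at a time through an LSTM cell, followed by a transformer-style residual MLP. The class, names, and wiring are illustrative guesses, not the actual MedIT One architecture in the repo.

```python
import torch
import torch.nn as nn

class RecurrentBlock(nn.Module):
    # Toy sketch: per-token recurrence plus a residual MLP.
    # Illustrative only; see the repo for the real architecture.
    def __init__(self, d_model):
        super().__init__()
        self.cell = nn.LSTMCell(d_model, d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):  # x: (batch, seq, d_model)
        batch, seq, d = x.shape
        h = x.new_zeros(batch, d)
        c = x.new_zeros(batch, d)
        outs = []
        for t in range(seq):  # process a single token per step
            h, c = self.cell(x[:, t], (h, c))
            outs.append(h + self.mlp(h))  # residual MLP, transformer-style
        return torch.stack(outs, dim=1)
```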

mkurman posted an update 6 months ago
I've been working on something cool: a GRPO trainer with an LLM evaluator that can also perform SFT on the feedback data, if you want. Check it out 😊

Any 🌟 are more than welcome 🤗

https://github.com/mkurman/grpo-llm-evaluator
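
As a quick refresher on the GRPO side of this, here is a generic sketch of group-relative advantages computed from evaluator scores. This is the textbook GRPO normalization, not this repo's API:

```python
import torch

def group_relative_advantages(rewards, eps=1e-6):
    # rewards: (num_prompts, group_size) scores from the LLM evaluator.
    # GRPO normalizes each completion's reward against its own group,
    # so no learned value function is needed.
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True).clamp_min(eps)
    return (rewards - mean) / std
```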
mkurman posted an update 6 months ago
Blurred-Thoughts Supervised-Finetuning 🙈

After hours of working with GitHub Copilot to organize the code, I'm pleased to announce the release of Blurred-Thoughts Supervised-Finetuning (BT-SFT), a new method for fine-tuning LLMs to produce more diverse and creative responses.

BT-SFT introduces:
✅ A smart tokenization method that randomly masks tokens within <think> ... </think> tags, encouraging the model to generate diverse responses that align better with its own probability distribution instead of memorizing the thought process from distilled data (see the sketch after the repo link below).
✅ A reward function that ensures responses are well-structured.

Explore and contribute to the project available in my GitHub repository:
https://github.com/mkurman/blurred-thoughts-SFT
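
For the curious, a minimal sketch of the thought-blurring step: drop supervision at random positions inside the think tags so the loss doesn't force verbatim reproduction. The function name, `mask_prob` default, and single-token tag handling are illustrative assumptions, not the exact repo code.

```python
import random

def blur_think_labels(input_ids, labels, tokenizer, mask_prob=0.3):
    # Randomly un-supervise tokens inside <think> ... </think>: masked
    # positions get label -100, which cross-entropy ignores, so the model
    # is not forced to memorize the distilled thought process verbatim.
    # Assumes each tag maps to a single token id (illustrative only).
    think_start = tokenizer.convert_tokens_to_ids("<think>")
    think_end = tokenizer.convert_tokens_to_ids("</think>")
    inside = False
    blurred = list(labels)
    for i, tok in enumerate(input_ids):
        if tok == think_start:
            inside = True
        elif tok == think_end:
            inside = False
        elif inside and random.random() < mask_prob:
            blurred[i] = -100  # ignored by the loss
    return blurred
```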

Keep me updated on your experiments with BT-SFT! 🐐