I also have a hypothesis that this model can be efficiently downsized not by pruning experts, but by merging them and capturing what remains of each expert as a LoRA-style low-rank delta. The merge would hold most of the parameters as shared weights, each expert would keep only its small delta, and the routing table need not change.
I'm building a new version of my pipeline to test this hypothesis. I suspect it'd let us retain most of the performance at under 12B parameters.
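The core of the idea can be sketched in a few lines of numpy. This is a toy illustration, not the actual pipeline: it assumes each expert is a single weight matrix, takes the shared base to be the elementwise mean of the experts, and factors each expert's residual with a truncated SVD to get a LoRA-style pair. The function name `merge_experts_to_lora` is made up for this sketch.

```python
import numpy as np

def merge_experts_to_lora(experts, rank):
    """Merge expert weight matrices into one shared base plus
    per-expert low-rank (LoRA-style) deltas via truncated SVD.

    Toy sketch: the base is the mean of the experts; each expert
    is then approximated as base + A @ B with A, B of rank `rank`.
    """
    base = np.mean(experts, axis=0)            # shared parameters
    loras = []
    for W in experts:
        U, S, Vt = np.linalg.svd(W - base, full_matrices=False)
        A = U[:, :rank] * S[:rank]             # shape (d_out, rank)
        B = Vt[:rank, :]                       # shape (rank, d_in)
        loras.append((A, B))                   # W ~= base + A @ B
    return base, loras

# Toy numbers: 8 experts of shape 64x64, rank-8 deltas.
rng = np.random.default_rng(0)
experts = [rng.normal(size=(64, 64)) for _ in range(8)]
base, loras = merge_experts_to_lora(experts, rank=8)

# Parameter count: shared base + per-expert deltas vs. full experts.
full = 8 * 64 * 64
compressed = 64 * 64 + 8 * (64 * 8 + 8 * 64)
print(compressed / full)  # 0.375 of the original unique parameter count
```

How well the rank-r deltas actually reconstruct the experts is exactly the empirical question the pipeline would test; the compression ratio above just shows why the parameter savings could be large.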