Alyona Vert

alyona0l

AI & ML interests

None yet

Recent Activity

upvoted an article about 2 months ago

What Coding Agent Wins?

upvoted an article 3 months ago

🦸🏻#17: What is A2A and why is it – still! – underappreciated?

reacted to Kseniase's post with 🔥 3 months ago

10 new Chain-of-Thoughts (CoT) methods CoT has long been one of the hottest techniques in AI thanks to its effectiveness and compelling core idea: encouraging models to solve complex problems through explicit intermediate reasoning steps. But usually researchers modify original CoT approach, finding tips that further improve LLMs' reasoning. That's what we're going to talk about today. Here's a list of 10 latest enhanced CoT approaches: 1. Chain-of-Defensive-Thought -> https://huggingface.co/papers/2504.20769 Provides a few structured, defensive reasoning exemplars to improve the robustness of LLMs 2. Hybrid-CoT -> https://huggingface.co/papers/2504.21659 Proposes using Adaptive Hybrid Reasoning Model (AdaR1) that combines Long- and Short-CoT, and applying bi-level preference training to select effective reasoning styles 3. Semantic-level and token-level CoT -> https://huggingface.co/papers/2505.00703 Introduces T2I-R1 text-to-image gen model, that uses semantic-level CoT for prompt planning and token-level CoT for pixel-level generation, while BiCoT-GRPO coordinates them both 4. Speculative CoT (SCoT) -> https://huggingface.co/papers/2504.19095 SCoT drafts multiple reasoning paths with a lightweight draft, selects the best, and uses the target model for correction - all this to reduce latency by 48–66% 5. Collaborative CoT (Co-CoT) -> https://huggingface.co/papers/2504.17091 Breaks reasoning into blocks that users can inspect, modify and re-run, promoting active engagement. An adaptation mechanism aligns outputs with diverse cognitive styles and user goals 6. XS-CoT -> https://huggingface.co/papers/2504.20835 It's a cross-lingual framework that integrates speech-to-text translation into reasoning, using a semi-implicit CoT approach to compress intermediate tokens. This improves non-core language responses by up to 45% Read further in the comments 👇 If you liked this, also subscribe to the Turing Post -> https://www.turingpost.com/subscribe

View all activity

Organizations

upvoted an article about 2 months ago

Article

What Coding Agent Wins?

and 1 other •

Jun 26

• 7

upvoted an article 3 months ago

Article

🦸🏻#17: What is A2A and why is it – still! – underappreciated?

•

May 7

• 13

reacted to Kseniase's post with 🔥 3 months ago

Post

4264

10 new Chain-of-Thoughts (CoT) methods

CoT has long been one of the hottest techniques in AI thanks to its effectiveness and compelling core idea: encouraging models to solve complex problems through explicit intermediate reasoning steps. But usually researchers modify original CoT approach, finding tips that further improve LLMs' reasoning. That's what we're going to talk about today.

Here's a list of 10 latest enhanced CoT approaches:

1. Chain-of-Defensive-Thought -> Chain-of-Defensive-Thought: Structured Reasoning Elicits Robustness in Large Language Models against Reference Corruption (2504.20769)
Provides a few structured, defensive reasoning exemplars to improve the robustness of LLMs

2. Hybrid-CoT -> AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization (2504.21659)
Proposes using Adaptive Hybrid Reasoning Model (AdaR1) that combines Long- and Short-CoT, and applying bi-level preference training to select effective reasoning styles

3. Semantic-level and token-level CoT -> T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT (2505.00703)
Introduces T2I-R1 text-to-image gen model, that uses semantic-level CoT for prompt planning and token-level CoT for pixel-level generation, while BiCoT-GRPO coordinates them both

4. Speculative CoT (SCoT) -> Efficient Reasoning for LLMs through Speculative Chain-of-Thought (2504.19095)
SCoT drafts multiple reasoning paths with a lightweight draft, selects the best, and uses the target model for correction - all this to reduce latency by 48–66%

5. Collaborative CoT (Co-CoT) -> Co-CoT: A Prompt-Based Framework for Collaborative Chain-of-Thought Reasoning (2504.17091)
Breaks reasoning into blocks that users can inspect, modify and re-run, promoting active engagement. An adaptation mechanism aligns outputs with diverse cognitive styles and user goals

6. XS-CoT -> Enhancing Non-Core Language Instruction-Following in Speech LLMs via Semi-Implicit Cross-Lingual CoT Reasoning (2504.20835)
It's a cross-lingual framework that integrates speech-to-text translation into reasoning, using a semi-implicit CoT approach to compress intermediate tokens. This improves non-core language responses by up to 45%

Read further in the comments 👇

If you liked this, also subscribe to the Turing Post -> https://www.turingpost.com/subscribe

1 reply

reacted to Kseniase's post with 👀 4 months ago

Post

6575

6 Free resources on Reinforcement Learning (RL)

RL now is where the real action is, it's the engine behind autonomous tech, robots, and the next wave of AI that thinks, moves and solves problems on its own. To stay up to date with what’s happening in RL, we offer some fresh materials on it:

1. "Reinforcement Learning from Human Feedback" by Nathan Lambert -> https://rlhfbook.com/
It's a short introduction to RLHF, explaining instruction tuning, reward modeling, alignment methods, synthetic data, evaluation, and more

2. "A Course in Reinforcement Learning (2nd Edition)" by Dimitri P. Bertsekas -> https://www.mit.edu/~dimitrib/RLbook.html
Explains dynamic programming (DP) and RL, diving into rollout algorithms, neural networks, policy learning, etc. It’s packed with solved exercises and real-world examples

3. "Mathematical Foundations of Reinforcement Learning" video course by Shiyu Zhao -> https://www.youtube.com/playlist?list=PLEhdbSEZZbDaFWPX4gehhwB9vJZJ1DNm8
Offers a mathematical yet friendly introduction to RL, covering Bellman Equation, value iteration, Monte Carlo learning, approximation, policy gradient, actor-critic methods, etc.
+ Check out the repo for more: https://github.com/MathFoundationRL/Book-Mathematical-Foundation-of-Reinforcement-Learning

4. "Multi-Agent Reinforcement Learning" by Stefano V. Albrecht, Filippos Christianos, and Lukas Schäfer -> https://www.marl-book.com/
Covers models, core ideas of multi-agent RL (MARL) and modern approaches to combining it with deep learning

5. "Reinforcement Learning: A Comprehensive Overview" by Kevin P. Murphy -> https://arxiv.org/pdf/2412.05265
Explains RL and sequential decision making, covering value-based, policy-gradient, model-based, multi-agent RL methods, RL+LLMs, and RL+inference and other topics

6. Our collection of free courses and books on RL -> https://huggingface.co/posts/Kseniase/884818121094439

If you liked this, also subscribe to The Turing Post: https://www.turingpost.com/subscribe

upvoted an article 4 months ago

Article

What is MoE 2.0? Update Your Knowledge about Mixture-of-experts

and 1 other •

Apr 27

• 9

published an article 4 months ago

Article

What is MoE 2.0? Update Your Knowledge about Mixture-of-experts

and 1 other •

Apr 27

• 9

upvoted an article 4 months ago

Article

Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What’s Really Changing in Transformers?

and 1 other •

Apr 4

• 14

published an article 4 months ago

Article

Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What’s Really Changing in Transformers?

and 1 other •

Apr 4

• 14

upvoted 2 articles 5 months ago

Article

🎙️🧩 TP/Inference: Sharon Zhou on AI Hallucinations, Agents Hype, and Giving Developers the Keys to GenAI

•

Mar 24

• 2

Article

What is Qwen-Agent framework? Inside the Qwen family

and 1 other •

Mar 20

• 12

published an article 5 months ago

Article

What is Qwen-Agent framework? Inside the Qwen family

and 1 other •

Mar 20

• 12

upvoted 2 articles 5 months ago

Article

🌁#92: Fight for Developers and the Year of Orchestration

•

Mar 18

• 5

Article

🦸🏻#14: What Is MCP, and Why Is Everyone – Suddenly!– Talking About It?

•

Mar 17

• 326

reacted to Kseniase's post with 🔥 5 months ago

Post

7984

15 types of attention mechanisms

Attention mechanisms allow models to dynamically focus on specific parts of their input when performing tasks. In our recent article, we discussed Multi-Head Latent Attention (MLA) in detail and now it's time to summarize other existing types of attention.

Here is a list of 15 types of attention mechanisms used in AI models:

1. Soft attention (Deterministic attention) -> Neural Machine Translation by Jointly Learning to Align and Translate (1409.0473)
Assigns a continuous weight distribution over all parts of the input. It produces a weighted sum of the input using attention weights that sum to 1.

2. Hard attention (Stochastic attention) -> Effective Approaches to Attention-based Neural Machine Translation (1508.04025)
Makes a discrete selection of some part of the input to focus on at each step, rather than attending to everything.

3. Self-attention -> Attention Is All You Need (1706.03762)
Each element in the sequence "looks" at other elements and "decides" how much to borrow from each of them for its new representation.

4. Cross-Attention (Encoder-Decoder attention) -> Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation (2104.08771)
The queries come from one sequence and the keys/values come from another sequence. It allows a model to combine information from two different sources.

5. Multi-Head Attention (MHA) -> Attention Is All You Need (1706.03762)
Multiple attention “heads” are run in parallel. The model computes several attention distributions (heads), each with its own set of learned projections of queries, keys, and values.

6. Multi-Head Latent Attention (MLA) -> DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model (2405.04434)
Extends MHA by incorporating a latent space where attention heads can dynamically learn different latent factors or representations.

7. Memory-Based attention -> End-To-End Memory Networks (1503.08895)
Involves an external memory and uses attention to read from and write to this memory.

See other types in the comments 👇

1 reply

upvoted an article 5 months ago

Article

How to Reduce Memory Use in Reasoning Models

and 1 other •

Mar 13

• 14

published an article 5 months ago

Article

How to Reduce Memory Use in Reasoning Models

and 1 other •

Mar 13

• 14

upvoted 2 articles 5 months ago

Article

🦸🏻#13: Action! How AI Agents Execute Tasks with UI and API Tools

•

Mar 10

• 9

Article

🌁#91: We are failing in AI literacy

and 1 other •

Mar 10

• 3

published an article 5 months ago

Article

🌁#91: We are failing in AI literacy

and 1 other •

Mar 10

• 3

upvoted an article 5 months ago

Article

🦸🏻#12: How Do Agents Learn from Their Own Mistakes? The Role of Reflection in AI

•

Mar 9

• 8

Alyona Vert

AI & ML interests

Recent Activity

Organizations

alyona0l's activity

What Coding Agent Wins?

🦸🏻#17: What is A2A and why is it – still! – underappreciated?

What is MoE 2.0? Update Your Knowledge about Mixture-of-experts

What is MoE 2.0? Update Your Knowledge about Mixture-of-experts

Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What’s Really Changing in Transformers?

Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What’s Really Changing in Transformers?

🎙️🧩 TP/Inference: Sharon Zhou on AI Hallucinations, Agents Hype, and Giving Developers the Keys to GenAI

What is Qwen-Agent framework? Inside the Qwen family

What is Qwen-Agent framework? Inside the Qwen family

🌁#92: Fight for Developers and the Year of Orchestration

🦸🏻#14: What Is MCP, and Why Is Everyone – Suddenly!– Talking About It?

How to Reduce Memory Use in Reasoning Models

How to Reduce Memory Use in Reasoning Models

🦸🏻#13: Action! How AI Agents Execute Tasks with UI and API Tools

🌁#91: We are failing in AI literacy

🌁#91: We are failing in AI literacy

🦸🏻#12: How Do Agents Learn from Their Own Mistakes? The Role of Reflection in AI