AI & ML interests

Semantic Search, Language models, Domain adaptation, Question Answering

Recent Activity

anakin87 updated a Space about 2 months ago
deepset/autoquizzer
anakin87 had new activity 3 months ago in
deepset/roberta-base-squad2

anakin87 posted an update 1 day ago
🕵️🌐 Building Browser Agents - notebook

No API? No problem.
Browser Agents can use websites like you do: click, type, wait, read.

📓 Step-by-step notebook: https://colab.research.google.com/github/deepset-ai/haystack-cookbook/blob/main/notebooks/browser_agents.ipynb

🎥 In the video, the Agent:
- Goes to Hugging Face Spaces
- Finds black-forest-labs/FLUX.1-schnell
- Expands a short prompt ("my holiday on Lake Como") into a detailed image generation prompt
- Waits for the image
- Returns the image URL


## What else can it do?
Great for information gathering and summarization

🗞️🗞️ Compare news websites and create a table of shared stories with links
▶️ Find content creators' social profiles from YouTube videos
🛍️ Find a product's price range on Amazon
🚂 🚌 Gather public transportation travel options


## How is it built?
๐Ÿ—๏ธ Haystack โ†’ Agent execution logic
๐Ÿง  Google Gemini 2.5 Flash โ†’ Good and fast LLM with a generous free tier
๐Ÿ› ๏ธ Playwright MCP server โ†’ Browser automation tools: navigate, click, type, wait...

Even without vision capabilities, this setup can get quite far.
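For the curious, here is a minimal sketch of how these pieces can fit together. It assumes the Haystack Agent component plus the MCP and Google GenAI integrations; import paths and parameter names may differ slightly from your installed versions, so treat it as a sketch and check the notebook for the exact setup.

```python
from haystack.components.agents import Agent
from haystack.dataclasses import ChatMessage
# Integration packages (assumed): google-genai-haystack and the Haystack MCP integration
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator
from haystack_integrations.tools.mcp import MCPToolset, StdioServerInfo

# The Playwright MCP server exposes browser tools: navigate, click, type, wait...
browser_tools = MCPToolset(
    server_info=StdioServerInfo(command="npx", args=["@playwright/mcp@latest", "--headless"])
)

agent = Agent(
    chat_generator=GoogleGenAIChatGenerator(model="gemini-2.5-flash"),
    tools=browser_tools,
    system_prompt="You are a web-browsing agent. Use the browser tools to complete the user's task.",
)

result = agent.run(
    messages=[ChatMessage.from_user("Find the black-forest-labs/FLUX.1-schnell Space and return its URL.")]
)
print(result["messages"][-1].text)
```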


## Next steps
- Try a local open model
- Move from notebook to real deployment
- Incorporate vision

And you? Have you built something similar? What's in your stack?

anakin87 posted an update 7 days ago
Haystack can now see 👀

The latest release of the Haystack OSS LLM framework adds a long-requested feature: image support!

📓 Notebooks below

This isn't just about passing images to an LLM. We built several features to enable practical multimodal use cases.

What's new?
🧠 Support for multiple LLM providers: OpenAI, Amazon Bedrock, Google Gemini, Mistral, NVIDIA, OpenRouter, Ollama and more (support for Hugging Face API coming 🔜)
🎛️ Prompt template language to handle structured inputs, including images
📄 PDF and image converters
🔍 Image embedders using CLIP-like models
🧾 LLM-based extractor to pull text from images
🧩 Components to build multimodal RAG pipelines and Agents
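As a taste of the basic building block, here is a rough sketch of sending text plus an image to a chat LLM. It assumes the new ImageContent dataclass and the content_parts argument of ChatMessage.from_user; see the notebooks below for the exact, up-to-date API.

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack.components.generators.chat import OpenAIChatGenerator

# Load a local image (assumed constructor; the notebooks show the exact helpers and converters)
image = ImageContent.from_file_path("invoice.png")

# A user message mixing text and image parts (content_parts is an assumption of the new API)
message = ChatMessage.from_user(content_parts=["Summarize this document in one sentence.", image])

llm = OpenAIChatGenerator(model="gpt-4o-mini")
print(llm.run(messages=[message])["replies"][0].text)
```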


I had the chance to lead this effort with @sjrhuschlee (great collab).

📓 Below you can find two notebooks to explore the new features:
• Introduction to Multimodal Text Generation https://haystack.deepset.ai/cookbook/multimodal_intro
• Creating Vision+Text RAG Pipelines https://haystack.deepset.ai/tutorials/46_multimodal_rag

(🖼️ image by @bilgeyucel)
anakin87 posted an update about 1 month ago
🛡️ AI Guardrails with Open Language Models - Tutorial

📓 https://haystack.deepset.ai/cookbook/safety_moderation_open_lms

How do you ensure your AI application is safe from harmful or inappropriate user inputs?

This is a core requirement for real-world AI deployments. Luckily, several open Language Models are built specifically for safety moderation.

I've been exploring them and put together a hands-on tutorial using the Haystack framework to build your own AI guardrails.

In the notebook, you'll learn how to use and customize:
🔹 Meta Llama Guard (via Hugging Face API)
🔹 IBM Granite Guardian (via Ollama), which can also evaluate RAG-specific risk dimensions
🔹 Google ShieldGemma (via Ollama)
🔹 NVIDIA NemoGuard model family, including a model for topic control

You'll also see how to integrate content moderation into a 🔎 RAG pipeline.
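The core pattern is simple: run the user input through a safety model first, and only forward it to your main LLM or RAG pipeline if it is classified as safe. Below is a minimal sketch using Llama Guard through Haystack's Hugging Face API chat generator; the model name and output parsing are assumptions (the exact output format depends on the Llama Guard version), so see the notebook for the real setup.

```python
from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
from haystack.dataclasses import ChatMessage

# Llama Guard replies with "safe" or "unsafe" (plus the violated category codes)
guard = HuggingFaceAPIChatGenerator(
    api_type="serverless_inference_api",
    api_params={"model": "meta-llama/Llama-Guard-3-8B"},  # assumed model; any Llama Guard variant works
)

user_input = "How do I hotwire a car?"
verdict = guard.run(messages=[ChatMessage.from_user(user_input)])["replies"][0].text

if verdict.strip().startswith("unsafe"):
    print("Blocked by guardrail:", verdict)
else:
    print("Safe - forward the message to the main LLM / RAG pipeline")
```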
anakin87 posted an update about 2 months ago
🧰 Free up space on the Hub with super_squash_history 🧹

As you may know, the Hugging Face Hub has storage limits on private repos (100 GB for free users, 1 TB for PROs).

This weekend I did some cleanup on my private repos and went from 1.58 TB down to 1 GB. 😅

Besides deleting old, unused models, the main tool I used was a lesser-known command:
super_squash_history.

When you train a model, you often push multiple checkpoints to the Hub.
Each checkpoint = a commit.
A 2.6B model in BF16 is ~5 GB.
So 10 checkpoints = 50 GB. That adds up fast.

While full commit history can be useful for rollbacks, it's often unnecessary for older experiments where only the final model matters.

In these cases, you can use super_squash_history: it reduces your entire repo history to a single commit.

https://huggingface.co/docs/huggingface_hub/main/en/package_reference/hf_api#huggingface_hub.HfApi.super_squash_history

⚠️ super_squash_history is a non-revertible operation. Once squashed, the commit history cannot be retrieved.
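Usage is essentially a one-liner with huggingface_hub (the repo id below is a placeholder):

```python
from huggingface_hub import HfApi

api = HfApi()  # assumes you are already authenticated, e.g. via `huggingface-cli login`

# Collapse the entire history of the repo's main branch into a single commit.
# Irreversible: old commits (and the LFS files only they reference) become unreachable.
api.super_squash_history(repo_id="your-username/your-private-model", repo_type="model")
```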

Hope this is useful to others.
anakin87 posted an update 4 months ago
I trained a Language Model to schedule events with GRPO! 👑 🗓️

✍️ Blog post: https://huggingface.co/blog/anakin87/qwen-scheduler-grpo

I have been experimenting with GRPO lately.

I am fascinated by models learning from prompts and rewards - no example answers needed like in Supervised Fine-Tuning.

After the DeepSeek boom, everyone is trying GRPO with GSM8K or the Countdown Game...

I wanted a different challenge, like teaching a model to create a schedule from a list of events and priorities.

Choosing an original problem forced me to:
🤔 Think about the problem setting
🧬 Generate data
🤏 Choose the right base model
🏆 Design reward functions (and experience reward hacking) - a toy example follows below
🔄 Run multiple rounds of training, hoping the model would learn something.
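To give an idea of what a reward function can look like in this setting, here is a toy, hypothetical example that penalizes overlapping events in a generated schedule; the real reward functions (format, validity, score of the chosen events) are in the blog post and repo. GRPO trainers such as TRL expect roughly this shape: a batch of completions in, one float per completion out.

```python
import re

def no_overlap_reward(completions, **kwargs):
    """Toy reward: +1.0 for a schedule with no overlapping events, -0.5 per overlap.
    Assumes completions are strings with lines like 'Concert 09:00-10:30' (hypothetical format)."""
    rewards = []
    for text in completions:
        slots = sorted(
            (int(h1) * 60 + int(m1), int(h2) * 60 + int(m2))
            for h1, m1, h2, m2 in re.findall(r"(\d{2}):(\d{2})-(\d{2}):(\d{2})", text)
        )
        overlaps = sum(1 for prev, nxt in zip(slots, slots[1:]) if prev[1] > nxt[0])
        rewards.append(1.0 if slots and overlaps == 0 else -0.5 * overlaps)
    return rewards

print(no_overlap_reward(["Concert 09:00-10:30\nMeeting 10:00-11:00"]))  # [-0.5]
```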

A fun and rewarding 😄 experience.


I learned a lot, and I want to share it with you. 👇
✍️ Blog post: https://huggingface.co/blog/anakin87/qwen-scheduler-grpo
💻 Code: https://github.com/anakin87/qwen-scheduler-grpo
🤗 Hugging Face collection (dataset and model): anakin87/qwen-scheduler-grpo-680bcc583e817390525a8837
anakin87 posted an update 7 months ago
New Italian Small Language Models: Gemma Neogenesis collection 💎🌍🇮🇹

I am happy to release two new language models for the Italian language!

💪 Gemma 2 9B Neogenesis ITA
anakin87/gemma-2-9b-neogenesis-ita
Building on the impressive work by VAGO Solutions, I applied Direct Preference Optimization with a mix of Italian and English data.
Using Spectrum, I trained 20% of model layers.

📊 Evaluated on the Open ITA LLM leaderboard (mii-llm/open_ita_llm_leaderboard), this model achieves strong performance.
To beat it on this benchmark, you'd need a 27B model 😎


🤏 Gemma 2 2B Neogenesis ITA
anakin87/gemma-2-2b-neogenesis-ita
This smaller variant is fine-tuned from the original Gemma 2 2B it model by Google.
Through a combination of Supervised Fine-Tuning and Direct Preference Optimization, I trained 25% of the layers using Spectrum.

📈 Compared to the original model, it shows improved Italian proficiency, good for its small size.


Both models were developed during the recent #gemma competition on Kaggle.
📓 Training code: https://www.kaggle.com/code/anakin87/post-training-gemma-for-italian-and-beyond


🙏 Thanks to @FinancialSupport and mii-llm for the help during evaluation.
anakin87 posted an update 7 months ago
Hey, it has been a while... I was busy participating in the 💎 Gemma competition!

Here's the idea: Gemma open models have a large vocabulary size (256K), so improving them for a specific language or cultural context should be pretty affordable - no need for continued pre-training.

My submission: 💎🌍🇮🇹 Neogenesis - Post-Training Gemma for Italian and beyond
📓 Kaggle notebook: https://www.kaggle.com/code/anakin87/post-training-gemma-for-italian-and-beyond

In this notebook, I show how I improve the performance of Gemma 2 2B on Italian via Post-Training.
I believe this method is adaptable to other languages and model sizes.

๐˜’๐˜ฆ๐˜บ ๐˜š๐˜ต๐˜ฆ๐˜ฑ๐˜ด
๐Ÿ“Š Choose reference metrics
๐Ÿง‘โ€๐Ÿ”ฌ Data curation for Instruction Fine Tuning: identify existing datasets + generate synthetic data
๐Ÿ‹๏ธโ€โ™‚๏ธ Efficient Instruction Fine Tuning with Spectrum
๐Ÿง‘โ€๐Ÿ”ฌ Data curation for Preference Tuning: identify existing datasets + generate synthetic data
๐Ÿ‘๐Ÿ‘Ž Efficient Direct Preference Optimization with Spectrum
๐Ÿ“ˆ Evaluation
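As a rough illustration of the Spectrum idea - train only the most informative fraction of layers and freeze everything else - here is a simplified, hypothetical snippet. In the real workflow the trainable modules come from Spectrum's signal-to-noise-ratio analysis, not from a hard-coded list.

```python
import re
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")

# Hypothetical stand-in for the module names selected by Spectrum's SNR analysis (~25% of layers)
trainable_patterns = [r"layers\.(0|5|10|15|20)\.self_attn", r"layers\.(0|5|10|15|20)\.mlp"]

for name, param in model.named_parameters():
    param.requires_grad = any(re.search(p, name) for p in trainable_patterns)

total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable parameters: {trainable / total:.1%}")
# The partially frozen model can then go through standard SFT / DPO, e.g. with TRL.
```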


🤗 Hugging Face collection (with models and datasets): anakin87/gemma-neogenesis-67824b7bf13ac9cfe091fe2e

I'm also planning a 🎁 Gemma Giveaway (on LinkedIn - https://www.linkedin.com/in/stefano-fiorucci) in the next few days - sharing techniques, datasets, and models I used for my project... so stay tuned! 📻
anakin87 posted an update 8 months ago
Tulu 3 SFT Mixture by AllenAI is a massive, high-quality, multilingual dataset for fine-tuning Language Models.

Unfortunately, it was missing the "language" column.

I added it using good old fastText.

Check out the dataset here 👉 anakin87/tulu-3-sft-mixture-with-language
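The gist of the approach is a simple map over the dataset with fastText's language-identification model; the column names below are assumptions based on Tulu 3's chat format.

```python
import fasttext
from datasets import load_dataset

# lid.176.bin: fastText's pre-trained language identification model (downloaded separately)
lid_model = fasttext.load_model("lid.176.bin")

def add_language(example):
    # Concatenate the conversation turns into a single line of text
    text = " ".join(m["content"] for m in example["messages"]).replace("\n", " ")
    labels, _probs = lid_model.predict(text)
    example["language"] = labels[0].replace("__label__", "")  # e.g. "en", "it", "zh"
    return example

ds = load_dataset("allenai/tulu-3-sft-mixture", split="train")
ds = ds.map(add_language)
```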

anakin87 posted an update 9 months ago
🐝🐝🐝 A Swarm of Agents with Llama 3.2, GPT-4o mini and Claude 3.5 Sonnet

TL;DR: I reimplemented the Swarm concept using Haystack, but made it work with both open and proprietary models 💫

✍️ blog article: https://haystack.deepset.ai/blog/swarm-of-agents
📓 notebook: https://haystack.deepset.ai/cookbook/swarm


Some time ago OpenAI published Swarm: an educational framework for building multi-agent systems.

Their approach focuses on two main concepts:
・ Routines: Each agent follows specific 📜 instructions and uses 🛠️ tools to execute them.
・ Handoffs 🤝: Agents can transfer control to one another using tool/function calling (a minimal sketch follows below).
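In code, a handoff can be as simple as a tool whose return value names the next agent. A tiny, framework-agnostic sketch (the triage function below is a hypothetical stand-in for the LLM picking a tool via function calling):

```python
from typing import Callable, Dict

def transfer_to_sales_agent() -> str:
    """Tool exposed to the Triage Agent: hand the conversation to the Sales Agent."""
    return "Sales Agent"

def transfer_to_issues_agent() -> str:
    """Tool exposed to the Triage Agent: hand the conversation to the Issues and Repairs Agent."""
    return "Issues and Repairs Agent"

def triage(user_message: str, tools: Dict[str, Callable[[], str]]) -> str:
    # Hypothetical stand-in for the LLM: in the real system the model chooses the tool itself.
    chosen = "transfer_to_sales_agent" if "buy" in user_message.lower() else "transfer_to_issues_agent"
    return tools[chosen]()  # the tool's return value tells us which agent takes over

tools = {f.__name__: f for f in (transfer_to_sales_agent, transfer_to_issues_agent)}
print(triage("I'd like to buy a new toaster", tools))  # -> Sales Agent
```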


When I first read these ideas, I thought: simple but powerful! And they pair well with the recent unified tool support in Haystack.

🧑‍💻 So, I decided to re-implement these concepts using Haystack, and in just a few lines of code, I had a working prototype.

🆒 Bonus feature: this implementation isn't tied to a single model provider - different agents can be powered by different models!

I replicated the ACME customer service example from the original article, with 3 Agents:
๐Ÿ Triage Agent - Llama 3.2 running on Ollama
๐Ÿ Sales Agent - Anthropic Claude 3.5 Sonnet
๐Ÿ Issues and Repairs Agent - OpenAI GPT-4o mini


Want to see the full implementation and give it a try? Check out the blog post and notebook! โœจ

anakin87 posted an update 10 months ago
Ok, you're finally convinced that synthetic data works... ⚗️

Now you want to generate an instruction dataset for fine-tuning in a language other than English.
But how do you get started?

I explore how to do this with Magpie in my new article:
https://huggingface.co/blog/anakin87/multilingual-magpie

---

๐Ÿฆโ€โฌ› ๐–๐ก๐š๐ญ ๐ข๐ฌ ๐Œ๐š๐ ๐ฉ๐ข๐ž?

It's a recent technique for creating synthetic instruction datasets.

Magpie is based on a simple but ingenious idea 👇
If you prompt an instruction-tuned model with a pre-query template, you can make it generate a plausible user query/instruction.

Here's an example:
model: Llama-3-8B-Instruct
pre-query template: "<|begin_of_text|><|start_header_id|>user<|end_header_id|>"
generated user instruction: "What are some of the responsibilities of a commercial pilot?"

You can then feed this instruction back into the same model to get the assistant response.

By repeating this process, it's possible to generate large synthetic datasets with relatively little effort.
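A minimal sketch of the trick with transformers (sampling parameters are illustrative; for non-English data you would append the language tag to the template, as described below):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Pre-query template: the prompt stops right where a user turn would begin,
# so the model's continuation is a plausible user instruction.
pre_query = "<|begin_of_text|><|start_header_id|>user<|end_header_id|>"

inputs = tokenizer(pre_query, return_tensors="pt", add_special_tokens=False).to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=1.0)
instruction = tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(instruction.strip())  # e.g. "What are some of the responsibilities of a commercial pilot?"
```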

🪄 The authors demonstrate that using these datasets for Supervised Fine-Tuning (SFT) can yield strong performance, even competitive with the original instruct model.


🧗 Generating non-English data

Most Language Models are primarily trained on English texts, so they tend to produce data in English.

How can we overcome this?

Earlier approaches were complex or costly.

Then @mrm8488 found a simple solution: add the target language to the pre-query template.
For Spanish, the template becomes "<|begin_of_text|><|start_header_id|>user<|end_header_id|>spanish:".

This method works for Spanish and German!

โŒ Unfortunately, it does not work well for other languages (๐Ÿ‡ฎ๐Ÿ‡น, ๐Ÿ‡ณ๐Ÿ‡ฑ, ...)

๐Ÿ‘‡
anakin87 posted an update 11 months ago
๐Ÿ•ต๐Ÿป ๐€๐ ๐ž๐ง๐ญ๐ข๐œ ๐‘๐€๐† ๐ฐ๐ข๐ญ๐ก ๐Ÿฆ™ ๐‹๐ฅ๐š๐ฆ๐š 3.2

I was excited to explore Llama 3.2, but as a simple ๐Ÿ‡ช๐Ÿ‡บ EU guy, I don't have access to Meta's multimodal models ๐Ÿ˜ฟ

๐Ÿค” So I thought: why not challenge the small 3B text model with Agentic RAG?

๐ŸŽฏ The plan:
- Build a system that tries to answer questions using a knowledge base.
- If the documents don't contain the answer, use Web search for additional context.


Check out my experimental notebook here: 📓 https://colab.research.google.com/github/deepset-ai/haystack-cookbook/blob/main/notebooks/llama32_agentic_rag.ipynb


My stack:
๐Ÿ—๏ธ haystack (https://haystack.deepset.ai/): open-source LLM orchestration framework
๐Ÿฆ™ meta-llama/Llama-3.2-3B-Instruct
๐Ÿฆ†๐ŸŒ free DuckDuckGo API, integrated with Haystack
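The fallback logic at the heart of the notebook boils down to something like this hypothetical, self-contained sketch (the stub functions stand in for the real Haystack retriever, generator, and DuckDuckGo components):

```python
def retrieve_from_knowledge_base(question: str) -> list[str]:
    # Stand-in for the real document retriever over the knowledge base
    return []

def duckduckgo_search(question: str) -> list[str]:
    # Stand-in for the DuckDuckGo web search integration
    return ["Web result: Llama 3.2 was released by Meta in September 2024."]

def ask_llm(question: str, documents: list[str]) -> str:
    # Stand-in for the Llama 3.2 call; the real prompt tells the model
    # to answer "NO_ANSWER" when the documents are not sufficient.
    return "NO_ANSWER" if not documents else f"Answer based on: {documents[0]}"

def answer_with_fallback(question: str) -> str:
    answer = ask_llm(question, retrieve_from_knowledge_base(question))
    if answer.strip() == "NO_ANSWER":                            # knowledge base wasn't enough
        answer = ask_llm(question, duckduckgo_search(question))  # fall back to web search
    return answer

print(answer_with_fallback("When was Llama 3.2 released?"))
```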

✨ The results? Encouraging - a few months ago, this level of performance from a small model would've been unthinkable!
This probably reflects the impressive IFEval score of the model (comparable to Llama 3.1 8B).