AI & ML interests

Open science and open source

Recent Activity

Nymbo
posted an update about 2 months ago
Anyone know how to reset Claude web's MCP config? I connected mine when the HF MCP server first released, with just the default example spaces added. I've since added lots of other MCP spaces, but Claude.ai doesn't update the available tools... "Disconnecting" the HF integration does nothing, and deleting it and adding it again does nothing.

Refreshing tools works fine in VS Code because I can manually restart the server defined in mcp.json, but Claude.ai has no such option. Anyone got any ideas?
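For reference, the VS Code side lives in mcp.json, which looks roughly like this. A minimal sketch, assuming the HF MCP server's HTTP endpoint at https://huggingface.co/mcp; the "hf-mcp-server" name and the token placeholder are illustrative, so check VS Code's MCP docs for the exact schema:

```json
{
  "servers": {
    "hf-mcp-server": {
      "type": "http",
      "url": "https://huggingface.co/mcp",
      "headers": {
        "Authorization": "Bearer <YOUR_HF_TOKEN>"
      }
    }
  }
}
```

Restarting this server entry is what re-syncs the tool list in VS Code; Claude web exposes no equivalent.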
Nymbo
posted an update 3 months ago
Haven't seen this posted anywhere - Llama-3.3-8B-Instruct is available on the new Llama API. Is this a new model or did someone mislabel Llama-3.1-8B?
Nymbo
posted an update 4 months ago
PSA for anyone using Nymbo/Nymbo_Theme or Nymbo/Nymbo_Theme_5 in a Gradio space ~

Both of these themes have been updated to fix some of the long-standing inconsistencies that crept in with the transition to Gradio v5. Textboxes are no longer bright green, and inline code is readable now! Both themes are now visually identical across Gradio versions.

If your space is already using one of these themes, you just need to restart your space to get the latest version. No code changes needed.
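For anyone wiring a theme up for the first time: Gradio accepts Hub themes by repo name. A minimal sketch; the demo contents are just placeholders:

```python
import gradio as gr

# Pull the theme from the Hugging Face Hub by its repo name.
# Restarting the space re-fetches the latest published version.
with gr.Blocks(theme="Nymbo/Nymbo_Theme") as demo:
    gr.Textbox(label="Input")  # should no longer render bright green
    gr.Markdown("Inline `code` should be readable now.")

demo.launch()
```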
Yosun
posted an update 4 months ago
Is it possible to pay for more ZeroGPU usage quota?
jxm
posted an update 7 months ago
New state-of-the-art BERT-size retrieval model: *cde-small-v2* 🥳🍾

Hi everyone! We at Cornell are releasing a new retrieval model this week. It uses the contextual embeddings framework, is based on the ModernBERT backbone, and gets state-of-the-art results on the MTEB benchmark for its model size (140M parameters). cde-small-v2 scores an average of 65.6 across the 56 MTEB datasets and improves on our previous model in *every* task domain (retrieval, classification, etc.).

We made a lot of changes to make this model work. First of all, ModernBERT has a better tokenizer, which probably helped things work out of the box. We also followed the principles from the CDE paper and used harder clusters and better hard-negative filtering, which showed a small performance improvement. And we made a few small changes that have been shown to work on larger models: we disabled weight decay, masked out the prefix tokens during pooling, and added a residual connection from the first stage to the second stage for better gradient flow.

We're still looking for a compute sponsor to help us scale CDE to larger models. Since it's now state-of-the-art at the 100M-parameter scale, it seems a reasonable bet that we could train a state-of-the-art large model if we had the GPUs. If you're interested in helping with this, please reach out!

Here's a link to the model: jxm/cde-small-v2
And here's a link to the paper: Contextual Document Embeddings (2410.02525)
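Rough usage sketch with sentence-transformers, following the two-stage contextual interface as the model card describes it; the prompt_name and dataset_embeddings arguments are my reading of that card, so double-check jxm/cde-small-v2 for the exact call:

```python
from sentence_transformers import SentenceTransformer

# CDE is two-stage: a first stage embeds a sample of the corpus into
# "dataset embeddings" that condition the second-stage encoder.
model = SentenceTransformer("jxm/cde-small-v2", trust_remote_code=True)

corpus = [
    "ModernBERT is a BERT-style encoder with a modernized training recipe.",
    "MTEB aggregates 56 embedding datasets across several task domains.",
]

# Stage 1: context embeddings from (a sample of) the target corpus.
dataset_embeddings = model.encode(
    corpus, prompt_name="document", convert_to_tensor=True
)

# Stage 2: embed documents and queries conditioned on that context.
doc_emb = model.encode(
    corpus, prompt_name="document",
    dataset_embeddings=dataset_embeddings, convert_to_tensor=True,
)
query_emb = model.encode(
    ["what is mteb?"], prompt_name="query",
    dataset_embeddings=dataset_embeddings, convert_to_tensor=True,
)

print(model.similarity(query_emb, doc_emb))  # ranking scores per document
```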
akhaliq
posted an update 8 months ago
Google drops Gemini 2.0 Flash Thinking

A new experimental model that unlocks stronger reasoning capabilities and shows its thoughts. The model plans (with its thoughts visible), can solve complex problems at Flash speeds, and more.

now available in anychat, try it out: https://huggingface.co/spaces/akhaliq/anychat
lunarflu
posted an update about 1 year ago
Cool things this week from @huggingface!

🌎 AI math olympiad winner NuminaMath is here!
🤗 Announcing the new Hugging Face and KerasNLP integration
✨ UI overhaul for HF tokens!
🧊 Embed our dataset viewer on any webpage!

https://huggingface.co/blog/winning-aimo-progress-prize
https://huggingface.co/blog/keras-nlp-integration
https://huggingface.co/settings/tokens
https://x.com/julien_c/status/1812099420726456457

Check out the full list on our discord! 👇
https://discord.com/invite/JfAtkvEtRb
lunarflu
posted an update about 1 year ago
By popular demand, HF activity tracker v1.0 is here! 📊 let's build it together! 🤗

Lots of things to improve, feel free to open PRs in the community tab!

good PR ideas:
- track more types of actions that include date+time
- bigger plot (a starting sketch below)
- track discord activity too 🤯
- link github? ⚡

https://huggingface.co/spaces/huggingface-projects/LevelBot
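On the plot ideas: the heart of the tracker is binning timestamped actions per day and drawing them. A toy sketch with invented events (the real space pulls actual HF activity; the action names here are hypothetical):

```python
from collections import Counter
from datetime import date

import matplotlib.pyplot as plt

# Hypothetical (user, action, day) events standing in for real HF activity.
events = [
    ("nymbo", "post", date(2024, 7, 1)),
    ("lunarflu", "like", date(2024, 7, 1)),
    ("lunarflu", "post", date(2024, 7, 2)),
    ("jxm", "model_upload", date(2024, 7, 3)),
]

counts = Counter(day for _, _, day in events)
days = sorted(counts)

fig, ax = plt.subplots(figsize=(12, 4))  # "bigger plot": widen the figure
ax.bar([d.isoformat() for d in days], [counts[d] for d in days])
ax.set_xlabel("day")
ax.set_ylabel("actions")
fig.savefig("activity.png")
```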
akhaliq
posted an update about 1 year ago
Phased Consistency Model

Phased Consistency Model (2405.18407)

The consistency model (CM) has recently made significant progress in accelerating the generation of diffusion models. However, its application to high-resolution, text-conditioned image generation in the latent space (a.k.a. LCM) remains unsatisfactory. In this paper, we identify three key flaws in the current design of LCM. We investigate the reasons behind these limitations and propose the Phased Consistency Model (PCM), which generalizes the design space and addresses all identified limitations. Our evaluations demonstrate that PCM significantly outperforms LCM across 1-16 step generation settings. While PCM is specifically designed for multi-step refinement, it achieves 1-step generation results that are superior or comparable to previous state-of-the-art methods specifically designed for 1-step generation. Furthermore, we show that PCM's methodology is versatile and applicable to video generation, enabling us to train a state-of-the-art few-step text-to-video generator.
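To make the 1-16 step setting concrete, here is few-step sampling with the LCM-LoRA baseline that PCM is evaluated against. A sketch of the baseline only, not PCM itself (PCM's released checkpoints ship their own scheduler setup):

```python
import torch
from diffusers import DiffusionPipeline, LCMScheduler

# Few-step latent consistency sampling: the LCM baseline PCM improves on.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

image = pipe(
    "a photo of an astronaut riding a horse",
    num_inference_steps=4,  # the few-step regime: 1-16 steps
    guidance_scale=1.0,     # CFG is effectively disabled for consistency sampling
).images[0]
image.save("astronaut.png")
```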
lunarflu
posted an update about 1 year ago
Weekly highlights for the HF ecosystem!

🚀 Phi 3
🦅 Falcon VLM
🤗 sentence-transformers v3.0 is here! Train and finetune embedding models with multi-GPU training, bf16 support, loss logging, callbacks, and more (see the sketch below)!
🥳 Gradio launch event 6/6! We're launching 1.0 versions of two new client libraries (Python and JS) for programmatically querying Gradio apps, plus several new features making it easier to use Gradio apps in production!
✨ Tools now available in HuggingChat! Use any AI apps built by the community! 🔥
🧊 ML for 3D Course Unit 3 is here! Covering Gaussian splatting, how it fits into the generative 3D pipeline, and hands-on code to build your own demo!

See the full list here!
https://discord.com/channels/879548962464493619/897387888663232554/1245036889539612764
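Since the sentence-transformers v3.0 release is the most code-facing item above, here is a minimal finetuning sketch with the new trainer; the base model and the toy pairs are placeholders:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Toy (anchor, positive) pairs; swap in a real dataset.
train_dataset = Dataset.from_dict({
    "anchor": ["What is the capital of France?", "Who wrote Hamlet?"],
    "positive": ["Paris is the capital of France.", "Hamlet was written by Shakespeare."],
})

args = SentenceTransformerTrainingArguments(
    output_dir="finetuned-model",
    num_train_epochs=1,
    bf16=True,         # v3 feature; needs bf16-capable hardware
    logging_steps=10,  # v3 loss logging
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=MultipleNegativesRankingLoss(model),  # in-batch negatives
)
trainer.train()  # multi-GPU works via accelerate/torchrun launchers
```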
lunarflu
posted an update about 1 year ago
Cooking up something... anyone interested in a daily activity tracker for HF?
akhaliq
posted an update about 1 year ago
Chameleon

Mixed-Modal Early-Fusion Foundation Models

Chameleon: Mixed-Modal Early-Fusion Foundation Models (2405.09818)

We present Chameleon, a family of early-fusion, token-based mixed-modal models capable of understanding and generating images and text in any arbitrary sequence. We outline a stable training approach from inception, an alignment recipe, and an architectural parameterization tailored for the early-fusion, token-based, mixed-modal setting. The models are evaluated on a comprehensive range of tasks, including visual question answering, image captioning, text generation, image generation, and long-form mixed-modal generation. Chameleon demonstrates broad and general capabilities: it achieves state-of-the-art performance on image captioning tasks, outperforms Llama-2 on text-only tasks while being competitive with models such as Mixtral 8x7B and Gemini-Pro, and performs non-trivial image generation, all in a single model. It also matches or exceeds the performance of much larger models, including Gemini Pro and GPT-4V, according to human judgments on a new long-form mixed-modal generation evaluation, where either the prompt or the outputs contain mixed sequences of both images and text. Chameleon marks a significant step forward in unified modeling of full multimodal documents.
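"Early fusion" here means images are discretized into tokens and interleaved with text tokens in one sequence for a single transformer. A toy illustration; the token IDs, vocab sizes, and tokenizers below are invented, not the paper's:

```python
# Toy illustration of early fusion: text and image tokens share one sequence.
TEXT_VOCAB = 32_000      # hypothetical text vocabulary size
IMAGE_CODEBOOK = 8_192   # hypothetical VQ image codebook size

def text_tokens(words):
    # Stand-in for a real subword tokenizer.
    return [hash(w) % TEXT_VOCAB for w in words]

def image_tokens(patch_codes):
    # VQ codes are offset past the text vocab so both live in one ID space.
    assert all(c < IMAGE_CODEBOOK for c in patch_codes)
    return [TEXT_VOCAB + c for c in patch_codes]

# One mixed-modal sequence: text, then an image, then more text.
sequence = (
    text_tokens("a photo of".split())
    + image_tokens([17, 942, 3055, 7781])  # discretized image patches
    + text_tokens("and a caption".split())
)
print(sequence)  # a single token stream for one autoregressive transformer
```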