Aymeric Roucher PRO
m-ric
AI & ML interests
Used to work at Hugging Face 🤗
Recent Activity
- upvoted an article 5 days ago: Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries
- upvoted an article 14 days ago: We Got Claude to Fine-Tune an Open Source LLM
- liked a model about 2 months ago: LightningRodLabs/Trump-Forecaster
Organizations
Scaling Laws
- Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
  Paper • 2206.10789 • Published • 4
- Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
  Paper • 2401.00448 • Published • 30
- Training Compute-Optimal Large Language Models
  Paper • 2203.15556 • Published • 11
- Scaling Laws for Neural Language Models
  Paper • 2001.08361 • Published • 10
LLM-as-a-judge
- Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
  Paper • 2306.05685 • Published • 41
- ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
  Paper • 2312.10003 • Published • 44
- Leveraging Large Language Models for NLG Evaluation: A Survey
  Paper • 2401.07103 • Published • 4
- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
  Paper • 2310.08491 • Published • 57
Agents
- DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
  Paper • 2310.03714 • Published • 37
- ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
  Paper • 2312.10003 • Published • 44
- AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework
  Paper • 2308.08155 • Published • 11
- GAIA: a benchmark for General AI Assistants
  Paper • 2311.12983 • Published • 247
Grammar
- Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning
  Paper • 2305.13971 • Published • 5
- Autoregressive Entity Retrieval
  Paper • 2010.00904 • Published • 1
- PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models
  Paper • 2109.05093 • Published • 1
LLM foundations
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 107
- Textbooks Are All You Need
  Paper • 2306.11644 • Published • 154
- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 112
- Large Language Models Struggle to Learn Long-Tail Knowledge
  Paper • 2211.08411 • Published • 3
Earth
- jonathan-roberts1/EuroSAT
  Viewer • Updated • 27k • 237 • 3
- Major TOM Viewer
  Running on L4 • Featured • 84
  Quick View of Samples in the MajorTOM-Core Dataset
- Major-TOM/Core-S2L2A
  Viewer • Updated • 4.49M • 5.4k • 66
- ClimateQ&A
  Runtime error • Featured • 150
  Ask any questions to the IPCC and IPBES reports
Mother of all Training Clusters
https://github.com/NousResearch/DisTrO/blob/main/A_Preliminary_Report_on_DisTrO.pdf
Could be useful one day
Spinning Up in LLMs
- Lost in the Middle: How Language Models Use Long Contexts
  Paper • 2307.03172 • Published • 44
- Efficient Estimation of Word Representations in Vector Space
  Paper • 1301.3781 • Published • 8
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 26
- Attention Is All You Need
  Paper • 1706.03762 • Published • 120
RAG
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
  Paper • 2005.11401 • Published • 14
- RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture
  Paper • 2401.08406 • Published • 38
- BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
  Paper • 2104.08663 • Published • 3
- Precise Zero-Shot Dense Retrieval without Relevance Labels
  Paper • 2212.10496 • Published • 5
Vision
Interpretability - understanding LLMs
- Linearity of Relation Decoding in Transformer Language Models
  Paper • 2308.09124 • Published • 2
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 109
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 107
- Mission: Impossible Language Models
  Paper • 2401.06416 • Published • 3
Optimization Mechanics
- GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
  Paper • 2210.17323 • Published • 10
- LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
  Paper • 2208.07339 • Published • 5
- Hydragen: High-Throughput LLM Inference with Shared Prefixes
  Paper • 2402.05099 • Published • 20
- Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
  Paper • 2401.10774 • Published • 60
Open-source AI Releases - August '24
- nvidia/Mistral-NeMo-Minitron-8B-Base
  Text Generation • 8B • Updated • 3.12k • 179
- Instant SmolLM
  Running • 54
  Run SmolLM-360M-Instruct in realtime with MLC WebLLM
- black-forest-labs/FLUX.1-schnell
  Text-to-Image • Updated • 752k • 4.75k
- FLUX.1 [Schnell]
  Running on Zero • Featured • 5.05k
  Generate images from text prompts with FLUX.1 Schnell
GUI Agents
- Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
  Paper • 2310.11441 • Published • 29
- UI-TARS: Pioneering Automated GUI Interaction with Native Agents
  Paper • 2501.12326 • Published • 64
- GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices
  Paper • 2406.08451 • Published • 26
- GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents
  Paper • 2406.10819 • Published • 2