| The goal of the ML.ENERGY Leaderboard is to give people a sense of how much **energy** LLMs would consume, and the complex tradeoffs between energy, system performance, and user experience. | |
| The code for the leaderboard, backing data, and scripts for benchmarking are all open-source in our [repository](https://github.com/ml-energy/leaderboard). | |
| We'll see you at the [Discussion board](https://github.com/ml-energy/leaderboard/discussions), where you can ask questions, suggest improvement ideas, or just discuss leaderboard results! | |
| ## LLM Text Generation Benchmark | |
| This category includes LLM Chat, LLM Code, and VLM Visual Chat. | |
| ### Software | |
| - CUDA 12.4 | |
| - [vLLM](https://github.com/vllm-project/vllm) 0.5.4 -- For inference serving | |
| - [Zeus](https://ml.energy/zeus) -- For GPU time and energy measurement | |
| ### Hardware | |
| - NVIDIA A100-SXM4-40GB GPU (AWS p4d.24xlarge) | |
| - NVIDIA H100 80GB HBM3 GPU (AWS p5.48xlarge) | |
| ### Data | |
| | Task | Dataset | | |
| | -------------- | --------------- | | |
| | LLM Chat | 500 human prompts from [ShareGPT](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered) | | |
| | LLM Code | HumanEval+ from [EvalPlus](https://github.com/evalplus/evalplus) | | |
| | VLM Visual Chat | 500 image and prompt pairs from the [LLaVA instruction dataset](https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K) | | |
| ### Benchmarking process | |
| We are interested in measuring the *steady state* of the online serving system, excluding the ramp-up period (when the server is gradually loaded with requests) and the cooldown period (when the server is draining its queue). | |
| Therefore, we submit all the requests at the beginning of our benchmark while limiting the serving system's maximum batch size to create a steady-state serving system. | |
| The steady state ends when the serving system's queue length reaches zero. We collect the timing and energy consumption of each batch during the steady state to derive our metrics. | |
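
As a rough illustration of deriving metrics from per-batch measurements, the sketch below aggregates a list of batch records into a few summary numbers. The `BatchRecord` schema, the function name, and the three metrics shown are assumptions made for this example; they are not taken from the leaderboard's code, and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class BatchRecord:
    """One steady-state batch (hypothetical schema, not the leaderboard's)."""

    duration_s: float      # wall-clock time spent on the batch, in seconds
    energy_j: float        # GPU energy consumed during the batch, in joules
    generated_tokens: int  # output tokens produced by the batch


def steady_state_metrics(batches: list[BatchRecord]) -> dict[str, float]:
    """Aggregate per-batch time and energy into summary metrics."""
    total_time = sum(b.duration_s for b in batches)
    total_energy = sum(b.energy_j for b in batches)
    total_tokens = sum(b.generated_tokens for b in batches)
    if total_time <= 0 or total_tokens <= 0:
        raise ValueError("need at least one batch with nonzero time and tokens")
    return {
        "avg_power_w": total_energy / total_time,
        "energy_per_token_j": total_energy / total_tokens,
        "tokens_per_second": total_tokens / total_time,
    }


# Made-up numbers, for illustration only.
example = [
    BatchRecord(duration_s=0.5, energy_j=150.0, generated_tokens=60),
    BatchRecord(duration_s=0.5, energy_j=150.0, generated_tokens=40),
]
metrics = steady_state_metrics(example)
```

Summing over batches before dividing weights each batch by its actual duration and token count, which is usually preferable to averaging per-batch ratios.
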
| The maximum batch size configuration is varied in order to change the system's utilization. | |
| Beyond a certain maximum batch size, the actual batch size stops increasing because of memory constraints. At that point the system is overloaded, so we stop increasing the maximum batch size. | |
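
The stopping rule for the sweep can be sketched as follows. This is a minimal sketch under stated assumptions: `measure_actual_batch_size` is a hypothetical callback that runs one benchmark with a given limit and reports the average batch size the server actually achieved, and the candidate limits and the toy runner are invented for illustration.

```python
from typing import Callable


def sweep_max_batch_size(
    measure_actual_batch_size: Callable[[int], float],
    candidates: list[int],
) -> list[tuple[int, float]]:
    """Try increasing limits and stop once the observed batch size plateaus.

    `measure_actual_batch_size(limit)` is a hypothetical callback, not a
    leaderboard or vLLM API. It should run one benchmark with the given
    maximum batch size and return the average batch size actually reached.
    """
    results: list[tuple[int, float]] = []
    previous = 0.0
    for limit in sorted(candidates):
        actual = measure_actual_batch_size(limit)
        if results and actual <= previous:
            # Raising the limit no longer raises the actual batch size,
            # so the server is memory-bound and the sweep ends here.
            break
        results.append((limit, actual))
        previous = actual
    return results


# Toy stand-in for a real benchmark run: memory caps the batch at 48 requests.
def fake_run(limit: int) -> float:
    return float(min(limit, 48))


sweep = sweep_max_batch_size(fake_run, [8, 16, 32, 64, 128, 256])
```

With real measurements the observed batch size is noisy, so a small tolerance in the plateau check would likely be needed in place of the strict comparison used here.
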
| ## Diffusion Benchmark | |
| This category includes Diffusion text to image, text to video, and image to video. | |
| ### Software | |
| - CUDA 12.4 | |
| - [HuggingFace Diffusers](https://github.com/huggingface/diffusers) 0.29.2 -- For inference | |
| - [Zeus](https://ml.energy/zeus) -- For GPU time and energy measurement | |
| ### Hardware | |
| - NVIDIA A100-SXM4-40GB GPU (AWS p4d.24xlarge) | |
| - NVIDIA H100 80GB HBM3 GPU (AWS p5.48xlarge) | |
| ### Data | |
| | Task | Dataset | | |
| | -------------- | --------------- | | |
| | Text to image | Prompts from [PartiPrompts](https://huggingface.co/datasets/nateraw/parti-prompts) | | |
| | Text to video | Captions from [ShareGPT4Video](https://huggingface.co/datasets/ShareGPT4Video/ShareGPT4Video) | | |
| | Image to video | Caption and first frame pairs from [ShareGPT4Video](https://huggingface.co/datasets/ShareGPT4Video/ShareGPT4Video) | | |
| ### Benchmarking process | |
| Since the amount of computation a Diffusion model performs is largely independent of the input, we sample batches from each dataset and run them back-to-back to obtain stable measurements. | |
| The batch size is increased (in powers of two) until the GPU runs out of memory. | |
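
The batch size sweep described above might look like the sketch below. Several details are assumptions made for this example: the starting batch size of 1, the `max_batch_size` safety cap, and the hypothetical `run_batch` callback. Python's built-in `MemoryError` stands in for the framework's GPU out-of-memory exception (in PyTorch that would typically be `torch.cuda.OutOfMemoryError`), so the example runs without a GPU.

```python
from typing import Callable


def sweep_batch_sizes(
    run_batch: Callable[[int], float],
    max_batch_size: int = 1024,
) -> dict[int, float]:
    """Double the batch size until `run_batch` raises an out-of-memory error.

    `run_batch(batch_size)` is a hypothetical callback that runs one batch
    and returns its energy in joules. MemoryError is a stand-in for the
    framework's GPU out-of-memory exception.
    """
    energy_per_item: dict[int, float] = {}
    batch_size = 1  # assumed starting point
    while batch_size <= max_batch_size:
        try:
            energy_j = run_batch(batch_size)
        except MemoryError:
            break  # the previous batch size was the largest that fits
        energy_per_item[batch_size] = energy_j / batch_size
        batch_size *= 2
    return energy_per_item


# Toy runner: fixed overhead plus a per-item cost, "OOM" above 8 items.
def fake_run_batch(batch_size: int) -> float:
    if batch_size > 8:
        raise MemoryError("simulated GPU out of memory")
    return 100.0 + 50.0 * batch_size


per_item = sweep_batch_sizes(fake_run_batch)
```

In the toy runner the fixed overhead is amortized over more items as the batch grows, so energy per item falls with batch size. Real models may behave differently.
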
| ## The ML.ENERGY Initiative | |
| Are you interested in learning more about our work on ML energy measurement and optimization? | |
| Meet us at the [**ML.ENERGY Initiative**](https://ml.energy) homepage! | |
| --- | |
| ## License | |
| This leaderboard is a research preview intended for non-commercial use only. | |
| Model weights were taken as is from the Hugging Face Hub if available and are subject to their licenses. | |
| Please direct inquiries and reports of potential violations to Jae-Won Chung. | |
| ## Contact | |
| Please direct general questions and issues related to the leaderboard to our GitHub repository's [discussion board](https://github.com/ml-energy/leaderboard/discussions). | |
| You can find the ML.ENERGY Initiative members on [our homepage](https://ml.energy#members). | |
| If you need direct communication, please email [email protected]. | |