Commit 70fcd58 · Parent: e81e82a
Delete README.md

README.md — DELETED
@@ -1,182 +0,0 @@
# MiniGPT-V

<font size='5'>**MiniGPT-v2: Large Language Model as a Unified Interface for Vision-Language Multi-task Learning**</font>

Jun Chen, Deyao Zhu, Xiaoqian Shen, Xiang Li, Zechun Liu, Pengchuan Zhang, Raghuraman Krishnamoorthi, Vikas Chandra, Yunyang Xiong☨, Mohamed Elhoseiny☨

☨equal last author

<a href='https://minigpt-v2.github.io'><img src='https://img.shields.io/badge/Project-Page-Green'></a> <a href='https://github.com/Vision-CAIR/MiniGPT-4/blob/main/MiniGPTv2.pdf'><img src='https://img.shields.io/badge/Paper-PDF-red'></a> <a href='https://minigpt-v2.github.io'><img src='https://img.shields.io/badge/Gradio-Demo-blue'></a> [](https://www.youtube.com/watch?v=atFCwV2hSY4)


<font size='5'>**MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models**</font>

Deyao Zhu*, Jun Chen*, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny

*equal contribution

<a href='https://minigpt-4.github.io'><img src='https://img.shields.io/badge/Project-Page-Green'></a> <a href='https://arxiv.org/abs/2304.10592'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a> <a href='https://huggingface.co/spaces/Vision-CAIR/minigpt4'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue'></a> <a href='https://huggingface.co/Vision-CAIR/MiniGPT-4'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue'></a> [](https://colab.research.google.com/drive/1OK4kYsZphwt5DXchKkzMBjYF6jnkqh4R?usp=sharing) [](https://www.youtube.com/watch?v=__tftoxpBAw&feature=youtu.be)

*King Abdullah University of Science and Technology*

## 💡 Get help - [Q&A](https://github.com/Vision-CAIR/MiniGPT-4/discussions/categories/q-a) or [Discord 💬](https://discord.gg/5WdJkjbAeE)

## News

[Oct.13 2023] Breaking! We release our first major update: MiniGPT-v2.

[Aug.28 2023] We now provide a Llama 2 version of MiniGPT-4.

## Online Demo

Click the image to chat with MiniGPT-v2 about your images
[](https://minigpt-v2.github.io/)

Click the image to chat with MiniGPT-4 about your images
[](https://minigpt-4.github.io)

## MiniGPT-v2 Examples



## MiniGPT-4 Examples

|   |   |
:-------------------------:|:-------------------------:
 | 
 | 

More examples can be found on the [project page](https://minigpt-4.github.io).

## Getting Started
### Installation

**1. Prepare the code and the environment**

Git clone our repository, create a python environment, and activate it via the following commands:

```bash
git clone https://github.com/Vision-CAIR/MiniGPT-4.git
cd MiniGPT-4
conda env create -f environment.yml
conda activate minigpt4
```

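Before downloading any weights, a quick sanity check like the one below can confirm the environment is usable. It assumes `environment.yml` installs PyTorch; the exact package set in your checkout may differ.

```bash
# Optional sanity check (assumes environment.yml installs PyTorch):
# prints the torch version and whether a CUDA device is visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```
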
**2. Prepare the pretrained LLM weights**

**MiniGPT-v2** is based on Llama 2 Chat 7B. For **MiniGPT-4**, we have both Vicuna V0 and Llama 2 versions.
Download the corresponding LLM weights from one of the following Hugging Face repositories by cloning it with git-lfs.

| Llama 2 Chat 7B | Vicuna V0 13B | Vicuna V0 7B |
:------------------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------:
[Download](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf/tree/main) | [Download](https://huggingface.co/Vision-CAIR/vicuna/tree/main) | [Download](https://huggingface.co/Vision-CAIR/vicuna-7b/tree/main)

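For example, fetching the Llama 2 Chat 7B weights with git-lfs could look roughly like the sketch below. Note that the Llama 2 repository is gated, so you would need a Hugging Face account with access to the model before the clone succeeds.

```bash
# Clone one of the weight repositories listed above with git-lfs.
# Llama 2 is gated: log in first (e.g. `huggingface-cli login`) and accept
# the license on the model page, otherwise the clone will be refused.
git lfs install
git clone https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
```
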
Then, set the variable *llama_model* in the model config file to the LLM weight path (a sketch of this edit follows the list below).

* For MiniGPT-v2, set the LLM path [here](minigpt4/configs/models/minigpt_v2.yaml#L15) at Line 14.

* For MiniGPT-4 (Llama2), set the LLM path [here](minigpt4/configs/models/minigpt4_llama2.yaml#L15) at Line 15.

* For MiniGPT-4 (Vicuna), set the LLM path [here](minigpt4/configs/models/minigpt4_vicuna0.yaml#L18) at Line 18.

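As an illustration, pointing MiniGPT-v2 at a locally cloned Llama 2 Chat 7B checkout could be scripted as below. The `llama_model` key name comes from the note above, but the exact line and quoting in your copy of the config may differ, so editing the file by hand is equally fine.

```bash
# Hypothetical example: point the MiniGPT-v2 config at the local LLM weights.
# Adjust the path to wherever you cloned Llama-2-7b-chat-hf.
# (GNU sed shown; on macOS use `sed -i ''`.)
sed -i 's|^\( *llama_model:\).*|\1 "/path/to/Llama-2-7b-chat-hf"|' \
  minigpt4/configs/models/minigpt_v2.yaml
```
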
**3. Prepare the pretrained model checkpoints**

Download the pretrained model checkpoints:

| MiniGPT-v2 (LLaMA-2 Chat 7B) |
|------------------------------|
| [Download](https://drive.google.com/file/d/1aVbfW7nkCSYx99_vCRyP1sOlQiWVSnAl/view?usp=sharing) |

For **MiniGPT-v2**, set the path to the pretrained checkpoint in the evaluation config file [eval_configs/minigptv2_eval.yaml](eval_configs/minigptv2_eval.yaml#L10) at Line 8.

| MiniGPT-4 (Vicuna 13B) | MiniGPT-4 (Vicuna 7B) | MiniGPT-4 (LLaMA-2 Chat 7B) |
|----------------------------|---------------------------|---------------------------------|
| [Download](https://drive.google.com/file/d/1a4zLvaiDBr-36pasffmgpvH5P7CKmpze/view?usp=share_link) | [Download](https://drive.google.com/file/d/1RY9jV0dyqLX-o38LrumkKRh6Jtaop58R/view?usp=sharing) | [Download](https://drive.google.com/file/d/11nAPjEok8eAGGEG1N2vXo3kBLCg0WgUk/view?usp=sharing) |

For **MiniGPT-4**, set the path to the pretrained checkpoint in the evaluation config file: [eval_configs/minigpt4_eval.yaml](eval_configs/minigpt4_eval.yaml#L10) at Line 8 for the Vicuna version, or [eval_configs/minigpt4_llama2_eval.yaml](eval_configs/minigpt4_llama2_eval.yaml#L10) for the Llama 2 version.

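As a rough sketch of this step, the snippet below downloads the MiniGPT-v2 checkpoint from the Google Drive link above using the third-party `gdown` tool and then points the eval config at it. `gdown` is not part of the provided environment, and the `ckpt` key name is an assumption about the eval config, so verify both against your checkout (or simply download in a browser and edit the yaml by hand).

```bash
# Hypothetical sketch: fetch the MiniGPT-v2 checkpoint and reference it in
# the eval config. Requires `pip install gdown`; the `ckpt:` key name is an
# assumption -- check eval_configs/minigptv2_eval.yaml before running.
mkdir -p checkpoints
gdown 1aVbfW7nkCSYx99_vCRyP1sOlQiWVSnAl -O checkpoints/minigptv2_checkpoint.pth
sed -i 's|^\( *ckpt:\).*|\1 "checkpoints/minigptv2_checkpoint.pth"|' \
  eval_configs/minigptv2_eval.yaml
```
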
### Launching Demo Locally

For MiniGPT-v2, run

```
python demo_v2.py --cfg-path eval_configs/minigptv2_eval.yaml --gpu-id 0
```

For MiniGPT-4 (Vicuna version), run

```
python demo.py --cfg-path eval_configs/minigpt4_eval.yaml --gpu-id 0
```

For MiniGPT-4 (Llama2 version), run

```
python demo.py --cfg-path eval_configs/minigpt4_llama2_eval.yaml --gpu-id 0
```

To save GPU memory, the LLM loads in 8-bit by default, with a beam search width of 1.
This configuration requires about 23G of GPU memory for the 13B LLM and 11.5G for the 7B LLM.
For more powerful GPUs, you can run the model
in 16-bit by setting `low_resource` to `False` in the relevant config file (a sketch of this edit follows the list):

* MiniGPT-v2: [minigptv2_eval.yaml](eval_configs/minigptv2_eval.yaml#6)

* MiniGPT-4 (Llama2): [minigpt4_llama2_eval.yaml](eval_configs/minigpt4_llama2_eval.yaml#6)

* MiniGPT-4 (Vicuna): [minigpt4_eval.yaml](eval_configs/minigpt4_eval.yaml#6)

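For instance, switching MiniGPT-v2 to 16-bit could look like the line below. The `low_resource` key name comes from the note above; check the exact line in your config before scripting the change, or edit it manually.

```bash
# Hypothetical example: disable 8-bit loading for MiniGPT-v2 (needs more VRAM).
sed -i 's|^\( *low_resource:\).*|\1 False|' eval_configs/minigptv2_eval.yaml
```
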
Thanks to [@WangRongsheng](https://github.com/WangRongsheng), you can also run MiniGPT-4 on [Colab](https://colab.research.google.com/drive/1OK4kYsZphwt5DXchKkzMBjYF6jnkqh4R?usp=sharing).


### Training
For training details of MiniGPT-4, check [here](MiniGPT4_Train.md).


## Acknowledgement

+ [BLIP2](https://huggingface.co/docs/transformers/main/model_doc/blip-2) The model architecture of MiniGPT-4 follows BLIP-2. Don't forget to check out this great open-source work if you are not already familiar with it!
+ [Lavis](https://github.com/salesforce/LAVIS) This repository is built upon Lavis!
+ [Vicuna](https://github.com/lm-sys/FastChat) The fantastic language ability of Vicuna with only 13B parameters is just amazing. And it is open-source!
+ [LLaMA](https://github.com/facebookresearch/llama) The strong open-source LLaMA 2 language model.

If you're using MiniGPT-4/MiniGPT-v2 in your research or applications, please cite using this BibTeX:

```bibtex
@article{Chen2023minigpt,
  title={MiniGPT-v2: Large Language Model as a Unified Interface for Vision-Language Multi-task Learning},
  author={Chen, Jun and Zhu, Deyao and Shen, Xiaoqian and Li, Xiang and Liu, Zechun and Zhang, Pengchuan and Krishnamoorthi, Raghuraman and Chandra, Vikas and Xiong, Yunyang and Elhoseiny, Mohamed},
  journal={github},
  year={2023}
}

@article{zhu2023minigpt,
  title={MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models},
  author={Zhu, Deyao and Chen, Jun and Shen, Xiaoqian and Li, Xiang and Elhoseiny, Mohamed},
  journal={arXiv preprint arXiv:2304.10592},
  year={2023}
}
```

## License
This repository is under the [BSD 3-Clause License](LICENSE.md).
Much of the code is based on [Lavis](https://github.com/salesforce/LAVIS), which carries its own
BSD 3-Clause License [here](LICENSE_Lavis.md).