|
# FashionM3 Project Setup Guide |
|
|
|
This guide explains how to set up and run the FashionM3 project, which involves downloading the FashionRec dataset, configuring the environment, and starting the servers in a specific order.
|
|
|
**Important Note on Hugging Face Space Deployment:** |
|
Due to the high computational requirements of the underlying Vision-Language Model (VLM) and associated services, this Hugging Face Space currently serves primarily as a **code repository**. The full interactive demo **cannot be run directly on the free tier of Hugging Face Spaces**. Please follow the instructions below to set up and run the FashionM3 application locally on your machine. |
|
|
|
## Project Overview |
|
This work is introduced in the paper *FashionM3: Multimodal, Multitask, and Multiround Fashion Assistant based on Unified Vision-Language Model* (https://arxiv.org/abs/2504.17826).
|
|
|
## Prerequisites |
|
|
|
- Python 3.10 or higher |
|
- `pip` for installing dependencies |
|
- A working proxy server (if needed, e.g., at `http://127.0.0.1:10809`) |
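
You can quickly confirm the first two prerequisites from a terminal:

```bash
# Confirm the Python toolchain meets the requirements
python --version   # should report 3.10 or higher
pip --version      # should resolve without errors
```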
|
|
|
## Step 1: Download the FashionRec Dataset |
|
|
|
1. Download the FashionRec dataset from https://huggingface.co/datasets/Anony100/FashionRec (or use the CLI sketch after this list).
|
2. Extract the dataset to a directory of your choice (e.g., `/path/to/FashionRec`). |
|
3. Note the absolute path to the dataset directory, as it will be used in the `.env` file. |
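
As an alternative to downloading through the browser, here is a minimal sketch using the `huggingface-cli` tool (this assumes you have `huggingface_hub` installed; adjust the target directory to your setup):

```bash
# Download the dataset straight into a local directory via huggingface_hub's CLI
huggingface-cli download Anony100/FashionRec \
    --repo-type dataset \
    --local-dir /path/to/FashionRec
```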
|
|
|
## Step 2: Configure the Environment File |
|
|
|
Create a `.env` file in the project root directory with the following content: |
|
|
|
```plaintext
# Chainlit server port
CHAINLIT_PORT=8888

# Proxy configuration (update or remove if not needed)
PROXY=http://127.0.0.1:10809

# API keys (replace with your own keys)
OPENAI_API_KEY=your_openai_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here

# Path to the FashionRec dataset (update to your dataset path)
FASHION_DATA_ROOT=/path/to/FashionRec

# Directory for generated images
GEN_IMG_DIR=./generated_images
```
|
|
|
### Notes: |
|
- Replace `your_openai_api_key_here` and `your_gemini_api_key_here` with your actual OpenAI and Google Gemini API keys. |
|
- Set `FASHION_DATA_ROOT` to the absolute path of the FashionRec dataset (e.g., `/home/user/data/FashionRec`).

- Update `PROXY` to match your proxy server, or remove it if no proxy is used.
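
Before moving on, it is worth sanity-checking that the values in `.env` resolve on your machine. A minimal sketch (this assumes a simple `KEY=VALUE` file with no quoting, as in the example above):

```bash
# Export the .env entries into the current shell, then verify the dataset path
set -a; source .env; set +a
if [ -d "$FASHION_DATA_ROOT" ]; then
    echo "FashionRec dataset found at $FASHION_DATA_ROOT"
else
    echo "FASHION_DATA_ROOT does not point to a directory: $FASHION_DATA_ROOT"
fi
```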
|
|
|
## Step 3: Install Dependencies |
|
|
|
Install the required Python packages: |
|
```bash
pip install -r requirements.txt
```
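
To avoid clashing with system-wide packages, you may prefer installing into a virtual environment first, e.g.:

```bash
# Optional: create and activate an isolated environment before installing
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```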
|
|
|
## Step 4: Run the Application |
|
Follow these steps to start the FashionM3 application: |
|
### 1. Start the Fashion VLM MCP Server: |
|
Run the MCP server for the fashion vision-language model: |
|
```bash
python mcp_servers/fashion_vlm/main.py
```
|
|
|
Ensure the server starts successfully and remains running. |
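
If you want the server to keep running while you continue in the same terminal, one common option (not specific to this project) is to background it and watch the log:

```bash
# Run the MCP server in the background and follow its output
nohup python mcp_servers/fashion_vlm/main.py > fashion_vlm.log 2>&1 &
tail -f fashion_vlm.log
```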
|
|
|
### 2. Start the FashionM3 Client: |
|
Launch the Chainlit client to interact with the Fashion Assistant: |
|
```bash
chainlit run chainlit_app.py --port 8888
```
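
Depending on your Chainlit version, the `-w` flag enables auto-reload during development, which can be convenient while editing `chainlit_app.py`:

```bash
# Auto-reload on file changes (check `chainlit run --help` for your version)
chainlit run chainlit_app.py --port 8888 -w
```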
|
|
|
### 3. Interact with the Fashion Assistant: |
|
Open your browser and navigate to: |
|
```plaintext
http://localhost:8888/
```
|
|
|
This will load the FashionM3 interface, allowing you to interact with the Fashion Assistant. |
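
If the page does not load, a quick reachability check from a second terminal can help narrow things down (this assumes the default localhost binding):

```bash
# Expect an HTTP response header block if the Chainlit server is up
curl -I http://localhost:8888/
```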
|
|
|
|
|
## Citation |
|
If you find this work helpful, please consider citing our paper: |
|
|
|
```
@article{pang2025fashionm3,
  title={FashionM3: Multimodal, Multitask, and Multiround Fashion Assistant based on Unified Vision-Language Model},
  author={Pang, Kaicheng and Zou, Xingxing and Wong, Waikeung},
  journal={arXiv preprint arXiv:2504.17826},
  year={2025}
}
```