FashionM3 Project Setup Guide
This guide explains how to set up and run the FashionM3 project, which requires the FashionRec dataset, environment configuration, and specific server startup steps.
Important Note on Hugging Face Space Deployment: Due to the high computational requirements of the underlying Vision-Language Model (VLM) and associated services, this Hugging Face Space currently serves primarily as a code repository. The full interactive demo cannot be run directly on the free tier of Hugging Face Spaces. Please follow the instructions below to set up and run the FashionM3 application locally on your machine.
Project Overview
This work is introduced in the paper *FashionM3: Multimodal, Multitask, and Multiround Fashion Assistant based on Unified Vision-Language Model* (https://arxiv.org/abs/2504.17826).
Prerequisites
- Python 3.10 or higher
- `pip` for installing dependencies
- A working proxy server (if needed, e.g., at `http://127.0.0.1:10809`)
Step 1: Download the FashionRec Dataset
- Download the FashionRec dataset from https://huggingface.co/datasets/Anony100/FashionRec
- Extract the dataset to a directory of your choice (e.g., `/path/to/FashionRec`).
- Note the absolute path to the dataset directory, as it will be used in the `.env` file.
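Before wiring the path into `.env`, it can help to sanity-check it. The helper below is a hypothetical sketch (not part of the FashionM3 codebase) that verifies the dataset root is an absolute path to an existing directory, as the steps above require:

```python
from pathlib import Path

def dataset_root_ok(path: str) -> bool:
    """Return True if path is an absolute path to an existing directory.

    FASHION_DATA_ROOT in .env must be absolute, so a relative path
    or a missing directory is rejected here.
    """
    p = Path(path)
    return p.is_absolute() and p.is_dir()
```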
Step 2: Configure the Environment File
Create a `.env` file in the project root directory with the following content:
```
# Chainlit server port
CHAINLIT_PORT=8888

# Proxy configuration (update or remove if not needed)
PROXY=http://127.0.0.1:10809

# API keys (replace with your own keys)
OPENAI_API_KEY=your_openai_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here

# Path to the FashionRec dataset (update to your dataset path)
FASHION_DATA_ROOT=/path/to/FashionRec

# Directory for generated images
GEN_IMG_DIR=./generated_images
```
Notes:
- Replace `your_openai_api_key_here` and `your_gemini_api_key_here` with your actual OpenAI and Google Gemini API keys.
- Set `FASHION_DATA_ROOT` to the absolute path of the FashionRec dataset (e.g., `/home/user/data/FashionRec`).
- Update `PROXY` to match your proxy server, or remove it if no proxy is used.
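The notes above can be checked mechanically before launching anything. The following is a minimal sketch (the required-key list and helper names are assumptions for illustration, not part of FashionM3) that parses a `.env` file and flags required keys that are missing or still set to placeholder values:

```python
import os

# Keys the FashionM3 .env file is expected to define (per the guide above);
# PROXY is optional, so it is not listed as required.
REQUIRED_KEYS = ["CHAINLIT_PORT", "OPENAI_API_KEY", "GEMINI_API_KEY",
                 "FASHION_DATA_ROOT", "GEN_IMG_DIR"]

def parse_env_file(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping comments and blank lines."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def missing_keys(env: dict) -> list:
    """Return required keys that are absent or left at their placeholders."""
    return [k for k in REQUIRED_KEYS
            if k not in env or env[k].startswith("your_")]
```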
Step 3: Install Dependencies
Install the required Python packages:
```
pip install -r requirements.txt
```
Step 4: Run the Application
Follow these steps to start the FashionM3 application:
1. Start the Fashion VLM MCP Server:
Run the MCP server for the fashion vision-language model:
```
python mcp_servers/fashion_vlm/main.py
```
Ensure the server starts successfully and remains running.
2. Start the FashionM3 Client:
Launch the Chainlit client to interact with the Fashion Assistant:
```
chainlit run chainlit_app.py --port 8888
```
3. Interact with the Fashion Assistant:
Open your browser and navigate to `http://localhost:8888/`.
This will load the FashionM3 interface, allowing you to interact with the Fashion Assistant.
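Since the MCP server must be up before the Chainlit client is useful, a small readiness probe can save some confusion. This is a hedged sketch (the helper is hypothetical, not part of FashionM3) that checks whether anything is accepting TCP connections on a host/port, e.g. `localhost:8888` for the Chainlit UI:

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```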
Citation
If you find this work helpful, please consider citing our paper:
```bibtex
@article{pang2025fashionm3,
  title={FashionM3: Multimodal, Multitask, and Multiround Fashion Assistant based on Unified Vision-Language Model},
  author={Pang, Kaicheng and Zou, Xingxing and Wong, Waikeung},
  journal={arXiv preprint arXiv:2504.17826},
  year={2025}
}
```