# Evaluation

We provide a unified evaluation script that runs baselines on multiple benchmarks. It takes a baseline model and evaluation configurations, evaluates on-the-fly, and reports the results in a JSON file.

## Benchmarks

Download the processed datasets from [Huggingface Datasets](https://huggingface.co/datasets/Ruicheng/monocular-geometry-evaluation) and put them in the `data/eval` directory, using `huggingface-cli`:

```bash
mkdir -p data/eval
huggingface-cli download Ruicheng/monocular-geometry-evaluation --repo-type dataset --local-dir data/eval --local-dir-use-symlinks False
```

Then unzip the downloaded files:

```bash
cd data/eval
unzip '*.zip'
# rm *.zip  # if you don't want to keep the zip files
```

## Configuration

See [`configs/eval/all_benchmarks.json`](../configs/eval/all_benchmarks.json) for an example of evaluation configurations covering all benchmarks. You can modify this file to evaluate on different benchmarks or with different baselines.

## Baseline

Example baselines are provided in [`baselines/`](../baselines/). Pass the path to the baseline model's Python code to the `--baseline` argument of the evaluation script.

## Run Evaluation

Run the script [`moge/scripts/eval_baseline.py`](../moge/scripts/eval_baseline.py). For example:

```bash
# Evaluate MoGe on the 10 benchmarks
python moge/scripts/eval_baseline.py --baseline baselines/moge.py --config configs/eval/all_benchmarks.json --output eval_output/moge.json --pretrained Ruicheng/moge-vitl --resolution_level 9

# Evaluate Depth Anything V2 on the 10 benchmarks. (NOTE: affine disparity)
python moge/scripts/eval_baseline.py --baseline baselines/da_v2.py --config configs/eval/all_benchmarks.json --output eval_output/da_v2.json
```

The `--baseline`, `--config`, and `--output` arguments belong to the evaluation script itself. The remaining arguments, e.g. `--pretrained` and `--resolution_level`, are customized arguments for loading the baseline model.

Details of the arguments:

```
Usage: eval_baseline.py [OPTIONS]

  Evaluation script.

Options:
  --baseline PATH  Path to the baseline model python code.
  --config PATH    Path to the evaluation configurations. Defaults to
                   "configs/eval/all_benchmarks.json".
  --output PATH    Path to the output json file.
  --oracle         Use oracle mode for evaluation, i.e., use the GT intrinsics
                   input.
  --dump_pred      Dump prediction results.
  --dump_gt        Dump ground truth.
  --help           Show this message and exit.
```

## Wrap a Customized Baseline

Wrap any baseline method with [`moge.test.baseline.MGEBaselineInterface`](../moge/test/baseline.py). See [`baselines/`](../baselines/) for more examples, and the sketch at the end of this section for the general shape of a wrapper.

It is a good idea to check the correctness of the baseline implementation by running inference on a small set of images via [`moge/scripts/infer_baselines.py`](../moge/scripts/infer_baselines.py):

```bash
python moge/scripts/infer_baselines.py --baseline baselines/moge.py --input example_images/ --output infer_output/moge --pretrained Ruicheng/moge-vitl --maps --ply
```
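
Below is a minimal sketch of what a wrapper file might look like. The method name `infer`, its signature, and the output keys (`depth`, `mask`) are assumptions made for illustration only; consult [`moge/test/baseline.py`](../moge/test/baseline.py) and the existing files in [`baselines/`](../baselines/) for the actual interface and return conventions.

```python
# baselines/my_baseline.py -- illustrative sketch only (hypothetical file name).
# NOTE: the method name `infer`, its signature, and the output keys below are
# assumptions; check moge/test/baseline.py for the real MGEBaselineInterface.
from typing import Dict

import torch

from moge.test.baseline import MGEBaselineInterface


class MyBaseline(MGEBaselineInterface):
    """Wraps a stand-in depth predictor; replace `self.model` with your own network."""

    def __init__(self, device: str = "cuda"):
        self.device = torch.device(device)
        # Stand-in "model": a single conv layer producing one channel as a fake depth map.
        self.model = torch.nn.Conv2d(3, 1, kernel_size=3, padding=1).to(self.device).eval()

    @torch.no_grad()
    def infer(self, image: torch.Tensor) -> Dict[str, torch.Tensor]:
        # `image` is assumed here to be a (3, H, W) float RGB tensor in [0, 1].
        depth = self.model(image[None].to(self.device))[0, 0]        # (H, W) prediction
        return {
            "depth": depth,                                          # predicted depth map
            "mask": torch.ones_like(depth, dtype=torch.bool),        # valid-pixel mask
        }
```

Once the wrapper runs cleanly through `infer_baselines.py` on a few example images, pass the same file to `eval_baseline.py --baseline` to evaluate it on the benchmarks.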