vascx

File size: 3,994 Bytes

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "from rtnls_fundusprep.preprocessor import parallel_preprocess"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Preprocessing\n",
    "\n",
    "This code will preprocess the images and write .png files with the square fundus image and the contrast enhanced version\n",
    "\n",
    "This step is not strictly necessary, but it is useful if you want to run the preprocessing step separately before model inference\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a list of files to be preprocessed:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds_path = Path(\"../samples/fundus\")\n",
    "files = list((ds_path / \"original\").glob(\"*\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Images with .dcm extension will be read as dicom and the pixel_array will be read as RGB. All other images will be read using PIL's Image.open"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "0it [00:00, ?it/s][Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.\n",
      "6it [00:00, 154.80it/s]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Error with image ../samples/fundus/original/HRF_07_dr.jpg\n",
      "Error with image ../samples/fundus/original/HRF_04_g.jpg\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[Parallel(n_jobs=4)]: Done   2 out of   6 | elapsed:    0.9s remaining:    1.8s\n",
      "[Parallel(n_jobs=4)]: Done   3 out of   6 | elapsed:    1.5s remaining:    1.5s\n",
      "[Parallel(n_jobs=4)]: Done   4 out of   6 | elapsed:    1.5s remaining:    0.8s\n",
      "[Parallel(n_jobs=4)]: Done   6 out of   6 | elapsed:    1.6s finished\n"
     ]
    }
   ],
   "source": [
    "bounds = parallel_preprocess(\n",
    "    files,  # List of image files\n",
    "    rgb_path=ds_path / \"rgb\",  # Output path for RGB images\n",
    "    ce_path=ds_path / \"ce\",  # Output path for Contrast Enhanced images\n",
    "    n_jobs=4,  # number of preprocessing workers\n",
    ")\n",
    "df_bounds = pd.DataFrame(bounds).set_index(\"id\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The preprocessor will produce RGB and contrast-enhanced preprocessed images cropped to a square and return a dataframe with the image bounds that can be used to reconstruct the original image. Output files will be named the same as input images, but with .png extension. Be careful with providing multiple inputs with the same filename without extension as this will result in over-written images. Any exceptions during pre-processing will not stop execution but will print error. Images that failed pre-processing for any reason will be marked with `success=False` in the df_bounds dataframe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_bounds.to_csv(ds_path / \"meta.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "retinalysis",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}