Luis Oala committed on
Commit 3def822 (unverified)
1 Parent(s): a3dc7f0

cvpr anon #1

Files changed (1)
  1. README.md +18 -61
README.md CHANGED
@@ -1,24 +1,19 @@
  [![MIT License](https://img.shields.io/apm/l/atomic-design-ui.svg?)](https://github.com/tterb/atomic-design-ui/blob/master/LICENSEs) [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5235536.svg)](https://doi.org/10.5281/zenodo.5235536)

- # From Lens to Logit - Addressing Camera Hardware-Drift Using Raw Sensor Data

- <!-- [**Manuscript**](https://openreview.net/forum?id=DRAywM1BhU) | -->
- [**Project site**](https://aiaudit.org/lens2logit/) | [**Data**](https://doi.org/10.5281/zenodo.5235536)
-
- <!--*This repository hosts the code for the project ["From Lens to Logit: Addressing Camera Hardware-Drift Using Raw Sensor Data"](https://openreview.net/forum?id=DRAywM1BhU), submitted to the NeurIPS 2021 Datasets and Benchmarks Track.* -->

  ## A short introduction
- In order to address camera hardware-drift we require two ingredients: raw sensor data and an image processing model. This code repository contains the materials for the second ingredient, the image processing model, as well as scripts to load data and run experiments. For a conceptual overview of the project we recommend the [project site](https://aiaudit.org/lens2logit/). <!-- or the [full paper](https://openreview.net/forum?id=DRAywM1BhU). -->
-
- ![L2L Overview](https://user-images.githubusercontent.com/38631399/131536063-585cf9b0-e76e-4e41-a05e-2fcf4902f539.png)

- To create an image, raw sensor data traverses complex image signal processing pipelines. These pipelines are used by cameras and scientific instruments to produce the images fed into machine learning systems. The processing pipelines vary by device, influencing the resulting image statistics and ultimately contributing to what is known as hardware-drift. However, this processing is rarely considered in machine learning modelling, because available benchmark data sets are generally not in raw format. Here we show that pairing qualified raw sensor data with an explicit, differentiable model of the image processing pipeline makes it possible to tackle camera hardware-drift.

- Specifically, we demonstrate
- 1. the **controlled synthesis of hardware-drift test cases**
- 2. modular **hardware-drift forensics**, as well as
- 3. **image processing customization**.

  We make available two data sets.
  1. **Raw-Microscopy**, contains
@@ -57,59 +52,53 @@ We noticed that PyPi package for `segmentation_models_pytorch` is sometimes behi
  $ python -m pip install git+https://github.com/qubvel/segmentation_models.pytorch
  ```
  #### mlflow tracking
- Note that we are maintaining a collaborative [virtual lab log](http://deplo-mlflo-1ssxo94f973sj-890390d809901dbf.elb.eu-central-1.amazonaws.com/#/). By default, anyone has read access to e.g. browse results and fetch trained, stored models. Writing to the server is only possible by specifying the AWS key in `train.py`:
- ```python
- mlflow.set_tracking_uri(args.tracking_uri)
- mlflow.set_experiment(args.experiment_name)
- os.environ['AWS_ACCESS_KEY_ID'] = '#TODO: fill in your aws access key id for mlflow server here'
- os.environ['AWS_SECRET_ACCESS_KEY'] = '#TODO: fill in your aws secret access key for mlflow server here'
- ```
- If you would also like to write results to the server, please request an AWS key. Otherwise you can disable mlflow tracking and use `train.py` without the virtual lab log.
  ### Recreate experiments
- The central file for using the **Lens2Logit** framework for experiments as in the paper is `train.py`, which provides a rich set of arguments to experiment with raw image data, different image processing models and task models for regression or classification. Below we provide three example commands for the types of experiments reported in the [paper](https://openreview.net/forum?id=DRAywM1BhU).
- #### Controlled synthesis of hardware-drift test cases
  ```console
  $ python train.py \
  --experiment_name YOUR-EXPERIMENT-NAME \
  --run_name YOUR-RUN-NAME \
  --dataset Microscopy \
  --lr 1e-5 \
  --n_splits 5 \
  --epochs 5 \
  --classifier_pretrained \
- --processing_mode static \
  --augmentation weak \
  --log_model True \
  --iso 0.01 \
- --freeze_processor \
  --track_processing \
  --track_every_epoch \
  --track_predictions \
  --track_processing_gradients \
  --track_save_tensors \
  ```
- #### Modular hardware-drift forensics
  ```console
  $ python train.py \
  --experiment_name YOUR-EXPERIMENT-NAME \
  --run_name YOUR-RUN-NAME \
  --dataset Microscopy \
- --adv_training \
  --lr 1e-5 \
  --n_splits 5 \
  --epochs 5 \
  --classifier_pretrained \
- --processing_mode parametrized \
  --augmentation weak \
  --log_model True \
  --iso 0.01 \
  --track_processing \
  --track_every_epoch \
  --track_predictions \
  --track_processing_gradients \
  --track_save_tensors \
  ```
- #### Image processing customization
  ```console
  $ python train.py \
  --experiment_name YOUR-EXPERIMENT-NAME \
@@ -129,35 +118,3 @@ $ python train.py \
  --track_processing_gradients \
  --track_save_tensors \
  ```
- ## Virtual lab log
- We maintain a collaborative virtual lab log at [this address](http://deplo-mlflo-1ssxo94f973sj-890390d809901dbf.elb.eu-central-1.amazonaws.com/#/). There you can browse experiment runs, analyze results through SQL queries and download trained processing and task models.
- ![mlflow](https://user-images.githubusercontent.com/38631399/131536233-f6b6e0ae-35f2-4ee0-a5e2-d04f8efb8d73.png)
-
- ### Review our experiments
- Experiments are listed in the left column. You can select individual runs or compare metrics and parameters across different runs. For runs where we tracked images of intermediate processing steps and images of the gradients at these processing steps, you can find them at the bottom of a run page in the *results* folder for each epoch. For a better overview we include a map between the names of experiments in the paper and the names of experiments in the virtual lab log:
-
- | Name of experiment in paper | Name of experiment in virtual lab log |
- | :-------------: | :-----:|
- | 5.1 Controlled synthesis of hardware-drift test cases | 1 Controlled synthesis of hardware-drift test cases (Train), 1 Controlled synthesis of hardware-drift test cases (Test) |
- | 5.2 Modular hardware-drift forensics | 2 Modular hardware-drift forensics |
- | 5.3 Image processing customization | 3 Image processing customization (Microscopy), 3 Image processing customization (Drone) |
-
- Note that the virtual lab log includes many additional experiments.
-
- ### Use our trained models
- When selecting a run for which a model was saved, you can find the model files, state dictionary and loading instructions at the bottom of the run page under *models*. In the menu bar at the top of the virtual lab log you can also access models via the *Model Registry*. Our code is well integrated with the *mlflow* autologging and -loading package for PyTorch, so when using our code you can just specify the *model uri* as an argument and models will be fetched from the model registry automatically.
-
- ## Author details
-
- @article{oala_aversa_lens_2021,
- title = {From Lens to Logit: Addressing Camera Hardware-Drift Using Raw Sensor Data},
- shorttitle = {From Lens to Logit},
- url = {https://aiaudit.org/lens2logit/},
- urldate = {2021-08-28},
- author = {Oala, Luis and Aversa, Marco and Willis, Kurt and Nobis, Gabriel and Pomarico, Enrico and Neuenschwander, Yoan and Extermann, Jérôme and Buck, Michèle and Matek, Christian and Clausen, Christoph and Murray-Smith, Roderick and Sanguinetti, Bruno},
- month = aug,
- year = {2021}
- }
  [![MIT License](https://img.shields.io/apm/l/atomic-design-ui.svg?)](https://github.com/tterb/atomic-design-ui/blob/master/LICENSEs) [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5235536.svg)](https://doi.org/10.5281/zenodo.5235536)

+ # Dataset Drift Controls Using Raw Image Data and Differentiable ISPs: From Raw to Logit

+ *This anonymous repository hosts the code for manuscript #4471 "Dataset Drift Controls Using Raw Image Data and Differentiable ISPs: From Raw to Logit", submitted to CVPR 2022.*

  ## A short introduction
+ Two ingredients are required for the **Raw2Logit** dataset drift controls: raw sensor data and an image processing model. This code repository contains the materials for the second ingredient, the image processing model, as well as scripts to load data and run experiments.

+ ![R2L Overview](https://github.com/luisoala/raw2logit/blob/master/pmflow8.png)

+ To create an image, raw sensor data traverses complex image signal processing (ISP) pipelines. These pipelines are used by cameras and scientific instruments to produce the images fed into machine learning systems. The processing pipelines vary by device, influencing the resulting image statistics and ultimately contributing to dataset drift. However, this processing is rarely considered in machine learning modelling. In this study, we examine the role raw sensor data and differentiable processing models can play in controlling performance risks related to dataset drift. The findings are distilled into three applications; a minimal sketch of the differentiable processing idea follows the list below.

+ 1. **Drift forensics** can be used to isolate performance-sensitive data processing configurations which should be avoided during deployment of a machine learning model.
+ 2. **Drift synthesis** enables the controlled generation of drift test cases. The experiments presented here show that the average decrease in model performance is ten to four times less severe than under post-hoc perturbation testing.
+ 3. **Drift adjustment** opens up the possibility of processing adjustments in the face of drift.
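+ To make the idea concrete, here is a minimal, hypothetical sketch of a differentiable, parametrized processing step in PyTorch. It is not the processing model shipped in this repository; the class name `MiniISP` and the choice of white-balance gains and a tone curve as the learnable parameters are illustrative assumptions.
+ ```python
+ # Illustrative sketch only, not the repository's processing model.
+ import torch
+ import torch.nn as nn
+
+ class MiniISP(nn.Module):
+     """Toy ISP step: learnable per-channel gains and a tone-curve exponent."""
+     def __init__(self):
+         super().__init__()
+         self.gains = nn.Parameter(torch.ones(3))      # white-balance gains
+         self.gamma = nn.Parameter(torch.tensor(2.2))  # tone-curve exponent
+
+     def forward(self, raw_rgb):
+         # raw_rgb: (N, 3, H, W) demosaiced raw intensities in [0, 1]
+         x = raw_rgb * self.gains.view(1, 3, 1, 1)
+         return x.clamp(1e-6, 1.0) ** (1.0 / self.gamma)
+
+ # Because each step is differentiable, a task loss can be backpropagated
+ # into the processing parameters (drift adjustment), and the per-step
+ # gradients can be inspected (drift forensics).
+ ```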
 
  We make available two data sets.
  1. **Raw-Microscopy**, contains

  $ python -m pip install git+https://github.com/qubvel/segmentation_models.pytorch
  ```
  #### mlflow tracking
+ Note that we are maintaining a collaborative mlflow virtual lab server. The tracking API is integrated into the code. By default, anyone has read access, e.g. to browse results and fetch trained, stored models. For the purpose of anonymization, the link to the tracking server is removed here as it contains identifiable information about the persons who submitted jobs. You can set up your own mlflow server for this anonymized version of the code, or disable mlflow tracking and use `train.py` without the virtual lab log.
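+ As a minimal sketch of running without the shared server, you could point mlflow at a local directory before training. The `file:./mlruns` URI and the names below are placeholder assumptions, not values required by this repository.
+ ```python
+ # Sketch: local mlflow tracking instead of the shared lab server.
+ # The URI and names below are placeholders, not repository defaults.
+ import mlflow
+
+ mlflow.set_tracking_uri("file:./mlruns")       # local store, no AWS keys needed
+ mlflow.set_experiment("YOUR-EXPERIMENT-NAME")
+
+ with mlflow.start_run(run_name="YOUR-RUN-NAME"):
+     mlflow.log_param("processing_mode", "static")  # example parameter
+     mlflow.log_metric("val_accuracy", 0.0)         # example metric
+ ```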
  ### Recreate experiments
+ The central file for using the **Raw2Logit** framework for experiments as in the paper is `train.py`, which provides a rich set of arguments to experiment with raw image data, different image processing models and task models for regression or classification. Below we provide three example commands for the types of experiments reported in the [paper](https://openreview.net/forum?id=DRAywM1BhU).
+
+ #### Drift forensics
  ```console
  $ python train.py \
  --experiment_name YOUR-EXPERIMENT-NAME \
  --run_name YOUR-RUN-NAME \
  --dataset Microscopy \
+ --adv_training \
  --lr 1e-5 \
  --n_splits 5 \
  --epochs 5 \
  --classifier_pretrained \
+ --processing_mode parametrized \
  --augmentation weak \
  --log_model True \
  --iso 0.01 \
  --track_processing \
  --track_every_epoch \
  --track_predictions \
  --track_processing_gradients \
  --track_save_tensors \
  ```
+ #### Drift synthesis
  ```console
  $ python train.py \
  --experiment_name YOUR-EXPERIMENT-NAME \
  --run_name YOUR-RUN-NAME \
  --dataset Microscopy \
  --lr 1e-5 \
  --n_splits 5 \
  --epochs 5 \
  --classifier_pretrained \
+ --processing_mode static \
  --augmentation weak \
  --log_model True \
  --iso 0.01 \
+ --freeze_processor \
  --track_processing \
  --track_every_epoch \
  --track_predictions \
  --track_processing_gradients \
  --track_save_tensors \
  ```
+ #### Drift adjustment
  ```console
  $ python train.py \
  --experiment_name YOUR-EXPERIMENT-NAME \

  --track_processing_gradients \
  --track_save_tensors \
  ```
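+ Since the commands above pass `--log_model True`, trained models are logged to the active mlflow store. As a sketch (the run ID and the `model` artifact path are placeholders you would look up in your own mlflow UI), a logged PyTorch model could be loaded back like this:
+ ```python
+ # Sketch: load a model previously logged via mlflow's PyTorch integration.
+ # "<RUN_ID>" and the "model" artifact path are placeholder assumptions.
+ import mlflow.pytorch
+
+ model = mlflow.pytorch.load_model("runs:/<RUN_ID>/model")
+ model.eval()  # ready for inference
+ ```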