endo-yuki-t committed
Commit d7dbcdd · 0 Parent(s)

initial commit

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. LICENSE +21 -0
  2. README.md +51 -0
  3. criteria/__init__.py +0 -0
  4. criteria/lpips/__init__.py +0 -0
  5. criteria/lpips/lpips.py +35 -0
  6. criteria/lpips/networks.py +96 -0
  7. criteria/lpips/utils.py +30 -0
  8. docs/teaser.jpg +0 -0
  9. docs/thumb.gif +0 -0
  10. env.yaml +380 -0
  11. expansion/__init__.py +0 -0
  12. expansion/dataloader/__init__.py +0 -0
  13. expansion/dataloader/__pycache__/__init__.cpython-38.pyc +0 -0
  14. expansion/dataloader/__pycache__/seqlist.cpython-38.pyc +0 -0
  15. expansion/dataloader/chairslist.py +33 -0
  16. expansion/dataloader/chairssdlist.py +30 -0
  17. expansion/dataloader/depth_transforms.py +471 -0
  18. expansion/dataloader/depthloader.py +222 -0
  19. expansion/dataloader/flow_transforms.py +440 -0
  20. expansion/dataloader/hd1klist.py +29 -0
  21. expansion/dataloader/kitti12list.py +29 -0
  22. expansion/dataloader/kitti15list.py +29 -0
  23. expansion/dataloader/kitti15list_train.py +31 -0
  24. expansion/dataloader/kitti15list_train_lidar.py +34 -0
  25. expansion/dataloader/kitti15list_val.py +31 -0
  26. expansion/dataloader/kitti15list_val_lidar.py +34 -0
  27. expansion/dataloader/kitti15list_val_mr.py +41 -0
  28. expansion/dataloader/robloader.py +133 -0
  29. expansion/dataloader/sceneflowlist.py +51 -0
  30. expansion/dataloader/seqlist.py +26 -0
  31. expansion/dataloader/sintellist.py +32 -0
  32. expansion/dataloader/sintellist_clean.py +31 -0
  33. expansion/dataloader/sintellist_final.py +32 -0
  34. expansion/dataloader/sintellist_train.py +32 -0
  35. expansion/dataloader/sintellist_val.py +34 -0
  36. expansion/dataloader/thingslist.py +122 -0
  37. expansion/models/VCN_exp.py +561 -0
  38. expansion/models/__init__.py +0 -0
  39. expansion/models/__pycache__/VCN_exp.cpython-38.pyc +0 -0
  40. expansion/models/__pycache__/__init__.cpython-38.pyc +0 -0
  41. expansion/models/__pycache__/conv4d.cpython-38.pyc +0 -0
  42. expansion/models/__pycache__/submodule.cpython-38.pyc +0 -0
  43. expansion/models/conv4d.py +296 -0
  44. expansion/models/submodule.py +450 -0
  45. expansion/submission.py +95 -0
  46. expansion/utils/__init__.py +0 -0
  47. expansion/utils/__pycache__/__init__.cpython-38.pyc +0 -0
  48. expansion/utils/__pycache__/flowlib.cpython-38.pyc +0 -0
  49. expansion/utils/__pycache__/io.cpython-38.pyc +0 -0
  50. expansion/utils/__pycache__/pfm.cpython-38.pyc +0 -0
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2022 Yuki Endo
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md ADDED
@@ -0,0 +1,51 @@
+ # User-Controllable Latent Transformer for StyleGAN Image Layout Editing
+ <!--a href="https://arxiv.org/abs/2103.14877"><img src="https://img.shields.io/badge/arXiv-2103.14877-b31b1b.svg"></a-->
+ <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-yellow.svg"></a>
+ <p align="center">
+ <img src="docs/teaser.jpg" width="800px"/>
+ </p>
+
+ This repository contains our implementation of the following paper:
+
+ Yuki Endo: "User-Controllable Latent Transformer for StyleGAN Image Layout Editing," Computer Graphics Forum (Pacific Graphics 2022) [[Project](http://www.cgg.cs.tsukuba.ac.jp/~endo/projects/UserControllableLT)] [[PDF (preprint)]()]
+
+ ## Prerequisites
+ 1. Python 3.8
+ 2. PyTorch 1.9.0
+ 3. Flask
+ 4. Others (see env.yaml)
+
+ ## Preparation
+ Download and decompress <a href="https://drive.google.com/file/d/1lBL_J-uROvqZ0BYu9gmEcMCNyaPo9cBY/view?usp=sharing">our pre-trained models</a>.
+
+ ## Inference with our pre-trained models
+ <img src="docs/thumb.gif" width="150px"/><br>
+ We provide an interactive interface based on Flask. The interface can be launched locally with
+ ```
+ python interface/flask_app.py --checkpoint_path=pretrained_models/latent_transformer/cat.pt
+ ```
+ and is then accessible at http://localhost:8000/.
+
+ ## Training
+ The latent transformer can be trained with
+ ```
+ python scripts/train.py --exp_dir=results --stylegan_weights=pretrained_models/stylegan2-cat-config-f.pt
+ ```
+
+ ## Citation
+ Please cite our paper if you find the code useful:
+ ```
+ @Article{endoPG2022,
+   Title   = {User-Controllable Latent Transformer for StyleGAN Image Layout Editing},
+   Author  = {Yuki Endo},
+   Journal = {Computer Graphics Forum},
+   volume  = {},
+   number  = {},
+   pages   = {},
+   doi     = {},
+   Year    = {2022}
+ }
+ ```
+
+ ## Acknowledgements
+ This code heavily borrows from the [pixel2style2pixel](https://github.com/eladrich/pixel2style2pixel) and [expansion](https://github.com/gengshan-y/expansion) repositories.
criteria/__init__.py ADDED
File without changes
criteria/lpips/__init__.py ADDED
File without changes
criteria/lpips/lpips.py ADDED
@@ -0,0 +1,35 @@
+ import torch
+ import torch.nn as nn
+
+ from criteria.lpips.networks import get_network, LinLayers
+ from criteria.lpips.utils import get_state_dict
+
+
+ class LPIPS(nn.Module):
+     r"""Creates a criterion that measures
+     Learned Perceptual Image Patch Similarity (LPIPS).
+     Arguments:
+         net_type (str): the network type to compare the features:
+                         'alex' | 'squeeze' | 'vgg'. Default: 'alex'.
+         version (str): the version of LPIPS. Default: 0.1.
+     """
+     def __init__(self, net_type: str = 'alex', version: str = '0.1'):
+
+         assert version in ['0.1'], 'v0.1 is only supported now'
+
+         super(LPIPS, self).__init__()
+
+         # pretrained network
+         self.net = get_network(net_type).to("cuda")
+
+         # linear layers
+         self.lin = LinLayers(self.net.n_channels_list).to("cuda")
+         self.lin.load_state_dict(get_state_dict(net_type, version))
+
+     def forward(self, x: torch.Tensor, y: torch.Tensor):
+         feat_x, feat_y = self.net(x), self.net(y)
+
+         diff = [(fx - fy) ** 2 for fx, fy in zip(feat_x, feat_y)]
+         res = [l(d).mean((2, 3), True) for d, l in zip(diff, self.lin)]
+
+         return torch.sum(torch.cat(res, 0)) / x.shape[0]
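For orientation, a minimal usage sketch of this criterion (not part of the commit; tensor shapes are illustrative, and a CUDA device is assumed because the module hard-codes `.to("cuda")`):
```python
import torch
from criteria.lpips.lpips import LPIPS

# Two batches of RGB images in [-1, 1], NCHW layout (shapes are illustrative).
x = torch.rand(4, 3, 256, 256, device="cuda") * 2 - 1
y = torch.rand(4, 3, 256, 256, device="cuda") * 2 - 1

criterion = LPIPS(net_type="alex")  # downloads the linear-layer weights on first use
loss = criterion(x, y)              # scalar: mean LPIPS distance over the batch
```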
criteria/lpips/networks.py ADDED
@@ -0,0 +1,96 @@
+ from typing import Sequence
+
+ from itertools import chain
+
+ import torch
+ import torch.nn as nn
+ from torchvision import models
+
+ from criteria.lpips.utils import normalize_activation
+
+
+ def get_network(net_type: str):
+     if net_type == 'alex':
+         return AlexNet()
+     elif net_type == 'squeeze':
+         return SqueezeNet()
+     elif net_type == 'vgg':
+         return VGG16()
+     else:
+         raise NotImplementedError('choose net_type from [alex, squeeze, vgg].')
+
+
+ class LinLayers(nn.ModuleList):
+     def __init__(self, n_channels_list: Sequence[int]):
+         super(LinLayers, self).__init__([
+             nn.Sequential(
+                 nn.Identity(),
+                 nn.Conv2d(nc, 1, 1, 1, 0, bias=False)
+             ) for nc in n_channels_list
+         ])
+
+         for param in self.parameters():
+             param.requires_grad = False
+
+
+ class BaseNet(nn.Module):
+     def __init__(self):
+         super(BaseNet, self).__init__()
+
+         # register buffer
+         self.register_buffer(
+             'mean', torch.Tensor([-.030, -.088, -.188])[None, :, None, None])
+         self.register_buffer(
+             'std', torch.Tensor([.458, .448, .450])[None, :, None, None])
+
+     def set_requires_grad(self, state: bool):
+         for param in chain(self.parameters(), self.buffers()):
+             param.requires_grad = state
+
+     def z_score(self, x: torch.Tensor):
+         return (x - self.mean) / self.std
+
+     def forward(self, x: torch.Tensor):
+         x = self.z_score(x)
+
+         output = []
+         for i, (_, layer) in enumerate(self.layers._modules.items(), 1):
+             x = layer(x)
+             if i in self.target_layers:
+                 output.append(normalize_activation(x))
+             if len(output) == len(self.target_layers):
+                 break
+         return output
+
+
+ class SqueezeNet(BaseNet):
+     def __init__(self):
+         super(SqueezeNet, self).__init__()
+
+         self.layers = models.squeezenet1_1(True).features
+         self.target_layers = [2, 5, 8, 10, 11, 12, 13]
+         self.n_channels_list = [64, 128, 256, 384, 384, 512, 512]
+
+         self.set_requires_grad(False)
+
+
+ class AlexNet(BaseNet):
+     def __init__(self):
+         super(AlexNet, self).__init__()
+
+         self.layers = models.alexnet(True).features
+         self.target_layers = [2, 5, 8, 10, 12]
+         self.n_channels_list = [64, 192, 384, 256, 256]
+
+         self.set_requires_grad(False)
+
+
+ class VGG16(BaseNet):
+     def __init__(self):
+         super(VGG16, self).__init__()
+
+         self.layers = models.vgg16(True).features
+         self.target_layers = [4, 9, 16, 23, 30]
+         self.n_channels_list = [64, 128, 256, 512, 512]
+
+         self.set_requires_grad(False)
criteria/lpips/utils.py ADDED
@@ -0,0 +1,30 @@
+ from collections import OrderedDict
+
+ import torch
+
+
+ def normalize_activation(x, eps=1e-10):
+     norm_factor = torch.sqrt(torch.sum(x ** 2, dim=1, keepdim=True))
+     return x / (norm_factor + eps)
+
+
+ def get_state_dict(net_type: str = 'alex', version: str = '0.1'):
+     # build url
+     url = 'https://raw.githubusercontent.com/richzhang/PerceptualSimilarity/' \
+         + f'master/lpips/weights/v{version}/{net_type}.pth'
+
+     # download
+     old_state_dict = torch.hub.load_state_dict_from_url(
+         url, progress=True,
+         map_location=None if torch.cuda.is_available() else torch.device('cpu')
+     )
+
+     # rename keys
+     new_state_dict = OrderedDict()
+     for key, val in old_state_dict.items():
+         new_key = key
+         new_key = new_key.replace('lin', '')
+         new_key = new_key.replace('model.', '')
+         new_state_dict[new_key] = val
+
+     return new_state_dict
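The key renaming above maps the upstream checkpoint's naming onto the `LinLayers` ModuleList layout; a quick sketch of the effect on a single key:
```python
# 'lin0.model.1.weight' names linear layer 0 in the upstream checkpoint;
# LinLayers expects '0.1.weight' (ModuleList index 0, Sequential index 1).
key = 'lin0.model.1.weight'
key = key.replace('lin', '').replace('model.', '')
assert key == '0.1.weight'
```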
docs/teaser.jpg ADDED
docs/thumb.gif ADDED
env.yaml ADDED
@@ -0,0 +1,380 @@
+ name: uclt
+ channels:
+ - pytorch
+ - anaconda
+ - nvidia
+ - conda-forge
+ - defaults
+ dependencies:
+ - _ipyw_jlab_nb_ext_conf=0.1.0=py38_0
+ - _libgcc_mutex=0.1=conda_forge
+ - _openmp_mutex=4.5=1_llvm
+ - absl-py=0.13.0=pyhd8ed1ab_0
+ - aiohttp=3.7.4.post0=py38h497a2fe_0
+ - albumentations=1.0.3=pyhd8ed1ab_0
+ - alsa-lib=1.2.3=h516909a_0
+ - anaconda-client=1.8.0=py38h06a4308_0
+ - anaconda-navigator=2.0.4=py38_0
+ - anyio=2.2.0=py38h06a4308_1
+ - appdirs=1.4.4=pyh9f0ad1d_0
+ - argon2-cffi=20.1.0=py38h27cfd23_1
+ - async-timeout=3.0.1=py_1000
+ - async_generator=1.10=pyhd3eb1b0_0
+ - attrs=21.2.0=pyhd3eb1b0_0
+ - babel=2.9.1=pyhd3eb1b0_0
+ - backcall=0.2.0=pyhd3eb1b0_0
+ - backports=1.0=pyhd3eb1b0_2
+ - backports.functools_lru_cache=1.6.4=pyhd3eb1b0_0
+ - backports.tempfile=1.0=pyhd3eb1b0_1
+ - backports.weakref=1.0.post1=py_1
+ - beautifulsoup4=4.9.3=pyha847dfd_0
+ - blas=1.0=mkl
+ - bleach=4.0.0=pyhd3eb1b0_0
+ - blinker=1.4=py_1
+ - brotli=1.0.9=h7f98852_5
+ - brotli-bin=1.0.9=h7f98852_5
+ - brotlipy=0.7.0=py38h27cfd23_1003
+ - bzip2=1.0.8=h7b6447c_0
+ - c-ares=1.17.1=h27cfd23_0
+ - ca-certificates=2021.10.8=ha878542_0
+ - cachetools=4.2.2=pyhd8ed1ab_0
+ - cairo=1.16.0=hf32fb01_1
+ - certifi=2021.10.8=py38h578d9bd_1
+ - cffi=1.14.6=py38h400218f_0
+ - chardet=4.0.0=py38h06a4308_1003
+ - click=8.0.1=pyhd3eb1b0_0
+ - cloudpickle=1.6.0=py_0
+ - clyent=1.2.2=py38_1
+ - conda=4.11.0=py38h578d9bd_0
+ - conda-build=3.21.4=py38h06a4308_0
+ - conda-content-trust=0.1.1=pyhd3eb1b0_0
+ - conda-env=2.6.0=1
+ - conda-package-handling=1.7.3=py38h27cfd23_1
+ - conda-repo-cli=1.0.4=pyhd3eb1b0_0
+ - conda-token=0.3.0=pyhd3eb1b0_0
+ - conda-verify=3.4.2=py_1
+ - cryptography=3.4.7=py38hd23ed53_0
+ - cudatoolkit=11.1.74=h6bb024c_0
+ - cycler=0.10.0=py_2
+ - cytoolz=0.11.0=py38h497a2fe_3
+ - dask-core=2021.8.1=pyhd8ed1ab_0
+ - dbus=1.13.18=hb2f20db_0
+ - decorator=5.0.9=pyhd3eb1b0_0
+ - defusedxml=0.7.1=pyhd3eb1b0_0
+ - dill=0.3.4=pyhd8ed1ab_0
+ - dominate=2.6.0=pyhd8ed1ab_0
+ - entrypoints=0.3=py38_0
+ - enum34=1.1.10=py38h32f6830_2
+ - expat=2.4.1=h2531618_2
+ - ffmpeg=4.3.2=hca11adc_0
+ - filelock=3.0.12=pyhd3eb1b0_1
+ - flask=1.1.2=pyh9f0ad1d_0
+ - flask-httpauth=4.4.0=pyhd8ed1ab_0
+ - fontconfig=2.13.1=h6c09931_0
+ - fonttools=4.25.0=pyhd3eb1b0_0
+ - freetype=2.10.4=h5ab3b9f_0
+ - fsspec=2021.7.0=pyhd8ed1ab_0
+ - ftfy=6.0.3=pyhd8ed1ab_0
+ - func_timeout=4.3.5=py_0
+ - future=0.18.2=py38_1
+ - gdown=4.2.0=pyhd8ed1ab_0
+ - geos=3.10.0=h9c3ff4c_0
+ - gettext=0.19.8.1=h0b5b191_1005
+ - git=2.23.0=pl526hacde149_0
+ - glib=2.68.4=h9c3ff4c_0
+ - glib-tools=2.68.4=h9c3ff4c_0
+ - glob2=0.7=pyhd3eb1b0_0
+ - gmp=6.2.1=h58526e2_0
+ - gnutls=3.6.13=h85f3911_1
+ - google-auth=1.35.0=pyh6c4a22f_0
+ - google-auth-oauthlib=0.4.5=pyhd8ed1ab_0
+ - gputil=1.4.0=pyh9f0ad1d_0
+ - graphite2=1.3.13=h58526e2_1001
+ - gst-plugins-base=1.18.4=hf529b03_2
+ - gstreamer=1.18.4=h76c114f_2
+ - harfbuzz=2.9.0=h83ec7ef_0
+ - hdf5=1.10.6=nompi_h6a2412b_1114
+ - icu=68.1=h58526e2_0
+ - idna=2.10=pyhd3eb1b0_0
+ - imagecodecs-lite=2019.12.3=py38h5c078b8_3
+ - imageio=2.9.0=py_0
+ - imageio-ffmpeg=0.4.5=pyhd8ed1ab_0
+ - imgaug=0.4.0=py_1
+ - importlib-metadata=3.10.0=py38h06a4308_0
+ - importlib_metadata=3.10.0=hd3eb1b0_0
+ - intel-openmp=2021.3.0=h06a4308_3350
+ - ipykernel=5.3.4=py38h5ca1d4c_0
+ - ipympl=0.8.2=pyhd8ed1ab_0
+ - ipython=7.26.0=py38hb070fc8_0
+ - ipython_genutils=0.2.0=pyhd3eb1b0_1
+ - ipywidgets=7.6.3=pyhd3eb1b0_1
+ - itsdangerous=2.0.1=pyhd3eb1b0_0
+ - jasper=1.900.1=h07fcdf6_1006
+ - jedi=0.18.0=py38h06a4308_1
+ - jinja2=2.11.3=pyhd3eb1b0_0
+ - joblib=1.1.0=pyhd8ed1ab_0
+ - jpeg=9d=h36c2ea0_0
+ - json5=0.9.6=pyhd3eb1b0_0
+ - jsonnet=0.17.0=py38hadf7658_0
+ - jsonschema=3.2.0=py_2
+ - jupyter_client=6.1.12=pyhd3eb1b0_0
+ - jupyter_core=4.7.1=py38h06a4308_0
+ - jupyter_server=1.4.1=py38h06a4308_0
+ - jupyterlab=3.1.7=pyhd3eb1b0_0
+ - jupyterlab_pygments=0.1.2=py_0
+ - jupyterlab_server=2.7.1=pyhd3eb1b0_0
+ - jupyterlab_widgets=1.0.0=pyhd3eb1b0_1
+ - kiwisolver=1.3.1=py38h1fd1430_1
+ - krb5=1.19.2=hcc1bbae_0
+ - lame=3.100=h7f98852_1001
+ - lcms2=2.12=h3be6417_0
+ - ld_impl_linux-64=2.35.1=h7274673_9
+ - libarchive=3.4.2=h62408e4_0
+ - libblas=3.9.0=11_linux64_mkl
+ - libbrotlicommon=1.0.9=h7f98852_5
+ - libbrotlidec=1.0.9=h7f98852_5
+ - libbrotlienc=1.0.9=h7f98852_5
+ - libcblas=3.9.0=11_linux64_mkl
+ - libcurl=7.78.0=h2574ce0_0
+ - libedit=3.1.20191231=he28a2e2_2
+ - libev=4.33=h516909a_1
+ - libevent=2.1.10=hcdb4288_3
+ - libffi=3.3=he6710b0_2
+ - libgcc-ng=11.1.0=hc902ee8_8
+ - libgfortran-ng=11.1.0=h69a702a_8
+ - libgfortran5=11.1.0=h6c583b3_8
+ - libglib=2.68.4=h3e27bee_0
+ - libiconv=1.16=h516909a_0
+ - liblapack=3.9.0=11_linux64_mkl
+ - liblapacke=3.9.0=11_linux64_mkl
+ - liblief=0.10.1=he6710b0_0
+ - libllvm11=11.1.0=hf817b99_2
+ - libnghttp2=1.43.0=h812cca2_0
+ - libogg=1.3.4=h7f98852_1
+ - libopencv=4.5.2=py38hcdf9bf1_0
+ - libopus=1.3.1=h7f98852_1
+ - libpng=1.6.37=hbc83047_0
+ - libpq=13.3=hd57d9b9_0
+ - libprotobuf=3.15.8=h780b84a_0
+ - libsodium=1.0.18=h7b6447c_0
+ - libssh2=1.9.0=ha56f1ee_6
+ - libstdcxx-ng=11.1.0=h56837e0_8
+ - libtiff=4.2.0=h85742a9_0
+ - libuuid=1.0.3=h1bed415_2
+ - libuv=1.40.0=h7b6447c_0
+ - libvorbis=1.3.7=h9c3ff4c_0
+ - libwebp-base=1.2.0=h27cfd23_0
+ - libxcb=1.14=h7b6447c_0
+ - libxkbcommon=1.0.3=he3ba5ed_0
+ - libxml2=2.9.12=h72842e0_0
+ - llvm-openmp=12.0.1=h4bd325d_1
+ - locket=0.2.0=py_2
+ - lz4-c=1.9.3=h295c915_1
+ - markdown=3.3.4=pyhd8ed1ab_0
+ - markupsafe=2.0.1=py38h27cfd23_0
+ - matplotlib=3.4.2=py38h578d9bd_0
+ - matplotlib-base=3.4.2=py38hab158f2_0
+ - matplotlib-inline=0.1.2=pyhd3eb1b0_2
+ - mistune=0.8.4=py38h7b6447c_1000
+ - mkl=2021.3.0=h06a4308_520
+ - mkl-service=2.4.0=py38h7f8727e_0
+ - mkl_fft=1.3.0=py38h42c9631_2
+ - mkl_random=1.2.2=py38h51133e4_0
+ - multidict=5.1.0=py38h497a2fe_1
+ - munkres=1.1.4=pyh9f0ad1d_0
+ - mysql-common=8.0.25=ha770c72_0
+ - mysql-libs=8.0.25=h935591d_0
+ - navigator-updater=0.2.1=py38_0
+ - nbclassic=0.2.6=pyhd3eb1b0_0
+ - nbclient=0.5.3=pyhd3eb1b0_0
+ - nbconvert=6.1.0=py38h06a4308_0
+ - nbformat=5.1.3=pyhd3eb1b0_0
+ - ncurses=6.2=he6710b0_1
+ - nest-asyncio=1.5.1=pyhd3eb1b0_0
+ - nettle=3.6=he412f7d_0
+ - networkx=2.3=py_0
+ - ninja=1.10.2=hff7bd54_1
+ - notebook=6.4.3=py38h06a4308_0
+ - nspr=4.30=h9c3ff4c_0
+ - nss=3.69=hb5efdd6_0
+ - numpy=1.20.3=py38hf144106_0
+ - numpy-base=1.20.3=py38h74d4b33_0
+ - oauthlib=3.1.1=pyhd8ed1ab_0
+ - olefile=0.46=py_0
+ - opencv=4.5.2=py38h578d9bd_0
+ - openh264=2.1.1=h780b84a_0
+ - openjpeg=2.3.0=h05c96fa_1
+ - openssl=1.1.1l=h7f98852_0
+ - packaging=21.0=pyhd3eb1b0_0
+ - pandas=1.3.2=py38h43a58ef_0
+ - pandocfilters=1.4.3=py38h06a4308_1
+ - parso=0.8.2=pyhd3eb1b0_0
+ - partd=1.2.0=pyhd8ed1ab_0
+ - patchelf=0.12=h2531618_1
+ - pathlib=1.0.1=py38h578d9bd_4
+ - patsy=0.5.1=py_0
+ - pcre=8.45=h295c915_0
+ - perl=5.26.2=h14c3975_0
+ - pexpect=4.8.0=pyhd3eb1b0_3
+ - pickleshare=0.7.5=pyhd3eb1b0_1003
+ - pillow=8.3.1=py38h2c7a002_0
+ - pip=21.2.2=py38h06a4308_0
+ - pixman=0.40.0=h36c2ea0_0
+ - pkginfo=1.7.1=py38h06a4308_0
+ - pooch=1.5.1=pyhd8ed1ab_0
+ - portalocker=1.7.0=py38h578d9bd_1
+ - prometheus_client=0.11.0=pyhd3eb1b0_0
+ - prompt-toolkit=3.0.17=pyh06a4308_0
+ - protobuf=3.15.8=py38h709712a_0
+ - psutil=5.8.0=py38h27cfd23_1
+ - ptyprocess=0.7.0=pyhd3eb1b0_2
+ - py-lief=0.10.1=py38h403a769_0
+ - py-opencv=4.5.2=py38hd0cf306_0
+ - pyasn1=0.4.8=py_0
+ - pyasn1-modules=0.2.7=py_0
+ - pycosat=0.6.3=py38h7b6447c_1
+ - pycparser=2.20=py_2
+ - pygments=2.10.0=pyhd3eb1b0_0
+ - pyjwt=2.1.0=pyhd8ed1ab_0
+ - pyopenssl=20.0.1=pyhd3eb1b0_1
+ - pyparsing=2.4.7=pyhd3eb1b0_0
+ - pypng=0.0.20=py_0
+ - pyqt=5.12.3=py38h578d9bd_7
+ - pyqt-impl=5.12.3=py38h7400c14_7
+ - pyqt5-sip=4.19.18=py38h709712a_7
+ - pyqtchart=5.12=py38h7400c14_7
+ - pyqtwebengine=5.12.1=py38h7400c14_7
+ - pyrsistent=0.17.3=py38h7b6447c_0
+ - pysocks=1.7.1=py38h06a4308_0
+ - python=3.8.10=h12debd9_8
+ - python-dateutil=2.8.2=pyhd3eb1b0_0
+ - python-libarchive-c=2.9=pyhd3eb1b0_1
+ - python-lmdb=0.99=py38h709712a_0
+ - python_abi=3.8=2_cp38
+ - pytorch=1.9.0=py3.8_cuda11.1_cudnn8.0.5_0
+ - pytz=2021.1=pyhd3eb1b0_0
+ - pyu2f=0.1.5=pyhd8ed1ab_0
+ - pywavelets=1.1.1=py38h5c078b8_3
+ - pyyaml=5.4.1=py38h27cfd23_1
+ - pyzmq=22.2.1=py38h295c915_1
+ - qt=5.12.9=hda022c4_4
+ - qtpy=1.9.0=py_0
+ - readline=8.1=h27cfd23_0
+ - regex=2021.8.28=py38h497a2fe_0
+ - requests=2.25.1=pyhd3eb1b0_0
+ - requests-oauthlib=1.3.0=pyh9f0ad1d_0
+ - ripgrep=12.1.1=0
+ - rsa=4.7.2=pyh44b312d_0
+ - ruamel_yaml=0.15.100=py38h27cfd23_0
+ - scikit-image=0.18.3=py38h43a58ef_0
+ - scikit-learn=1.0=py38hacb3eff_1
+ - scipy=1.7.1=py38h56a6a73_0
+ - seaborn=0.11.2=hd8ed1ab_0
+ - seaborn-base=0.11.2=pyhd8ed1ab_0
+ - send2trash=1.5.0=pyhd3eb1b0_1
+ - setuptools=52.0.0=py38h06a4308_0
+ - shapely=1.8.0=py38hf7953bd_1
+ - sip=4.19.13=py38he6710b0_0
+ - sniffio=1.2.0=py38h06a4308_1
+ - soupsieve=2.2.1=pyhd3eb1b0_0
+ - sqlite=3.36.0=hc218d9a_0
+ - statsmodels=0.12.2=py38h5c078b8_0
+ - tensorboard=2.6.0=pyhd8ed1ab_1
+ - tensorboard-data-server=0.6.0=py38h2b97feb_0
+ - tensorboard-plugin-wit=1.8.0=pyh44b312d_0
+ - tensorboardx=2.4=pyhd8ed1ab_0
+ - terminado=0.9.4=py38h06a4308_0
+ - testpath=0.5.0=pyhd3eb1b0_0
+ - threadpoolctl=3.0.0=pyh8a188c0_0
+ - tifffile=2019.7.26.2=py38_0
+ - tk=8.6.10=hbc83047_0
+ - toolz=0.11.1=py_0
+ - torchfile=0.1.0=py_0
+ - tornado=6.1=py38h27cfd23_0
+ - tqdm=4.62.1=pyhd3eb1b0_1
+ - traitlets=5.0.5=pyhd3eb1b0_0
+ - typing_extensions=3.10.0.0=pyh06a4308_0
+ - urllib3=1.26.6=pyhd3eb1b0_1
+ - wcwidth=0.2.5=py_0
+ - webencodings=0.5.1=py38_1
+ - werkzeug=1.0.1=pyhd3eb1b0_0
+ - wheel=0.37.0=pyhd3eb1b0_0
+ - widgetsnbextension=3.5.1=py38_0
+ - x264=1!161.3030=h7f98852_1
+ - xmltodict=0.12.0=py_0
+ - xz=5.2.5=h7b6447c_0
+ - yacs=0.1.6=py_0
+ - yaml=0.2.5=h7b6447c_0
+ - yarl=1.6.3=py38h497a2fe_2
+ - zeromq=4.3.4=h2531618_0
+ - zipp=3.5.0=pyhd3eb1b0_0
+ - zlib=1.2.11=h7b6447c_3
+ - zstd=1.4.9=haebb681_0
+ - pip:
+   - addict==2.4.0
+   - altair==4.2.0
+   - astor==0.8.1
+   - astunparse==1.6.3
+   - backports-zoneinfo==0.2.1
+   - base58==2.1.1
+   - basicsr==1.3.4.1
+   - boto3==1.18.33
+   - botocore==1.21.33
+   - clang==5.0
+   - clean-fid==0.1.22
+   - clip==1.0
+   - colorama==0.4.4
+   - commonmark==0.9.1
+   - cython==0.29.30
+   - einops==0.3.2
+   - enum-compat==0.0.3
+   - facexlib==0.2.0.3
+   - filterpy==1.4.5
+   - flatbuffers==1.12
+   - gast==0.4.0
+   - google-pasta==0.2.0
+   - grpcio==1.39.0
+   - h5py==3.1.0
+   - ipdb==0.13.9
+   - jacinle==1.0.0
+   - jmespath==0.10.0
+   - jsonpickle==2.2.0
+   - keras==2.7.0
+   - keras-preprocessing==1.1.2
+   - libclang==12.0.0
+   - llvmlite==0.37.0
+   - lpips==0.1.4
+   - numba==0.54.0
+   - opencv-python==4.5.3.56
+   - opt-einsum==3.3.0
+   - pkgconfig==1.5.5
+   - pyarrow==8.0.0
+   - pydantic==1.8.2
+   - pydeck==0.7.1
+   - pyhocon==0.3.58
+   - pytz-deprecation-shim==0.1.0.post0
+   - pyvis==0.2.1
+   - realesrgan==0.2.2.3
+   - rich==10.9.0
+   - s3transfer==0.5.0
+   - six==1.15.0
+   - sklearn==0.0
+   - streamlit==0.64.0
+   - tabulate==0.8.9
+   - tb-nightly==2.7.0a20210827
+   - tensorflow-estimator==2.7.0
+   - tensorflow-gpu==2.7.0
+   - tensorflow-io-gcs-filesystem==0.21.0
+   - tensorfn==0.1.19
+   - termcolor==1.1.0
+   - toml==0.10.2
+   - torchsample==0.1.3
+   - torchvision==0.10.0+cu111
+   - typing-extensions==3.7.4.3
+   - tzdata==2022.1
+   - tzlocal==4.2
+   - validators==0.19.0
+   - vit-pytorch==0.24.3
+   - watchdog==2.1.8
+   - wrapt==1.12.1
+   - yapf==0.31.0
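Assuming a standard conda installation, this environment should be reproducible with `conda env create -f env.yaml` (the README's prerequisites point to this file but do not spell out the command).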
expansion/__init__.py ADDED
File without changes
expansion/dataloader/__init__.py ADDED
File without changes
expansion/dataloader/__pycache__/__init__.cpython-38.pyc ADDED
Binary file (162 Bytes).
expansion/dataloader/__pycache__/seqlist.cpython-38.pyc ADDED
Binary file (1.12 kB).
expansion/dataloader/chairslist.py ADDED
@@ -0,0 +1,33 @@
+ import torch.utils.data as data
+
+ from PIL import Image
+ import os
+ import os.path
+ import numpy as np
+ import glob
+
+ IMG_EXTENSIONS = [
+     '.jpg', '.JPG', '.jpeg', '.JPEG',
+     '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
+ ]
+
+
+ def is_image_file(filename):
+     return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
+
+ def dataloader(filepath):
+     l0_train = []
+     l1_train = []
+     flow_train = []
+     for flow_map in sorted(glob.glob(os.path.join(filepath,'*_flow.flo'))):
+         root_filename = flow_map[:-9]
+         img1 = root_filename+'_img1.ppm'
+         img2 = root_filename+'_img2.ppm'
+         if not (os.path.isfile(os.path.join(filepath,img1)) and os.path.isfile(os.path.join(filepath,img2))):
+             continue
+
+         l0_train.append(img1)
+         l1_train.append(img2)
+         flow_train.append(flow_map)
+
+     return l0_train, l1_train, flow_train
expansion/dataloader/chairssdlist.py ADDED
@@ -0,0 +1,30 @@
+ import torch.utils.data as data
+
+ from PIL import Image
+ import os
+ import os.path
+ import numpy as np
+ import glob
+
+ IMG_EXTENSIONS = [
+     '.jpg', '.JPG', '.jpeg', '.JPEG',
+     '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
+ ]
+
+
+ def is_image_file(filename):
+     return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
+
+ def dataloader(filepath):
+     l0_train = []
+     l1_train = []
+     flow_train = []
+     for flow_map in sorted(glob.glob('%s/flow/*.pfm'%filepath)):
+         img1 = flow_map.replace('flow','t0').replace('.pfm','.png')
+         img2 = flow_map.replace('flow','t1').replace('.pfm','.png')
+
+         l0_train.append(img1)
+         l1_train.append(img2)
+         flow_train.append(flow_map)
+
+     return l0_train, l1_train, flow_train
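The pairing of flow maps with the two image frames here relies purely on string substitution; a small illustration with a hypothetical ChairsSDHom-style path:
```python
# Hypothetical path, mirroring the replace-based pairing in dataloader() above.
flow_map = 'ChairsSDHom/data/train/flow/00001.pfm'
img1 = flow_map.replace('flow', 't0').replace('.pfm', '.png')
img2 = flow_map.replace('flow', 't1').replace('.pfm', '.png')
assert img1 == 'ChairsSDHom/data/train/t0/00001.png'
assert img2 == 'ChairsSDHom/data/train/t1/00001.png'
```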
expansion/dataloader/depth_transforms.py ADDED
@@ -0,0 +1,471 @@
+ from __future__ import division
+ import torch
+ import random
+ import numpy as np
+ import numbers
+ import types
+ import scipy.ndimage as ndimage
+ import pdb
+ import torchvision
+ import PIL.Image as Image
+ import cv2
+ from torch.nn import functional as F
+
+
+ class Compose(object):
+     """ Composes several co_transforms together.
+     For example:
+     >>> co_transforms.Compose([
+     >>>     co_transforms.CenterCrop(10),
+     >>>     co_transforms.ToTensor(),
+     >>> ])
+     """
+
+     def __init__(self, co_transforms):
+         self.co_transforms = co_transforms
+
+     def __call__(self, input, target, intr):
+         for t in self.co_transforms:
+             input, target, intr = t(input, target, intr)
+         return input, target, intr
+
+
+ class Scale(object):
+     """ Rescales the inputs and target arrays to the given 'size'.
+     'size' will be the size of the smaller edge.
+     For example, if height > width, then image will be
+     rescaled to (size * height / width, size)
+     size: size of the smaller edge
+     interpolation order: Default: 2 (bilinear)
+     """
+
+     def __init__(self, size, order=1):
+         self.ratio = size
+         self.order = order
+         if order==0:
+             self.code=cv2.INTER_NEAREST
+         elif order==1:
+             self.code=cv2.INTER_LINEAR
+         elif order==2:
+             self.code=cv2.INTER_CUBIC
+
+     def __call__(self, inputs, target):
+         if self.ratio==1:
+             return inputs, target
+         h, w, _ = inputs[0].shape
+         ratio = self.ratio
+
+         inputs[0] = cv2.resize(inputs[0], None, fx=ratio,fy=ratio,interpolation=cv2.INTER_LINEAR)
+         inputs[1] = cv2.resize(inputs[1], None, fx=ratio,fy=ratio,interpolation=cv2.INTER_LINEAR)
+         # keep the mask same
+         tmp = cv2.resize(target[:,:,2], None, fx=ratio,fy=ratio,interpolation=cv2.INTER_NEAREST)
+         target = cv2.resize(target, None, fx=ratio,fy=ratio,interpolation=self.code) * ratio
+         target[:,:,2] = tmp
+
+         return inputs, target
+
+
+ class RandomCrop(object):
+     """Crops the given PIL.Image at a random location to have a region of
+     the given size. size can be a tuple (target_height, target_width)
+     or an integer, in which case the target will be of a square shape (size, size)
+     """
+
+     def __init__(self, size):
+         if isinstance(size, numbers.Number):
+             self.size = (int(size), int(size))
+         else:
+             self.size = size
+
+     def __call__(self, inputs, target, intr):
+         h, w, _ = inputs[0].shape
+         th, tw = self.size
+         if w < tw: tw=w
+         if h < th: th=h
+
+         x1 = random.randint(0, w - tw)
+         y1 = random.randint(0, h - th)
+         intr[1] -= x1
+         intr[2] -= y1
+
+         inputs[0] = inputs[0][y1: y1 + th, x1: x1 + tw].astype(float)
+         inputs[1] = inputs[1][y1: y1 + th, x1: x1 + tw].astype(float)
+         return inputs, target[y1: y1 + th, x1: x1 + tw].astype(float), list(np.asarray(intr).astype(float)) + list(np.asarray([1.,0.,0.,1.,0.,0.]).astype(float))
+
+
+ class SpatialAug(object):
+     def __init__(self, crop, scale=None, rot=None, trans=None, squeeze=None, schedule_coeff=1, order=1, black=False):
+         self.crop = crop
+         self.scale = scale
+         self.rot = rot
+         self.trans = trans
+         self.squeeze = squeeze
+         self.t = np.zeros(6)
+         self.schedule_coeff = schedule_coeff
+         self.order = order
+         self.black = black
+
+     def to_identity(self):
+         self.t[0] = 1; self.t[2] = 0; self.t[4] = 0; self.t[1] = 0; self.t[3] = 1; self.t[5] = 0;
+
+     def left_multiply(self, u0, u1, u2, u3, u4, u5):
+         result = np.zeros(6)
+         result[0] = self.t[0]*u0 + self.t[1]*u2;
+         result[1] = self.t[0]*u1 + self.t[1]*u3;
+
+         result[2] = self.t[2]*u0 + self.t[3]*u2;
+         result[3] = self.t[2]*u1 + self.t[3]*u3;
+
+         result[4] = self.t[4]*u0 + self.t[5]*u2 + u4;
+         result[5] = self.t[4]*u1 + self.t[5]*u3 + u5;
+         self.t = result
+
+     def inverse(self):
+         result = np.zeros(6)
+         a = self.t[0]; c = self.t[2]; e = self.t[4];
+         b = self.t[1]; d = self.t[3]; f = self.t[5];
+
+         denom = a*d - b*c;
+
+         result[0] = d / denom;
+         result[1] = -b / denom;
+         result[2] = -c / denom;
+         result[3] = a / denom;
+         result[4] = (c*f-d*e) / denom;
+         result[5] = (b*e-a*f) / denom;
+
+         return result
+
+     def grid_transform(self, meshgrid, t, normalize=True, gridsize=None):
+         if gridsize is None:
+             h, w = meshgrid[0].shape
+         else:
+             h, w = gridsize
+         vgrid = torch.cat([(meshgrid[0] * t[0] + meshgrid[1] * t[2] + t[4])[:,:,np.newaxis],
+                            (meshgrid[0] * t[1] + meshgrid[1] * t[3] + t[5])[:,:,np.newaxis]],-1)
+         if normalize:
+             vgrid[:,:,0] = 2.0*vgrid[:,:,0]/max(w-1,1)-1.0
+             vgrid[:,:,1] = 2.0*vgrid[:,:,1]/max(h-1,1)-1.0
+         return vgrid
+
+
+     def __call__(self, inputs, target, intr):
+         h, w, _ = inputs[0].shape
+         th, tw = self.crop
+         meshgrid = torch.meshgrid([torch.Tensor(range(th)), torch.Tensor(range(tw))])[::-1]
+         cornergrid = torch.meshgrid([torch.Tensor([0,th-1]), torch.Tensor([0,tw-1])])[::-1]
+
+         for i in range(50):
+             # im0
+             self.to_identity()
+             #TODO add mirror
+             if np.random.binomial(1,0.5):
+                 mirror = True
+             else:
+                 mirror = False
+             ##TODO
+             #mirror = False
+             if mirror:
+                 self.left_multiply(-1, 0, 0, 1, .5 * tw, -.5 * th);
+             else:
+                 self.left_multiply(1, 0, 0, 1, -.5 * tw, -.5 * th);
+             scale0 = 1; scale1 = 1; squeeze0 = 1; squeeze1 = 1;
+             if not self.rot is None:
+                 rot0 = np.random.uniform(-self.rot[0],+self.rot[0])
+                 rot1 = np.random.uniform(-self.rot[1]*self.schedule_coeff, self.rot[1]*self.schedule_coeff) + rot0
+                 self.left_multiply(np.cos(rot0), np.sin(rot0), -np.sin(rot0), np.cos(rot0), 0, 0)
+             if not self.trans is None:
+                 trans0 = np.random.uniform(-self.trans[0],+self.trans[0], 2)
+                 trans1 = np.random.uniform(-self.trans[1]*self.schedule_coeff,+self.trans[1]*self.schedule_coeff, 2) + trans0
+                 self.left_multiply(1, 0, 0, 1, trans0[0] * tw, trans0[1] * th)
+             if not self.squeeze is None:
+                 squeeze0 = np.exp(np.random.uniform(-self.squeeze[0], self.squeeze[0]))
+                 squeeze1 = np.exp(np.random.uniform(-self.squeeze[1]*self.schedule_coeff, self.squeeze[1]*self.schedule_coeff)) * squeeze0
+             if not self.scale is None:
+                 scale0 = np.exp(np.random.uniform(self.scale[2]-self.scale[0], self.scale[2]+self.scale[0]))
+                 scale1 = np.exp(np.random.uniform(-self.scale[1]*self.schedule_coeff, self.scale[1]*self.schedule_coeff)) * scale0
+             self.left_multiply(1.0/(scale0*squeeze0), 0, 0, 1.0/(scale0/squeeze0), 0, 0)
+
+             self.left_multiply(1, 0, 0, 1, .5 * w, .5 * h);
+             transmat0 = self.t.copy()
+
+             # im1
+             self.to_identity()
+             if mirror:
+                 self.left_multiply(-1, 0, 0, 1, .5 * tw, -.5 * th);
+             else:
+                 self.left_multiply(1, 0, 0, 1, -.5 * tw, -.5 * th);
+             if not self.rot is None:
+                 self.left_multiply(np.cos(rot1), np.sin(rot1), -np.sin(rot1), np.cos(rot1), 0, 0)
+             if not self.trans is None:
+                 self.left_multiply(1, 0, 0, 1, trans1[0] * tw, trans1[1] * th)
+             self.left_multiply(1.0/(scale1*squeeze1), 0, 0, 1.0/(scale1/squeeze1), 0, 0)
+             self.left_multiply(1, 0, 0, 1, .5 * w, .5 * h);
+             transmat1 = self.t.copy()
+             transmat1_inv = self.inverse()
+
+             if self.black:
+                 # black augmentation, allowing 0 values in the input images
+                 # https://github.com/lmb-freiburg/flownet2/blob/master/src/caffe/layers/black_augmentation_layer.cu
+                 break
+             else:
+                 if ((self.grid_transform(cornergrid, transmat0, gridsize=[float(h),float(w)]).abs()>1).sum() +\
+                     (self.grid_transform(cornergrid, transmat1, gridsize=[float(h),float(w)]).abs()>1).sum()) == 0:
+                     break
+             if i==49:
+                 print('max_iter in augmentation')
+                 self.to_identity()
+                 self.left_multiply(1, 0, 0, 1, -.5 * tw, -.5 * th);
+                 self.left_multiply(1, 0, 0, 1, .5 * w, .5 * h);
+                 transmat0 = self.t.copy()
+                 transmat1 = self.t.copy()
+
+         # do the real work
+         vgrid = self.grid_transform(meshgrid, transmat0, gridsize=[float(h),float(w)])
+         inputs_0 = F.grid_sample(torch.Tensor(inputs[0]).permute(2,0,1)[np.newaxis], vgrid[np.newaxis])[0].permute(1,2,0)
+         if self.order == 0:
+             target_0 = F.grid_sample(torch.Tensor(target).permute(2,0,1)[np.newaxis], vgrid[np.newaxis], mode='nearest')[0].permute(1,2,0)
+         else:
+             target_0 = F.grid_sample(torch.Tensor(target).permute(2,0,1)[np.newaxis], vgrid[np.newaxis])[0].permute(1,2,0)
+
+         mask_0 = target[:,:,2:3].copy(); mask_0[mask_0==0]=np.nan
+         if self.order == 0:
+             mask_0 = F.grid_sample(torch.Tensor(mask_0).permute(2,0,1)[np.newaxis], vgrid[np.newaxis], mode='nearest')[0].permute(1,2,0)
+         else:
+             mask_0 = F.grid_sample(torch.Tensor(mask_0).permute(2,0,1)[np.newaxis], vgrid[np.newaxis])[0].permute(1,2,0)
+         mask_0[torch.isnan(mask_0)] = 0
+
+
+         vgrid = self.grid_transform(meshgrid, transmat1, gridsize=[float(h),float(w)])
+         inputs_1 = F.grid_sample(torch.Tensor(inputs[1]).permute(2,0,1)[np.newaxis], vgrid[np.newaxis])[0].permute(1,2,0)
+
+         # flow
+         pos = target_0[:,:,:2] + self.grid_transform(meshgrid, transmat0, normalize=False)
+         pos = self.grid_transform(pos.permute(2,0,1), transmat1_inv, normalize=False)
+         if target_0.shape[2]>=4:
+             # scale
+             exp = target_0[:,:,3:] * scale1 / scale0
+             target = torch.cat([ (pos[:,:,0] - meshgrid[0]).unsqueeze(-1),
+                                  (pos[:,:,1] - meshgrid[1]).unsqueeze(-1),
+                                  mask_0,
+                                  exp], -1)
+         else:
+             target = torch.cat([ (pos[:,:,0] - meshgrid[0]).unsqueeze(-1),
+                                  (pos[:,:,1] - meshgrid[1]).unsqueeze(-1),
+                                  mask_0], -1)
+         inputs = [np.asarray(inputs_0).astype(float), np.asarray(inputs_1).astype(float)]
+         target = np.asarray(target).astype(float)
+         return inputs, target, list(np.asarray(intr+list(transmat0)).astype(float))
+
+
+
+ class pseudoPCAAug(object):
+     """
+     Chromatic Eigen Augmentation: https://github.com/lmb-freiburg/flownet2/blob/master/src/caffe/layers/data_augmentation_layer.cu
+     This version is faster.
+     """
+     def __init__(self, schedule_coeff=1):
+         self.augcolor = torchvision.transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.5, hue=0.5/3.14)
+
+     def __call__(self, inputs, target, intr):
+         img = np.concatenate([inputs[0],inputs[1]],0)
+         shape = img.shape[0]//2
+         aug_img = np.asarray(self.augcolor(Image.fromarray(np.uint8(img*255))))/255.
+         inputs[0] = aug_img[:shape]
+         inputs[1] = aug_img[shape:]
+         #inputs[0] = np.asarray(self.augcolor(Image.fromarray(np.uint8(inputs[0]*255))))/255.
+         #inputs[1] = np.asarray(self.augcolor(Image.fromarray(np.uint8(inputs[1]*255))))/255.
+         return inputs, target, intr
+
+
+ class PCAAug(object):
+     """
+     Chromatic Eigen Augmentation: https://github.com/lmb-freiburg/flownet2/blob/master/src/caffe/layers/data_augmentation_layer.cu
+     """
+     def __init__(self, lmult_pow  =[0.4, 0,-0.2],
+                        lmult_mult =[0.4, 0,0,  ],
+                        lmult_add  =[0.03,0,0,  ],
+                        sat_pow    =[0.4, 0,0,  ],
+                        sat_mult   =[0.5, 0,-0.3],
+                        sat_add    =[0.03,0,0,  ],
+                        col_pow    =[0.4, 0,0,  ],
+                        col_mult   =[0.2, 0,0,  ],
+                        col_add    =[0.02,0,0,  ],
+                        ladd_pow   =[0.4, 0,0,  ],
+                        ladd_mult  =[0.4, 0,0,  ],
+                        ladd_add   =[0.04,0,0,  ],
+                        col_rotate =[1.,  0,0,  ],
+                        schedule_coeff=1):
+         # no mean
+         self.pow_nomean = [1,1,1]
+         self.add_nomean = [0,0,0]
+         self.mult_nomean = [1,1,1]
+         self.pow_withmean = [1,1,1]
+         self.add_withmean = [0,0,0]
+         self.mult_withmean = [1,1,1]
+         self.lmult_pow = 1
+         self.lmult_mult = 1
+         self.lmult_add = 0
+         self.col_angle = 0
+         if not ladd_pow is None:
+             self.pow_nomean[0] = np.exp(np.random.normal(ladd_pow[2], ladd_pow[0]))
+         if not col_pow is None:
+             self.pow_nomean[1] = np.exp(np.random.normal(col_pow[2], col_pow[0]))
+             self.pow_nomean[2] = np.exp(np.random.normal(col_pow[2], col_pow[0]))
+
+         if not ladd_add is None:
+             self.add_nomean[0] = np.random.normal(ladd_add[2], ladd_add[0])
+         if not col_add is None:
+             self.add_nomean[1] = np.random.normal(col_add[2], col_add[0])
+             self.add_nomean[2] = np.random.normal(col_add[2], col_add[0])
+
+         if not ladd_mult is None:
+             self.mult_nomean[0] = np.exp(np.random.normal(ladd_mult[2], ladd_mult[0]))
+         if not col_mult is None:
+             self.mult_nomean[1] = np.exp(np.random.normal(col_mult[2], col_mult[0]))
+             self.mult_nomean[2] = np.exp(np.random.normal(col_mult[2], col_mult[0]))
+
+         # with mean
+         if not sat_pow is None:
+             self.pow_withmean[1] = np.exp(np.random.uniform(sat_pow[2]-sat_pow[0], sat_pow[2]+sat_pow[0]))
+             self.pow_withmean[2] = self.pow_withmean[1]
+         if not sat_add is None:
+             self.add_withmean[1] = np.random.uniform(sat_add[2]-sat_add[0], sat_add[2]+sat_add[0])
+             self.add_withmean[2] = self.add_withmean[1]
+         if not sat_mult is None:
+             self.mult_withmean[1] = np.exp(np.random.uniform(sat_mult[2]-sat_mult[0], sat_mult[2]+sat_mult[0]))
+             self.mult_withmean[2] = self.mult_withmean[1]
+
+         if not lmult_pow is None:
+             self.lmult_pow = np.exp(np.random.uniform(lmult_pow[2]-lmult_pow[0], lmult_pow[2]+lmult_pow[0]))
+         if not lmult_mult is None:
+             self.lmult_mult = np.exp(np.random.uniform(lmult_mult[2]-lmult_mult[0], lmult_mult[2]+lmult_mult[0]))
+         if not lmult_add is None:
+             self.lmult_add = np.random.uniform(lmult_add[2]-lmult_add[0], lmult_add[2]+lmult_add[0])
+         if not col_rotate is None:
+             self.col_angle = np.random.uniform(col_rotate[2]-col_rotate[0], col_rotate[2]+col_rotate[0])
+
+         # eigen vectors
+         self.eigvec = np.reshape([0.51,0.56,0.65,0.79,0.01,-0.62,0.35,-0.83,0.44],[3,3]).transpose()
+
+
+     def __call__(self, inputs, target, intr):
+         inputs[0] = self.pca_image(inputs[0])
+         inputs[1] = self.pca_image(inputs[1])
+         return inputs, target, intr
+
+     def pca_image(self, rgb):
+         eig = np.dot(rgb, self.eigvec)
+         max_rgb = np.clip(rgb,0,np.inf).max((0,1))
+         min_rgb = rgb.min((0,1))
+         mean_rgb = rgb.mean((0,1))
+         max_abs_eig = np.abs(eig).max((0,1))
+         max_l = np.sqrt(np.sum(max_abs_eig*max_abs_eig))
+         mean_eig = np.dot(mean_rgb, self.eigvec)
+
+         # no-mean stuff
+         eig -= mean_eig[np.newaxis, np.newaxis]
+
+         for c in range(3):
+             if max_abs_eig[c] > 1e-2:
+                 mean_eig[c] /= max_abs_eig[c]
+                 eig[:,:,c] = eig[:,:,c] / max_abs_eig[c];
+                 eig[:,:,c] = np.power(np.abs(eig[:,:,c]),self.pow_nomean[c]) *\
+                              ((eig[:,:,c] > 0) -0.5)*2
+                 eig[:,:,c] = eig[:,:,c] + self.add_nomean[c]
+                 eig[:,:,c] = eig[:,:,c] * self.mult_nomean[c]
+         eig += mean_eig[np.newaxis,np.newaxis]
+
+         # withmean stuff
+         if max_abs_eig[0] > 1e-2:
+             eig[:,:,0] = np.power(np.abs(eig[:,:,0]),self.pow_withmean[0]) * \
+                          ((eig[:,:,0]>0)-0.5)*2;
+             eig[:,:,0] = eig[:,:,0] + self.add_withmean[0];
+             eig[:,:,0] = eig[:,:,0] * self.mult_withmean[0];
+
+         s = np.sqrt(eig[:,:,1]*eig[:,:,1] + eig[:,:,2] * eig[:,:,2])
+         smask = s > 1e-2
+         s1 = np.power(s, self.pow_withmean[1]);
+         s1 = np.clip(s1 + self.add_withmean[1], 0,np.inf)
+         s1 = s1 * self.mult_withmean[1]
+         s1 = s1 * smask + s*(1-smask)
+
+         # color angle
+         if self.col_angle!=0:
+             temp1 = np.cos(self.col_angle) * eig[:,:,1] - np.sin(self.col_angle) * eig[:,:,2]
+             temp2 = np.sin(self.col_angle) * eig[:,:,1] + np.cos(self.col_angle) * eig[:,:,2]
+             eig[:,:,1] = temp1
+             eig[:,:,2] = temp2
+
+         # to origin magnitude
+         for c in range(3):
+             if max_abs_eig[c] > 1e-2:
+                 eig[:,:,c] = eig[:,:,c] * max_abs_eig[c]
+
+         if max_l > 1e-2:
+             l1 = np.sqrt(eig[:,:,0]*eig[:,:,0] + eig[:,:,1]*eig[:,:,1] + eig[:,:,2]*eig[:,:,2])
+             l1 = l1 / max_l
+
+         eig[:,:,1][smask] = (eig[:,:,1] / s * s1)[smask]
+         eig[:,:,2][smask] = (eig[:,:,2] / s * s1)[smask]
+         #eig[:,:,1] = (eig[:,:,1] / s * s1) * smask + eig[:,:,1] * (1-smask)
+         #eig[:,:,2] = (eig[:,:,2] / s * s1) * smask + eig[:,:,2] * (1-smask)
+
+         if max_l > 1e-2:
+             l = np.sqrt(eig[:,:,0]*eig[:,:,0] + eig[:,:,1]*eig[:,:,1] + eig[:,:,2]*eig[:,:,2])
+             l1 = np.power(l1, self.lmult_pow)
+             l1 = np.clip(l1 + self.lmult_add, 0, np.inf)
+             l1 = l1 * self.lmult_mult
+             l1 = l1 * max_l
+             lmask = l > 1e-2
+             eig[lmask] = (eig / l[:,:,np.newaxis] * l1[:,:,np.newaxis])[lmask]
+             for c in range(3):
+                 eig[:,:,c][lmask] = (np.clip(eig[:,:,c], -np.inf, max_abs_eig[c]))[lmask]
+             # for c in range(3):
+             #     # eig[:,:,c][lmask] = (eig[:,:,c] / l * l1)[lmask] * lmask + eig[:,:,c] * (1-lmask)
+             #     eig[:,:,c][lmask] = (eig[:,:,c] / l * l1)[lmask]
+             #     eig[:,:,c] = (np.clip(eig[:,:,c], -np.inf, max_abs_eig[c])) * lmask + eig[:,:,c] * (1-lmask)
+
+         return np.clip(np.dot(eig, self.eigvec.transpose()), 0, 1)
+
+
+ class ChromaticAug(object):
+     """
+     Chromatic augmentation: https://github.com/lmb-freiburg/flownet2/blob/master/src/caffe/layers/data_augmentation_layer.cu
+     """
+     def __init__(self, noise = 0.06,
+                        gamma = 0.02,
+                        brightness = 0.02,
+                        contrast = 0.02,
+                        color = 0.02,
+                        schedule_coeff=1):
+
+         self.noise = np.random.uniform(0,noise)
+         self.gamma = np.exp(np.random.normal(0, gamma*schedule_coeff))
+         self.brightness = np.random.normal(0, brightness*schedule_coeff)
+         self.contrast = np.exp(np.random.normal(0, contrast*schedule_coeff))
+         self.color = np.exp(np.random.normal(0, color*schedule_coeff,3))
+
+     def __call__(self, inputs, target, intr):
+         inputs[1] = self.chrom_aug(inputs[1])
+         # noise
+         inputs[0] += np.random.normal(0, self.noise, inputs[0].shape)
+         inputs[1] += np.random.normal(0, self.noise, inputs[0].shape)
+         return inputs, target, intr
+
+     def chrom_aug(self, rgb):
+         # color change
+         mean_in = rgb.sum(-1)
+         rgb = rgb*self.color[np.newaxis,np.newaxis]
+         brightness_coeff = mean_in / (rgb.sum(-1)+0.01)
+         rgb = np.clip(rgb*brightness_coeff[:,:,np.newaxis],0,1)
+         # gamma
+         rgb = np.power(rgb,self.gamma)
+         # brightness
+         rgb += self.brightness
+         # contrast
+         rgb = 0.5 + (rgb-0.5)*self.contrast
+         rgb = np.clip(rgb, 0, 1)
+         return rgb
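As a sanity check on the affine bookkeeping in `SpatialAug` above (`left_multiply` composes the stored 2x3 transform with a new one; `inverse` returns the inverse without modifying the stored state), composing a transform with its inverse should give the identity. A small sketch, not part of the commit:
```python
import numpy as np
from expansion.dataloader.depth_transforms import SpatialAug  # path as in this commit

aug = SpatialAug(crop=(320, 448))
aug.to_identity()
# A rotation by 0.1 rad plus a translation of (5, -3), in the class's layout.
aug.left_multiply(np.cos(0.1), np.sin(0.1), -np.sin(0.1), np.cos(0.1), 5.0, -3.0)
t_inv = aug.inverse()      # inverse of the current transform
aug.left_multiply(*t_inv)  # compose: T followed by T^-1
assert np.allclose(aug.t, [1, 0, 0, 1, 0, 0])
```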
expansion/dataloader/depthloader.py ADDED
@@ -0,0 +1,222 @@
+ import os
+ import numbers
+ import torch
+ import torch.utils.data as data
+ import torch
+ import torchvision.transforms as transforms
+ import random
+ from PIL import Image, ImageOps
+ import numpy as np
+ import torchvision
+ from . import depth_transforms as flow_transforms
+ import pdb
+ import cv2
+ from utils.flowlib import read_flow
+ from utils.util_flow import readPFM, load_calib_cam_to_cam
+
+ def default_loader(path):
+     return Image.open(path).convert('RGB')
+
+ def flow_loader(path):
+     if '.pfm' in path:
+         data = readPFM(path)[0]
+         data[:,:,2] = 1
+         return data
+     else:
+         return read_flow(path)
+
+ def load_exts(cam_file):
+     with open(cam_file, 'r') as f:
+         lines = f.readlines()
+
+     l_exts = []
+     r_exts = []
+     for l in lines:
+         if 'L ' in l:
+             l_exts.append(np.asarray([float(i) for i in l[2:].strip().split(' ')]).reshape(4,4))
+         if 'R ' in l:
+             r_exts.append(np.asarray([float(i) for i in l[2:].strip().split(' ')]).reshape(4,4))
+     return l_exts, r_exts
+
+ def disparity_loader(path):
+     if '.png' in path:
+         data = Image.open(path)
+         data = np.ascontiguousarray(data,dtype=np.float32)/256
+         return data
+     else:
+         return readPFM(path)[0]
+
+ # triangulation
+ def triangulation(disp, xcoord, ycoord, bl=1, fl = 450, cx = 479.5, cy = 269.5):
+     depth = bl*fl / disp # 450px->15mm focal length
+     X = (xcoord - cx) * depth / fl
+     Y = (ycoord - cy) * depth / fl
+     Z = depth
+     P = np.concatenate((X[np.newaxis],Y[np.newaxis],Z[np.newaxis]),0).reshape(3,-1)
+     P = np.concatenate((P,np.ones((1,P.shape[-1]))),0)
+     return P
+
+ class myImageFloder(data.Dataset):
+     def __init__(self, iml0, iml1, flowl0, loader=default_loader, dploader= flow_loader, scale=1.,shape=[320,448], order=1, noise=0.06, pca_augmentor=True, prob = 1.,sc=False,disp0=None,disp1=None,calib=None ):
+         self.iml0 = iml0
+         self.iml1 = iml1
+         self.flowl0 = flowl0
+         self.loader = loader
+         self.dploader = dploader
+         self.scale = scale
+         self.shape = shape
+         self.order = order
+         self.noise = noise
+         self.pca_augmentor = pca_augmentor
+         self.prob = prob
+         self.sc = sc
+         self.disp0 = disp0
+         self.disp1 = disp1
+         self.calib = calib
+
+     def __getitem__(self, index):
+         iml0 = self.iml0[index]
+         iml1 = self.iml1[index]
+         flowl0 = self.flowl0[index]
+         th, tw = self.shape
+
+         iml0 = self.loader(iml0)
+         iml1 = self.loader(iml1)
+
+         # get disparity
+         if self.sc:
+             flowl0 = self.dploader(flowl0)
+             flowl0 = np.ascontiguousarray(flowl0,dtype=np.float32)
+             flowl0[np.isnan(flowl0)] = 1e6 # set to max
+             if 'camera_data.txt' in self.calib[index]:
+                 bl = 1
+                 if '15mm_' in self.calib[index]:
+                     fl = 450 # 450
+                 else:
+                     fl = 1050
+                 cx = 479.5
+                 cy = 269.5
+                 # negative disp
+                 d1 = np.abs(disparity_loader(self.disp0[index]))
+                 d2 = np.abs(disparity_loader(self.disp1[index]) + d1)
+             elif 'Sintel' in self.calib[index]:
+                 fl = 1000
+                 bl = 1
+                 cx = 511.5
+                 cy = 217.5
+                 d1 = np.zeros(flowl0.shape[:2])
+                 d2 = np.zeros(flowl0.shape[:2])
+             else:
+                 ints = load_calib_cam_to_cam(self.calib[index])
+                 fl = ints['K_cam2'][0,0]
+                 cx = ints['K_cam2'][0,2]
+                 cy = ints['K_cam2'][1,2]
+                 bl = ints['b20']-ints['b30']
+                 d1 = disparity_loader(self.disp0[index])
+                 d2 = disparity_loader(self.disp1[index])
+             #flowl0[:,:,2] = (flowl0[:,:,2]==1).astype(float)
+             flowl0[:,:,2] = np.logical_and(np.logical_and(flowl0[:,:,2]==1, d1!=0), d2!=0).astype(float)
+
+             shape = d1.shape
+             mesh = np.meshgrid(range(shape[1]),range(shape[0]))
+             xcoord = mesh[0].astype(float)
+             ycoord = mesh[1].astype(float)
+
+             # triangulation in two frames
+             P0 = triangulation(d1, xcoord, ycoord, bl=bl, fl = fl, cx = cx, cy = cy)
+             P1 = triangulation(d2, xcoord + flowl0[:,:,0], ycoord + flowl0[:,:,1], bl=bl, fl = fl, cx = cx, cy = cy)
+             dis0 = P0[2]
+             dis1 = P1[2]
+
+             change_size = dis0.reshape(shape).astype(np.float32)
+             flow3d = (P1-P0)[:3].reshape((3,)+shape).transpose((1,2,0))
+
+             gt_normal = np.concatenate((d1[:,:,np.newaxis],d2[:,:,np.newaxis],d2[:,:,np.newaxis]),-1)
+             change_size = np.concatenate((change_size[:,:,np.newaxis],gt_normal,flow3d),2)
+         else:
+             shape = iml0.size
+             shape = [shape[1],shape[0]]
+             flowl0 = np.zeros((shape[0],shape[1],3))
+             change_size = np.zeros((shape[0],shape[1],7))
+             depth = disparity_loader(self.iml1[index].replace('camera','groundtruth'))
+             change_size[:,:,0] = depth
+
+             seqid = self.iml0[index].split('/')[-5].rsplit('_',3)[0]
+             ints = load_calib_cam_to_cam('/data/gengshay/KITTI/%s/calib_cam_to_cam.txt'%seqid)
+             fl = ints['K_cam2'][0,0]
+             cx = ints['K_cam2'][0,2]
+             cy = ints['K_cam2'][1,2]
+             bl = ints['b20']-ints['b30']
+
+
+         iml1 = np.asarray(iml1)/255.
+         iml0 = np.asarray(iml0)/255.
+         iml0 = iml0[:,:,::-1].copy()
+         iml1 = iml1[:,:,::-1].copy()
+
+         ## following data augmentation procedure in PWCNet
+         ## https://github.com/lmb-freiburg/flownet2/blob/master/src/caffe/layers/data_augmentation_layer.cu
+         import __main__ # a workaround for "discount_coeff"
+         try:
+             with open('/scratch/gengshay/iter_counts-%d.txt'%int(__main__.args.logname.split('-')[-1]), 'r') as f:
+                 iter_counts = int(f.readline())
+         except:
+             iter_counts = 0
+         schedule = [0.5, 1., 50000.] # initial coeff, final_coeff, half life
+         schedule_coeff = schedule[0] + (schedule[1] - schedule[0]) * \
+                          (2/(1+np.exp(-1.0986*iter_counts/schedule[2])) - 1)
+
+         if self.pca_augmentor:
+             pca_augmentor = flow_transforms.pseudoPCAAug( schedule_coeff=schedule_coeff)
+         else:
+             pca_augmentor = flow_transforms.Scale(1., order=0)
+
+         if np.random.binomial(1,self.prob):
+             co_transform1 = flow_transforms.Compose([
+                 flow_transforms.SpatialAug([th,tw],
+                                            scale=[0.2,0.,0.1],
+                                            rot=[0.4,0.],
+                                            trans=[0.4,0.],
+                                            squeeze=[0.3,0.], schedule_coeff=schedule_coeff, order=self.order),
+             ])
+         else:
+             co_transform1 = flow_transforms.Compose([
+                 flow_transforms.RandomCrop([th,tw]),
+             ])
+
+         co_transform2 = flow_transforms.Compose([
+             flow_transforms.pseudoPCAAug( schedule_coeff=schedule_coeff),
+             #flow_transforms.PCAAug(schedule_coeff=schedule_coeff),
+             flow_transforms.ChromaticAug( schedule_coeff=schedule_coeff, noise=self.noise),
+         ])
+
+         flowl0 = np.concatenate([flowl0,change_size],-1)
+         augmented, flowl0, intr = co_transform1([iml0, iml1], flowl0, [fl,cx,cy,bl])
+         imol0 = augmented[0]
+         imol1 = augmented[1]
+         augmented, flowl0, intr = co_transform2(augmented, flowl0, intr)
+
+         iml0 = augmented[0]
+         iml1 = augmented[1]
+         flowl0 = flowl0.astype(np.float32)
+         change_size = flowl0[:,:,3:]
+         flowl0 = flowl0[:,:,:3]
+
+         # randomly cover a region
+         sx=0;sy=0;cx=0;cy=0
+         if np.random.binomial(1,0.5):
+             sx = int(np.random.uniform(25,100))
+             sy = int(np.random.uniform(25,100))
+             #sx = int(np.random.uniform(50,150))
+             #sy = int(np.random.uniform(50,150))
+             cx = int(np.random.uniform(sx,iml1.shape[0]-sx))
+             cy = int(np.random.uniform(sy,iml1.shape[1]-sy))
+             iml1[cx-sx:cx+sx,cy-sy:cy+sy] = np.mean(np.mean(iml1,0),0)[np.newaxis,np.newaxis]
+
+         iml0 = torch.Tensor(np.transpose(iml0,(2,0,1)))
+         iml1 = torch.Tensor(np.transpose(iml1,(2,0,1)))
+
+         return iml0, iml1, flowl0, change_size, intr, imol0, imol1, np.asarray([cx-sx,cx+sx,cy-sy,cy+sy])
+
+     def __len__(self):
+         return len(self.iml0)
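The `triangulation` helper in this file is standard pinhole back-projection: depth Z = bl*fl/disp, then X = (x - cx)*Z/fl and Y = (y - cy)*Z/fl, stacked as homogeneous 4xN points. A tiny standalone numeric sketch mirroring that math (the stereo parameters are illustrative, not from the repository):
```python
import numpy as np

bl, fl, cx, cy = 0.54, 721.5, 609.6, 172.9   # illustrative KITTI-like stereo values
disp = np.full((2, 2), 50.0)                 # a tiny 2x2 disparity map
x, y = np.meshgrid(np.arange(2.0), np.arange(2.0))

Z = bl * fl / disp                           # depth from stereo disparity
X = (x - cx) * Z / fl
Y = (y - cy) * Z / fl
P = np.concatenate((X[None], Y[None], Z[None]), 0).reshape(3, -1)
P = np.concatenate((P, np.ones((1, P.shape[-1]))), 0)
print(P.shape)                               # (4, 4): one homogeneous point per pixel
```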
expansion/dataloader/flow_transforms.py ADDED
@@ -0,0 +1,440 @@
+ from __future__ import division
+ import torch
+ import random
+ import numpy as np
+ import numbers
+ import types
+ import scipy.ndimage as ndimage
+ import pdb
+ import torchvision
+ import PIL.Image as Image
+ import cv2
+ from torch.nn import functional as F
+
+
+ class Compose(object):
+     """ Composes several co_transforms together.
+     For example:
+     >>> co_transforms.Compose([
+     >>>     co_transforms.CenterCrop(10),
+     >>>     co_transforms.ToTensor(),
+     >>> ])
+     """
+
+     def __init__(self, co_transforms):
+         self.co_transforms = co_transforms
+
+     def __call__(self, input, target):
+         for t in self.co_transforms:
+             input, target = t(input, target)
+         return input, target
+
+
+ class Scale(object):
+     """ Rescales the inputs and target arrays to the given 'size'.
+     'size' will be the size of the smaller edge.
+     For example, if height > width, then image will be
+     rescaled to (size * height / width, size)
+     size: size of the smaller edge
+     interpolation order: Default: 2 (bilinear)
+     """
+
+     def __init__(self, size, order=1):
+         self.ratio = size
+         self.order = order
+         if order==0:
+             self.code=cv2.INTER_NEAREST
+         elif order==1:
+             self.code=cv2.INTER_LINEAR
+         elif order==2:
+             self.code=cv2.INTER_CUBIC
+
+     def __call__(self, inputs, target):
+         if self.ratio==1:
+             return inputs, target
+         h, w, _ = inputs[0].shape
+         ratio = self.ratio
+
+         inputs[0] = cv2.resize(inputs[0], None, fx=ratio,fy=ratio,interpolation=cv2.INTER_LINEAR)
+         inputs[1] = cv2.resize(inputs[1], None, fx=ratio,fy=ratio,interpolation=cv2.INTER_LINEAR)
+         # keep the mask same
+         tmp = cv2.resize(target[:,:,2], None, fx=ratio,fy=ratio,interpolation=cv2.INTER_NEAREST)
+         target = cv2.resize(target, None, fx=ratio,fy=ratio,interpolation=self.code) * ratio
+         target[:,:,2] = tmp
+
+         return inputs, target
+
+
+
+
+ class SpatialAug(object):
+     def __init__(self, crop, scale=None, rot=None, trans=None, squeeze=None, schedule_coeff=1, order=1, black=False):
+         self.crop = crop
+         self.scale = scale
+         self.rot = rot
+         self.trans = trans
+         self.squeeze = squeeze
+         self.t = np.zeros(6)
+         self.schedule_coeff = schedule_coeff
+         self.order = order
+         self.black = black
+
+     def to_identity(self):
+         self.t[0] = 1; self.t[2] = 0; self.t[4] = 0; self.t[1] = 0; self.t[3] = 1; self.t[5] = 0;
+
+     def left_multiply(self, u0, u1, u2, u3, u4, u5):
+         result = np.zeros(6)
+         result[0] = self.t[0]*u0 + self.t[1]*u2;
+         result[1] = self.t[0]*u1 + self.t[1]*u3;
+
+         result[2] = self.t[2]*u0 + self.t[3]*u2;
+         result[3] = self.t[2]*u1 + self.t[3]*u3;
+
+         result[4] = self.t[4]*u0 + self.t[5]*u2 + u4;
+         result[5] = self.t[4]*u1 + self.t[5]*u3 + u5;
+         self.t = result
+
+     def inverse(self):
+         result = np.zeros(6)
+         a = self.t[0]; c = self.t[2]; e = self.t[4];
+         b = self.t[1]; d = self.t[3]; f = self.t[5];
+
+         denom = a*d - b*c;
+
+         result[0] = d / denom;
+         result[1] = -b / denom;
+         result[2] = -c / denom;
+         result[3] = a / denom;
+         result[4] = (c*f-d*e) / denom;
+         result[5] = (b*e-a*f) / denom;
+
+         return result
+
+     def grid_transform(self, meshgrid, t, normalize=True, gridsize=None):
+         if gridsize is None:
+             h, w = meshgrid[0].shape
+         else:
+             h, w = gridsize
+         vgrid = torch.cat([(meshgrid[0] * t[0] + meshgrid[1] * t[2] + t[4])[:,:,np.newaxis],
+                            (meshgrid[0] * t[1] + meshgrid[1] * t[3] + t[5])[:,:,np.newaxis]],-1)
+         if normalize:
+             vgrid[:,:,0] = 2.0*vgrid[:,:,0]/max(w-1,1)-1.0
+             vgrid[:,:,1] = 2.0*vgrid[:,:,1]/max(h-1,1)-1.0
+         return vgrid
+
+
+     def __call__(self, inputs, target):
+         h, w, _ = inputs[0].shape
+         th, tw = self.crop
+         meshgrid = torch.meshgrid([torch.Tensor(range(th)), torch.Tensor(range(tw))])[::-1]
+         cornergrid = torch.meshgrid([torch.Tensor([0,th-1]), torch.Tensor([0,tw-1])])[::-1]
+
+         for i in range(50):
+             # im0
+             self.to_identity()
+             #TODO add mirror
+             if np.random.binomial(1,0.5):
+                 mirror = True
+             else:
+                 mirror = False
+             ##TODO
1
+ from __future__ import division
2
+ import torch
3
+ import random
4
+ import numpy as np
5
+ import numbers
6
+ import types
7
+ import scipy.ndimage as ndimage
8
+ import pdb
9
+ import torchvision
10
+ import PIL.Image as Image
11
+ import cv2
12
+ from torch.nn import functional as F
13
+
14
+
15
+ class Compose(object):
16
+ """ Composes several co_transforms together.
17
+ For example:
18
+ >>> co_transforms.Compose([
19
+ >>> co_transforms.CenterCrop(10),
20
+ >>> co_transforms.ToTensor(),
21
+ >>> ])
22
+ """
23
+
24
+ def __init__(self, co_transforms):
25
+ self.co_transforms = co_transforms
26
+
27
+ def __call__(self, input, target):
28
+ for t in self.co_transforms:
29
+ input,target = t(input,target)
30
+ return input,target
31
+
32
+
33
+ class Scale(object):
34
+ """ Rescales the inputs and target arrays to the given 'size'.
35
+ 'size' will be the size of the smaller edge.
36
+ For example, if height > width, then image will be
37
+ rescaled to (size * height / width, size)
38
+ size: size of the smaller edge
39
+ interpolation order: Default: 2 (bilinear)
40
+ """
41
+
42
+ def __init__(self, size, order=1):
43
+ self.ratio = size
44
+ self.order = order
45
+ if order==0:
46
+ self.code=cv2.INTER_NEAREST
47
+ elif order==1:
48
+ self.code=cv2.INTER_LINEAR
49
+ elif order==2:
50
+ self.code=cv2.INTER_CUBIC
51
+
52
+ def __call__(self, inputs, target):
53
+ if self.ratio==1:
54
+ return inputs, target
55
+ h, w, _ = inputs[0].shape
56
+ ratio = self.ratio
57
+
58
+ inputs[0] = cv2.resize(inputs[0], None, fx=ratio,fy=ratio,interpolation=cv2.INTER_LINEAR)
59
+ inputs[1] = cv2.resize(inputs[1], None, fx=ratio,fy=ratio,interpolation=cv2.INTER_LINEAR)
60
+ # keep the mask same
61
+ tmp = cv2.resize(target[:,:,2], None, fx=ratio,fy=ratio,interpolation=cv2.INTER_NEAREST)
62
+ target = cv2.resize(target, None, fx=ratio,fy=ratio,interpolation=self.code) * ratio
63
+ target[:,:,2] = tmp
64
+
65
+
66
+ return inputs, target
67
+
68
+
69
+
70
+
71
+ class SpatialAug(object):
72
+ def __init__(self, crop, scale=None, rot=None, trans=None, squeeze=None, schedule_coeff=1, order=1, black=False):
73
+ self.crop = crop
74
+ self.scale = scale
75
+ self.rot = rot
76
+ self.trans = trans
77
+ self.squeeze = squeeze
78
+ self.t = np.zeros(6)
79
+ self.schedule_coeff = schedule_coeff
80
+ self.order = order
81
+ self.black = black
82
+
83
+ def to_identity(self):
84
+ self.t[0] = 1; self.t[2] = 0; self.t[4] = 0; self.t[1] = 0; self.t[3] = 1; self.t[5] = 0;
85
+
86
+ def left_multiply(self, u0, u1, u2, u3, u4, u5):
87
+ result = np.zeros(6)
88
+ result[0] = self.t[0]*u0 + self.t[1]*u2;
89
+ result[1] = self.t[0]*u1 + self.t[1]*u3;
90
+
91
+ result[2] = self.t[2]*u0 + self.t[3]*u2;
92
+ result[3] = self.t[2]*u1 + self.t[3]*u3;
93
+
94
+ result[4] = self.t[4]*u0 + self.t[5]*u2 + u4;
95
+ result[5] = self.t[4]*u1 + self.t[5]*u3 + u5;
96
+ self.t = result
97
+
98
+ def inverse(self):
99
+ result = np.zeros(6)
100
+ a = self.t[0]; c = self.t[2]; e = self.t[4];
101
+ b = self.t[1]; d = self.t[3]; f = self.t[5];
102
+
103
+ denom = a*d - b*c;
104
+
105
+ result[0] = d / denom;
106
+ result[1] = -b / denom;
107
+ result[2] = -c / denom;
108
+ result[3] = a / denom;
109
+ result[4] = (c*f-d*e) / denom;
110
+ result[5] = (b*e-a*f) / denom;
111
+
112
+ return result
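+ # self.t stores a 2x3 affine as [a, b, c, d, e, f]: a point (x, y)
+ # maps to (a*x + c*y + e, b*x + d*y + f) (see grid_transform below);
+ # left_multiply composes a second affine after the current one, and
+ # inverse returns the parameters of the inverse mapping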
113
+
114
+ def grid_transform(self, meshgrid, t, normalize=True, gridsize=None):
115
+ if gridsize is None:
116
+ h, w = meshgrid[0].shape
117
+ else:
118
+ h, w = gridsize
119
+ vgrid = torch.cat([(meshgrid[0] * t[0] + meshgrid[1] * t[2] + t[4])[:,:,np.newaxis],
120
+ (meshgrid[0] * t[1] + meshgrid[1] * t[3] + t[5])[:,:,np.newaxis]],-1)
121
+ if normalize:
122
+ vgrid[:,:,0] = 2.0*vgrid[:,:,0]/max(w-1,1)-1.0
123
+ vgrid[:,:,1] = 2.0*vgrid[:,:,1]/max(h-1,1)-1.0
124
+ return vgrid
125
+
126
+
127
+ def __call__(self, inputs, target):
128
+ h, w, _ = inputs[0].shape
129
+ th, tw = self.crop
130
+ meshgrid = torch.meshgrid([torch.Tensor(range(th)), torch.Tensor(range(tw))])[::-1]
131
+ cornergrid = torch.meshgrid([torch.Tensor([0,th-1]), torch.Tensor([0,tw-1])])[::-1]
132
+
133
+ for i in range(50):
134
+ # im0
135
+ self.to_identity()
136
+ #TODO add mirror
137
+ if np.random.binomial(1,0.5):
138
+ mirror = True
139
+ else:
140
+ mirror = False
141
+ ##TODO
142
+ #mirror = False
143
+ if mirror:
144
+ self.left_multiply(-1, 0, 0, 1, .5 * tw, -.5 * th);
145
+ else:
146
+ self.left_multiply(1, 0, 0, 1, -.5 * tw, -.5 * th);
147
+ scale0 = 1; scale1 = 1; squeeze0 = 1; squeeze1 = 1;
148
+ if not self.rot is None:
149
+ rot0 = np.random.uniform(-self.rot[0],+self.rot[0])
150
+ rot1 = np.random.uniform(-self.rot[1]*self.schedule_coeff, self.rot[1]*self.schedule_coeff) + rot0
151
+ self.left_multiply(np.cos(rot0), np.sin(rot0), -np.sin(rot0), np.cos(rot0), 0, 0)
152
+ if not self.trans is None:
153
+ trans0 = np.random.uniform(-self.trans[0],+self.trans[0], 2)
154
+ trans1 = np.random.uniform(-self.trans[1]*self.schedule_coeff,+self.trans[1]*self.schedule_coeff, 2) + trans0
155
+ self.left_multiply(1, 0, 0, 1, trans0[0] * tw, trans0[1] * th)
156
+ if not self.squeeze is None:
157
+ squeeze0 = np.exp(np.random.uniform(-self.squeeze[0], self.squeeze[0]))
158
+ squeeze1 = np.exp(np.random.uniform(-self.squeeze[1]*self.schedule_coeff, self.squeeze[1]*self.schedule_coeff)) * squeeze0
159
+ if not self.scale is None:
160
+ scale0 = np.exp(np.random.uniform(self.scale[2]-self.scale[0], self.scale[2]+self.scale[0]))
161
+ scale1 = np.exp(np.random.uniform(-self.scale[1]*self.schedule_coeff, self.scale[1]*self.schedule_coeff)) * scale0
162
+ self.left_multiply(1.0/(scale0*squeeze0), 0, 0, 1.0/(scale0/squeeze0), 0, 0)
163
+
164
+ self.left_multiply(1, 0, 0, 1, .5 * w, .5 * h);
165
+ transmat0 = self.t.copy()
166
+
167
+ # im1
168
+ self.to_identity()
169
+ if mirror:
170
+ self.left_multiply(-1, 0, 0, 1, .5 * tw, -.5 * th);
171
+ else:
172
+ self.left_multiply(1, 0, 0, 1, -.5 * tw, -.5 * th);
173
+ if not self.rot is None:
174
+ self.left_multiply(np.cos(rot1), np.sin(rot1), -np.sin(rot1), np.cos(rot1), 0, 0)
175
+ if not self.trans is None:
176
+ self.left_multiply(1, 0, 0, 1, trans1[0] * tw, trans1[1] * th)
177
+ self.left_multiply(1.0/(scale1*squeeze1), 0, 0, 1.0/(scale1/squeeze1), 0, 0)
178
+ self.left_multiply(1, 0, 0, 1, .5 * w, .5 * h);
179
+ transmat1 = self.t.copy()
180
+ transmat1_inv = self.inverse()
181
+
182
+ if self.black:
183
+ # black augmentation, allowing 0 values in the input images
184
+ # https://github.com/lmb-freiburg/flownet2/blob/master/src/caffe/layers/black_augmentation_layer.cu
185
+ break
186
+ else:
187
+ if ((self.grid_transform(cornergrid, transmat0, gridsize=[float(h),float(w)]).abs()>1).sum() +\
188
+ (self.grid_transform(cornergrid, transmat1, gridsize=[float(h),float(w)]).abs()>1).sum()) == 0:
189
+ break
190
+ if i==49:
191
+ print('max_iter in augmentation')
192
+ self.to_identity()
193
+ self.left_multiply(1, 0, 0, 1, -.5 * tw, -.5 * th);
194
+ self.left_multiply(1, 0, 0, 1, .5 * w, .5 * h);
195
+ transmat0 = self.t.copy()
196
+ transmat1 = self.t.copy()
197
+
198
+ # do the real work
199
+ vgrid = self.grid_transform(meshgrid, transmat0,gridsize=[float(h),float(w)])
200
+ inputs_0 = F.grid_sample(torch.Tensor(inputs[0]).permute(2,0,1)[np.newaxis], vgrid[np.newaxis])[0].permute(1,2,0)
201
+ if self.order == 0:
202
+ target_0 = F.grid_sample(torch.Tensor(target).permute(2,0,1)[np.newaxis], vgrid[np.newaxis], mode='nearest')[0].permute(1,2,0)
203
+ else:
204
+ target_0 = F.grid_sample(torch.Tensor(target).permute(2,0,1)[np.newaxis], vgrid[np.newaxis])[0].permute(1,2,0)
205
+
206
+ mask_0 = target[:,:,2:3].copy(); mask_0[mask_0==0]=np.nan
207
+ if self.order == 0:
208
+ mask_0 = F.grid_sample(torch.Tensor(mask_0).permute(2,0,1)[np.newaxis], vgrid[np.newaxis], mode='nearest')[0].permute(1,2,0)
209
+ else:
210
+ mask_0 = F.grid_sample(torch.Tensor(mask_0).permute(2,0,1)[np.newaxis], vgrid[np.newaxis])[0].permute(1,2,0)
211
+ mask_0[torch.isnan(mask_0)] = 0
212
+
213
+
214
+ vgrid = self.grid_transform(meshgrid, transmat1,gridsize=[float(h),float(w)])
215
+ inputs_1 = F.grid_sample(torch.Tensor(inputs[1]).permute(2,0,1)[np.newaxis], vgrid[np.newaxis])[0].permute(1,2,0)
216
+
217
+ # flow
218
+ pos = target_0[:,:,:2] + self.grid_transform(meshgrid, transmat0,normalize=False)
219
+ pos = self.grid_transform(pos.permute(2,0,1),transmat1_inv,normalize=False)
220
+ if target_0.shape[2]>=4:
221
+ # scale
222
+ exp = target_0[:,:,3:] * scale1 / scale0
223
+ target = torch.cat([ (pos[:,:,0] - meshgrid[0]).unsqueeze(-1),
224
+ (pos[:,:,1] - meshgrid[1]).unsqueeze(-1),
225
+ mask_0,
226
+ exp], -1)
227
+ else:
228
+ target = torch.cat([ (pos[:,:,0] - meshgrid[0]).unsqueeze(-1),
229
+ (pos[:,:,1] - meshgrid[1]).unsqueeze(-1),
230
+ mask_0], -1)
231
+ # target_0[:,:,2].unsqueeze(-1) ], -1)
232
+ inputs = [np.asarray(inputs_0), np.asarray(inputs_1)]
233
+ target = np.asarray(target)
234
+
235
+ return inputs,target
236
+
237
+
238
+ class pseudoPCAAug(object):
239
+ """
240
+ Chromatic Eigen Augmentation: https://github.com/lmb-freiburg/flownet2/blob/master/src/caffe/layers/data_augmentation_layer.cu
241
+ This version is faster.
242
+ """
243
+ def __init__(self, schedule_coeff=1):
244
+ self.augcolor = torchvision.transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.5, hue=0.5/3.14)
245
+
246
+ def __call__(self, inputs, target):
247
+ inputs[0] = np.asarray(self.augcolor(Image.fromarray(np.uint8(inputs[0]*255))))/255.
248
+ inputs[1] = np.asarray(self.augcolor(Image.fromarray(np.uint8(inputs[1]*255))))/255.
249
+ return inputs,target
250
+
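+ # Note: pseudoPCAAug approximates the FlowNet2 chromatic eigen
+ # augmentation with torchvision's ColorJitter (brightness, contrast,
+ # saturation, hue), which is why it is faster than PCAAug below;
+ # schedule_coeff is accepted for interface compatibility but unused.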
251
+
252
+ class PCAAug(object):
253
+ """
254
+ Chromatic Eigen Augmentation: https://github.com/lmb-freiburg/flownet2/blob/master/src/caffe/layers/data_augmentation_layer.cu
255
+ """
256
+ def __init__(self, lmult_pow =[0.4, 0,-0.2],
257
+ lmult_mult =[0.4, 0,0, ],
258
+ lmult_add =[0.03,0,0, ],
259
+ sat_pow =[0.4, 0,0, ],
260
+ sat_mult =[0.5, 0,-0.3],
261
+ sat_add =[0.03,0,0, ],
262
+ col_pow =[0.4, 0,0, ],
263
+ col_mult =[0.2, 0,0, ],
264
+ col_add =[0.02,0,0, ],
265
+ ladd_pow =[0.4, 0,0, ],
266
+ ladd_mult =[0.4, 0,0, ],
267
+ ladd_add =[0.04,0,0, ],
268
+ col_rotate =[1., 0,0, ],
269
+ schedule_coeff=1):
270
+ # no mean
271
+ self.pow_nomean = [1,1,1]
272
+ self.add_nomean = [0,0,0]
273
+ self.mult_nomean = [1,1,1]
274
+ self.pow_withmean = [1,1,1]
275
+ self.add_withmean = [0,0,0]
276
+ self.mult_withmean = [1,1,1]
277
+ self.lmult_pow = 1
278
+ self.lmult_mult = 1
279
+ self.lmult_add = 0
280
+ self.col_angle = 0
281
+ if not ladd_pow is None:
282
+ self.pow_nomean[0] =np.exp(np.random.normal(ladd_pow[2], ladd_pow[0]))
283
+ if not col_pow is None:
284
+ self.pow_nomean[1] =np.exp(np.random.normal(col_pow[2], col_pow[0]))
285
+ self.pow_nomean[2] =np.exp(np.random.normal(col_pow[2], col_pow[0]))
286
+
287
+ if not ladd_add is None:
288
+ self.add_nomean[0] =np.random.normal(ladd_add[2], ladd_add[0])
289
+ if not col_add is None:
290
+ self.add_nomean[1] =np.random.normal(col_add[2], col_add[0])
291
+ self.add_nomean[2] =np.random.normal(col_add[2], col_add[0])
292
+
293
+ if not ladd_mult is None:
294
+ self.mult_nomean[0] =np.exp(np.random.normal(ladd_mult[2], ladd_mult[0]))
295
+ if not col_mult is None:
296
+ self.mult_nomean[1] =np.exp(np.random.normal(col_mult[2], col_mult[0]))
297
+ self.mult_nomean[2] =np.exp(np.random.normal(col_mult[2], col_mult[0]))
298
+
299
+ # with mean
300
+ if not sat_pow is None:
301
+ self.pow_withmean[1] =np.exp(np.random.uniform(sat_pow[2]-sat_pow[0], sat_pow[2]+sat_pow[0]))
302
+ self.pow_withmean[2] =self.pow_withmean[1]
303
+ if not sat_add is None:
304
+ self.add_withmean[1] =np.random.uniform(sat_add[2]-sat_add[0], sat_add[2]+sat_add[0])
305
+ self.add_withmean[2] =self.add_withmean[1]
306
+ if not sat_mult is None:
307
+ self.mult_withmean[1] = np.exp(np.random.uniform(sat_mult[2]-sat_mult[0], sat_mult[2]+sat_mult[0]))
308
+ self.mult_withmean[2] = self.mult_withmean[1]
309
+
310
+ if not lmult_pow is None:
311
+ self.lmult_pow = np.exp(np.random.uniform(lmult_pow[2]-lmult_pow[0], lmult_pow[2]+lmult_pow[0]))
312
+ if not lmult_mult is None:
313
+ self.lmult_mult= np.exp(np.random.uniform(lmult_mult[2]-lmult_mult[0], lmult_mult[2]+lmult_mult[0]))
314
+ if not lmult_add is None:
315
+ self.lmult_add = np.random.uniform(lmult_add[2]-lmult_add[0], lmult_add[2]+lmult_add[0])
316
+ if not col_rotate is None:
317
+ self.col_angle= np.random.uniform(col_rotate[2]-col_rotate[0], col_rotate[2]+col_rotate[0])
318
+
319
+ # eigen vectors
320
+ self.eigvec = np.reshape([0.51,0.56,0.65,0.79,0.01,-0.62,0.35,-0.83,0.44],[3,3]).transpose()
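+ # fixed RGB eigenvectors of natural-image color statistics, hard-coded
+ # to match the FlowNet2 data_augmentation_layer referenced above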
321
+
322
+
323
+ def __call__(self, inputs, target):
324
+ inputs[0] = self.pca_image(inputs[0])
325
+ inputs[1] = self.pca_image(inputs[1])
326
+ return inputs,target
327
+
328
+ def pca_image(self, rgb):
329
+ eig = np.dot(rgb, self.eigvec)
330
+ max_rgb = np.clip(rgb,0,np.inf).max((0,1))
331
+ min_rgb = rgb.min((0,1))
332
+ mean_rgb = rgb.mean((0,1))
333
+ max_abs_eig = np.abs(eig).max((0,1))
334
+ max_l = np.sqrt(np.sum(max_abs_eig*max_abs_eig))
335
+ mean_eig = np.dot(mean_rgb, self.eigvec)
336
+
337
+ # no-mean stuff
338
+ eig -= mean_eig[np.newaxis, np.newaxis]
339
+
340
+ for c in range(3):
341
+ if max_abs_eig[c] > 1e-2:
342
+ mean_eig[c] /= max_abs_eig[c]
343
+ eig[:,:,c] = eig[:,:,c] / max_abs_eig[c];
344
+ eig[:,:,c] = np.power(np.abs(eig[:,:,c]),self.pow_nomean[c]) *\
345
+ ((eig[:,:,c] > 0) -0.5)*2
346
+ eig[:,:,c] = eig[:,:,c] + self.add_nomean[c]
347
+ eig[:,:,c] = eig[:,:,c] * self.mult_nomean[c]
348
+ eig += mean_eig[np.newaxis,np.newaxis]
349
+
350
+ # withmean stuff
351
+ if max_abs_eig[0] > 1e-2:
352
+ eig[:,:,0] = np.power(np.abs(eig[:,:,0]),self.pow_withmean[0]) * \
353
+ ((eig[:,:,0]>0)-0.5)*2;
354
+ eig[:,:,0] = eig[:,:,0] + self.add_withmean[0];
355
+ eig[:,:,0] = eig[:,:,0] * self.mult_withmean[0];
356
+
357
+ s = np.sqrt(eig[:,:,1]*eig[:,:,1] + eig[:,:,2] * eig[:,:,2])
358
+ smask = s > 1e-2
359
+ s1 = np.power(s, self.pow_withmean[1]);
360
+ s1 = np.clip(s1 + self.add_withmean[1], 0,np.inf)
361
+ s1 = s1 * self.mult_withmean[1]
362
+ s1 = s1 * smask + s*(1-smask)
363
+
364
+ # color angle
365
+ if self.col_angle!=0:
366
+ temp1 = np.cos(self.col_angle) * eig[:,:,1] - np.sin(self.col_angle) * eig[:,:,2]
367
+ temp2 = np.sin(self.col_angle) * eig[:,:,1] + np.cos(self.col_angle) * eig[:,:,2]
368
+ eig[:,:,1] = temp1
369
+ eig[:,:,2] = temp2
370
+
371
+ # to origin magnitude
372
+ for c in range(3):
373
+ if max_abs_eig[c] > 1e-2:
374
+ eig[:,:,c] = eig[:,:,c] * max_abs_eig[c]
375
+
376
+ if max_l > 1e-2:
377
+ l1 = np.sqrt(eig[:,:,0]*eig[:,:,0] + eig[:,:,1]*eig[:,:,1] + eig[:,:,2]*eig[:,:,2])
378
+ l1 = l1 / max_l
379
+
380
+ eig[:,:,1][smask] = (eig[:,:,1] / s * s1)[smask]
381
+ eig[:,:,2][smask] = (eig[:,:,2] / s * s1)[smask]
382
+ #eig[:,:,1] = (eig[:,:,1] / s * s1) * smask + eig[:,:,1] * (1-smask)
383
+ #eig[:,:,2] = (eig[:,:,2] / s * s1) * smask + eig[:,:,2] * (1-smask)
384
+
385
+ if max_l > 1e-2:
386
+ l = np.sqrt(eig[:,:,0]*eig[:,:,0] + eig[:,:,1]*eig[:,:,1] + eig[:,:,2]*eig[:,:,2])
387
+ l1 = np.power(l1, self.lmult_pow)
388
+ l1 = np.clip(l1 + self.lmult_add, 0, np.inf)
389
+ l1 = l1 * self.lmult_mult
390
+ l1 = l1 * max_l
391
+ lmask = l > 1e-2
392
+ eig[lmask] = (eig / l[:,:,np.newaxis] * l1[:,:,np.newaxis])[lmask]
393
+ for c in range(3):
394
+ eig[:,:,c][lmask] = (np.clip(eig[:,:,c], -np.inf, max_abs_eig[c]))[lmask]
395
+ # for c in range(3):
396
+ # # eig[:,:,c][lmask] = (eig[:,:,c] / l * l1)[lmask] * lmask + eig[:,:,c] * (1-lmask)
397
+ # eig[:,:,c][lmask] = (eig[:,:,c] / l * l1)[lmask]
398
+ # eig[:,:,c] = (np.clip(eig[:,:,c], -np.inf, max_abs_eig[c])) * lmask + eig[:,:,c] * (1-lmask)
399
+
400
+ return np.clip(np.dot(eig, self.eigvec.transpose()), 0, 1)
401
+
402
+
403
+ class ChromaticAug(object):
404
+ """
405
+ Chromatic augmentation: https://github.com/lmb-freiburg/flownet2/blob/master/src/caffe/layers/data_augmentation_layer.cu
406
+ """
407
+ def __init__(self, noise = 0.06,
408
+ gamma = 0.02,
409
+ brightness = 0.02,
410
+ contrast = 0.02,
411
+ color = 0.02,
412
+ schedule_coeff=1):
413
+
414
+ self.noise = np.random.uniform(0,noise)
415
+ self.gamma = np.exp(np.random.normal(0, gamma*schedule_coeff))
416
+ self.brightness = np.random.normal(0, brightness*schedule_coeff)
417
+ self.contrast = np.exp(np.random.normal(0, contrast*schedule_coeff))
418
+ self.color = np.exp(np.random.normal(0, color*schedule_coeff,3))
419
+
420
+ def __call__(self, inputs, target):
421
+ inputs[1] = self.chrom_aug(inputs[1])
422
+ # noise
423
+ inputs[0]+=np.random.normal(0, self.noise, inputs[0].shape)
424
+ inputs[1]+=np.random.normal(0, self.noise, inputs[1].shape)
425
+ return inputs,target
426
+
427
+ def chrom_aug(self, rgb):
428
+ # color change
429
+ mean_in = rgb.sum(-1)
430
+ rgb = rgb*self.color[np.newaxis,np.newaxis]
431
+ brightness_coeff = mean_in / (rgb.sum(-1)+0.01)
432
+ rgb = np.clip(rgb*brightness_coeff[:,:,np.newaxis],0,1)
433
+ # gamma
434
+ rgb = np.power(rgb,self.gamma)
435
+ # brightness
436
+ rgb += self.brightness
437
+ # contrast
438
+ rgb = 0.5 + ( rgb-0.5)*self.contrast
439
+ rgb = np.clip(rgb, 0, 1)
440
+ return rgb
expansion/dataloader/hd1klist.py ADDED
@@ -0,0 +1,29 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+ import pdb
8
+
9
+ IMG_EXTENSIONS = [
10
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
11
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
12
+ ]
13
+
14
+
15
+ def is_image_file(filename):
16
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
17
+
18
+ def dataloader(filepath):
19
+
20
+ left_fold = 'image_2/'
21
+ train = [img for img in os.listdir(filepath+left_fold) if img.find('HD1K2018') > -1]
22
+ train = sorted(train)
23
+
24
+ l0_train = [filepath+left_fold+img for img in train]
25
+ l0_train = [img for img in l0_train if '%s_%s.png'%(img.rsplit('_',1)[0],'%04d'%(1+int(img.split('.')[0].split('_')[-1])) ) in l0_train ]
26
+ l1_train = ['%s_%s.png'%(img.rsplit('_',1)[0],'%04d'%(1+int(img.split('.')[0].split('_')[-1])) ) for img in l0_train]
27
+ flow_train = [img.replace('image_2','flow_occ') for img in l0_train]
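+ # the comprehensions above keep only frames whose successor (index+1,
+ # zero-padded to 4 digits) exists, pairing each frame with the next one
+ # and with its flow_occ ground truth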
28
+
29
+ return l0_train, l1_train, flow_train
expansion/dataloader/kitti12list.py ADDED
@@ -0,0 +1,29 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+
8
+ IMG_EXTENSIONS = [
9
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
10
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
11
+ ]
12
+
13
+
14
+ def is_image_file(filename):
15
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
16
+
17
+ def dataloader(filepath):
18
+
19
+ left_fold = 'colored_0/'
20
+ flow_noc = 'flow_occ/'
21
+
22
+ train = [img for img in os.listdir(filepath+left_fold) if img.find('_10') > -1]
23
+
24
+ l0_train = [filepath+left_fold+img for img in train]
25
+ l1_train = [filepath+left_fold+img.replace('_10','_11') for img in train]
26
+ flow_train = [filepath+flow_noc+img for img in train]
27
+
28
+
29
+ return l0_train, l1_train, flow_train
expansion/dataloader/kitti15list.py ADDED
@@ -0,0 +1,29 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+
8
+ IMG_EXTENSIONS = [
9
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
10
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
11
+ ]
12
+
13
+
14
+ def is_image_file(filename):
15
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
16
+
17
+ def dataloader(filepath):
18
+
19
+ left_fold = 'image_2/'
20
+ flow_noc = 'flow_occ/'
21
+
22
+ train = [img for img in os.listdir(filepath+left_fold) if img.find('_10') > -1]
23
+
24
+ l0_train = [filepath+left_fold+img for img in train]
25
+ l1_train = [filepath+left_fold+img.replace('_10','_11') for img in train]
26
+ flow_train = [filepath+flow_noc+img for img in train]
27
+
28
+
29
+ return sorted(l0_train), sorted(l1_train), sorted(flow_train)
expansion/dataloader/kitti15list_train.py ADDED
@@ -0,0 +1,31 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+
8
+ IMG_EXTENSIONS = [
9
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
10
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
11
+ ]
12
+
13
+
14
+ def is_image_file(filename):
15
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
16
+
17
+ def dataloader(filepath):
18
+
19
+ left_fold = 'image_2/'
20
+ flow_noc = 'flow_occ/'
21
+
22
+ train = [img for img in os.listdir(filepath+left_fold) if img.find('_10') > -1]
23
+
24
+ train = [i for i in train if int(i.split('_')[0])%5!=0]
25
+
26
+ l0_train = [filepath+left_fold+img for img in train]
27
+ l1_train = [filepath+left_fold+img.replace('_10','_11') for img in train]
28
+ flow_train = [filepath+flow_noc+img for img in train]
29
+
30
+
31
+ return sorted(l0_train), sorted(l1_train), sorted(flow_train)
expansion/dataloader/kitti15list_train_lidar.py ADDED
@@ -0,0 +1,34 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+
8
+ IMG_EXTENSIONS = [
9
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
10
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
11
+ ]
12
+
13
+
14
+ def is_image_file(filename):
15
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
16
+
17
+ def dataloader(filepath):
18
+
19
+ left_fold = 'image_2/'
20
+ flow_noc = 'flow_occ/'
21
+
22
+ train = [img for img in os.listdir(filepath+left_fold) if img.find('_10') > -1]
23
+
24
+ # train = [i for i in train if int(i.split('_')[0])%5!=0]
25
+ with open('/data/gengshay/kitti_scene/devkit/mapping/train_mapping.txt','r') as f:
26
+ flags = [True if len(i)>1 else False for i in f.readlines()]
27
+ train = [fn for (it,fn) in enumerate(sorted(train)) if flags[it] ][:100]
28
+
29
+ l0_train = [filepath+left_fold+img for img in train]
30
+ l1_train = [filepath+left_fold+img.replace('_10','_11') for img in train]
31
+ flow_train = [filepath+flow_noc+img for img in train]
32
+
33
+
34
+ return sorted(l0_train), sorted(l1_train), sorted(flow_train)
expansion/dataloader/kitti15list_val.py ADDED
@@ -0,0 +1,31 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+
8
+ IMG_EXTENSIONS = [
9
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
10
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
11
+ ]
12
+
13
+
14
+ def is_image_file(filename):
15
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
16
+
17
+ def dataloader(filepath):
18
+
19
+ left_fold = 'image_2/'
20
+ flow_noc = 'flow_occ/'
21
+
22
+ train = [img for img in os.listdir(filepath+left_fold) if img.find('_10') > -1]
23
+
24
+ train = [i for i in train if int(i.split('_')[0])%5==0]
25
+
26
+ l0_train = [filepath+left_fold+img for img in train]
27
+ l1_train = [filepath+left_fold+img.replace('_10','_11') for img in train]
28
+ flow_train = [filepath+flow_noc+img for img in train]
29
+
30
+
31
+ return sorted(l0_train), sorted(l1_train), sorted(flow_train)
expansion/dataloader/kitti15list_val_lidar.py ADDED
@@ -0,0 +1,34 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+
8
+ IMG_EXTENSIONS = [
9
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
10
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
11
+ ]
12
+
13
+
14
+ def is_image_file(filename):
15
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
16
+
17
+ def dataloader(filepath):
18
+
19
+ left_fold = 'image_2/'
20
+ flow_noc = 'flow_occ/'
21
+
22
+ train = [img for img in os.listdir(filepath+left_fold) if img.find('_10') > -1]
23
+
24
+ # train = [i for i in train if int(i.split('_')[0])%5!=0]
25
+ with open('/data/gengshay/kitti_scene/devkit/mapping/train_mapping.txt','r') as f:
26
+ flags = [True if len(i)>1 else False for i in f.readlines()]
27
+ train = [fn for (it,fn) in enumerate(sorted(train)) if flags[it] ][100:]
28
+
29
+ l0_train = [filepath+left_fold+img for img in train]
30
+ l1_train = [filepath+left_fold+img.replace('_10','_11') for img in train]
31
+ flow_train = [filepath+flow_noc+img for img in train]
32
+
33
+
34
+ return sorted(l0_train), sorted(l1_train), sorted(flow_train)
expansion/dataloader/kitti15list_val_mr.py ADDED
@@ -0,0 +1,41 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+
8
+ IMG_EXTENSIONS = [
9
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
10
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
11
+ ]
12
+
13
+
14
+ def is_image_file(filename):
15
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
16
+
17
+ def dataloader(filepath):
18
+
19
+ left_fold = 'image_2/'
20
+ flow_noc = 'flow_occ/'
21
+
22
+ train = [img for img in os.listdir(filepath+left_fold) if 'Kitti' in img and img.find('_10') > -1]
23
+
24
+ # train = [i for i in train if int(i.split('_')[1])%5==0]
25
+ # import pdb; pdb.set_trace() # leftover breakpoint disabled; it would halt the dataloader
26
+ train = sorted([i for i in train if int(i.split('_')[1])%5==0])[0:1]
27
+
28
+ l0_train = [filepath+left_fold+img for img in train]
29
+ l1_train = [filepath+left_fold+img.replace('_10','_11') for img in train]
30
+ flow_train = [filepath+flow_noc+img for img in train]
31
+
32
+ l0_train += [filepath+left_fold+img.replace('_10','_09') for img in train]
33
+ l1_train += [filepath+left_fold+img for img in train]
34
+ flow_train += flow_train
35
+
36
+ tmp = l0_train
37
+ l0_train = l0_train+ [i.replace('rob_flow', 'kitti_scene').replace('Kitti2015_','') for i in l1_train]
38
+ l1_train = l1_train+tmp
39
+ flow_train += flow_train
40
+
41
+ return l0_train, l1_train, flow_train
expansion/dataloader/robloader.py ADDED
@@ -0,0 +1,133 @@
1
+ import os
2
+ import numbers
3
+ import torch
4
+ import torch.utils.data as data
5
+
6
+ import torchvision.transforms as transforms
7
+ import random
8
+ from PIL import Image, ImageOps
9
+ import numpy as np
10
+ import torchvision
11
+ from . import flow_transforms
12
+ import pdb
13
+ import cv2
14
+ from utils.flowlib import read_flow
15
+ from utils.util_flow import readPFM
16
+
17
+
18
+ def default_loader(path):
19
+ return Image.open(path).convert('RGB')
20
+
21
+ def flow_loader(path):
22
+ if '.pfm' in path:
23
+ data = readPFM(path)[0]
24
+ data[:,:,2] = 1
25
+ return data
26
+ else:
27
+ return read_flow(path)
28
+
29
+
30
+ def disparity_loader(path):
31
+ if '.png' in path:
32
+ data = Image.open(path)
33
+ data = np.ascontiguousarray(data,dtype=np.float32)/256
34
+ return data
35
+ else:
36
+ return readPFM(path)[0]
37
+
38
+ class myImageFloder(data.Dataset):
39
+ def __init__(self, iml0, iml1, flowl0, loader=default_loader, dploader= flow_loader, scale=1.,shape=[320,448], order=1, noise=0.06, pca_augmentor=True, prob = 1., cover=False, black=False, scale_aug=[0.4,0.2]):
40
+ self.iml0 = iml0
41
+ self.iml1 = iml1
42
+ self.flowl0 = flowl0
43
+ self.loader = loader
44
+ self.dploader = dploader
45
+ self.scale=scale
46
+ self.shape=shape
47
+ self.order=order
48
+ self.noise = noise
49
+ self.pca_augmentor = pca_augmentor
50
+ self.prob = prob
51
+ self.cover = cover
52
+ self.black = black
53
+ self.scale_aug = scale_aug
54
+
55
+ def __getitem__(self, index):
56
+ iml0 = self.iml0[index]
57
+ iml1 = self.iml1[index]
58
+ flowl0= self.flowl0[index]
59
+ th, tw = self.shape
60
+
61
+ iml0 = self.loader(iml0)
62
+ iml1 = self.loader(iml1)
63
+ iml1 = np.asarray(iml1)/255.
64
+ iml0 = np.asarray(iml0)/255.
65
+ iml0 = iml0[:,:,::-1].copy()
66
+ iml1 = iml1[:,:,::-1].copy()
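+ # reverse the channel order (RGB -> BGR), presumably to match the cv2
+ # conventions used by the transforms below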
67
+ flowl0 = self.dploader(flowl0)
68
+ #flowl0[:,:,-1][flowl0[:,:,0]==np.inf]=0 # for gtav window pfm files
69
+ #flowl0[:,:,0][~flowl0[:,:,2].astype(bool)]=0
70
+ #flowl0[:,:,1][~flowl0[:,:,2].astype(bool)]=0 # avoid nan in grad
71
+ flowl0 = np.ascontiguousarray(flowl0,dtype=np.float32)
72
+ flowl0[np.isnan(flowl0)] = 1e6 # set to max
73
+
74
+ ## following data augmentation procedure in PWCNet
75
+ ## https://github.com/lmb-freiburg/flownet2/blob/master/src/caffe/layers/data_augmentation_layer.cu
76
+ import __main__ # a workaround for "discount_coeff"
77
+ try:
78
+ with open('iter_counts-%d.txt'%int(__main__.args.logname.split('-')[-1]), 'r') as f:
79
+ iter_counts = int(f.readline())
80
+ except:
81
+ iter_counts = 0
82
+ schedule = [0.5, 1., 50000.] # initial coeff, final_coeff, half life
83
+ schedule_coeff = schedule[0] + (schedule[1] - schedule[0]) * \
84
+ (2/(1+np.exp(-1.0986*iter_counts/schedule[2])) - 1)
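+ # since 1.0986 ~= ln(3), at iter_counts == 50000 (the half life) the
+ # bracketed term is 2/(1+1/3)-1 = 0.5, giving schedule_coeff = 0.75,
+ # halfway between the initial (0.5) and final (1.0) coefficients; the
+ # coefficient approaches 1.0 as training proceeds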
85
+
86
+ if self.pca_augmentor:
87
+ pca_augmentor = flow_transforms.pseudoPCAAug( schedule_coeff=schedule_coeff)
88
+ else:
89
+ pca_augmentor = flow_transforms.Scale(1., order=0)
90
+
91
+ if np.random.binomial(1,self.prob):
92
+ co_transform = flow_transforms.Compose([
93
+ flow_transforms.Scale(self.scale, order=self.order),
94
+ #flow_transforms.SpatialAug([th,tw], trans=[0.2,0.03], order=self.order, black=self.black),
95
+ flow_transforms.SpatialAug([th,tw],scale=[self.scale_aug[0],0.03,self.scale_aug[1]],
96
+ rot=[0.4,0.03],
97
+ trans=[0.4,0.03],
98
+ squeeze=[0.3,0.], schedule_coeff=schedule_coeff, order=self.order, black=self.black),
99
+ #flow_transforms.pseudoPCAAug(schedule_coeff=schedule_coeff),
100
+ flow_transforms.PCAAug(schedule_coeff=schedule_coeff),
101
+ flow_transforms.ChromaticAug( schedule_coeff=schedule_coeff, noise=self.noise),
102
+ ])
103
+ else:
104
+ co_transform = flow_transforms.Compose([
105
+ flow_transforms.Scale(self.scale, order=self.order),
106
+ flow_transforms.SpatialAug([th,tw], trans=[0.4,0.03], order=self.order, black=self.black)
107
+ ])
108
+
109
+ augmented,flowl0 = co_transform([iml0, iml1], flowl0)
110
+ iml0 = augmented[0]
111
+ iml1 = augmented[1]
112
+
113
+ if self.cover:
114
+ ## randomly cover a region
115
+ # following sec. 3.2 of http://openaccess.thecvf.com/content_CVPR_2019/html/Yang_Hierarchical_Deep_Stereo_Matching_on_High-Resolution_Images_CVPR_2019_paper.html
116
+ if np.random.binomial(1,0.5):
117
+ #sx = int(np.random.uniform(25,100))
118
+ #sy = int(np.random.uniform(25,100))
119
+ sx = int(np.random.uniform(50,125))
120
+ sy = int(np.random.uniform(50,125))
121
+ #sx = int(np.random.uniform(50,150))
122
+ #sy = int(np.random.uniform(50,150))
123
+ cx = int(np.random.uniform(sx,iml1.shape[0]-sx))
124
+ cy = int(np.random.uniform(sy,iml1.shape[1]-sy))
125
+ iml1[cx-sx:cx+sx,cy-sy:cy+sy] = np.mean(np.mean(iml1,0),0)[np.newaxis,np.newaxis]
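+ # the covered patch is filled with the image's mean color, simulating
+ # an occluder (sec. 3.2 of the HSM paper linked above)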
126
+
127
+ iml0 = torch.Tensor(np.transpose(iml0,(2,0,1)))
128
+ iml1 = torch.Tensor(np.transpose(iml1,(2,0,1)))
129
+
130
+ return iml0, iml1, flowl0
131
+
132
+ def __len__(self):
133
+ return len(self.iml0)
expansion/dataloader/sceneflowlist.py ADDED
@@ -0,0 +1,51 @@
1
+ import os
2
+ import os.path
3
+ import glob
4
+
5
+ def dataloader(filepath, level=6):
6
+ iml0 = []
7
+ iml1 = []
8
+ flowl0 = []
9
+ disp0 = []
10
+ dispc = []
11
+ calib = []
12
+ level_stars = '/*'*level
13
+ candidate_pool = glob.glob('%s/optical_flow%s'%(filepath,level_stars))
14
+ for flow_path in sorted(candidate_pool):
15
+ if 'TEST' in flow_path: continue
16
+ if 'flower_storm_x2/into_future/right/OpticalFlowIntoFuture_0023_R.pfm' in flow_path:
17
+ continue
18
+ if 'flower_storm_x2/into_future/left/OpticalFlowIntoFuture_0023_L.pfm' in flow_path:
19
+ continue
20
+ if 'flower_storm_augmented0_x2/into_future/right/OpticalFlowIntoFuture_0023_R.pfm' in flow_path:
21
+ continue
22
+ if 'flower_storm_augmented0_x2/into_future/left/OpticalFlowIntoFuture_0023_L.pfm' in flow_path:
23
+ continue
24
+ if 'FlyingThings' in flow_path and '_0014_' in flow_path:
25
+ continue
26
+ if 'FlyingThings' in flow_path and '_0015_' in flow_path:
27
+ continue
28
+ idd = flow_path.split('/')[-1].split('_')[-2]
29
+ if 'into_future' in flow_path:
30
+ idd_p1 = '%04d'%(int(idd)+1)
31
+ else:
32
+ idd_p1 = '%04d'%(int(idd)-1)
33
+ if os.path.exists(flow_path.replace(idd,idd_p1)):
34
+ d0_path = flow_path.replace('/into_future/','/').replace('/into_past/','/').replace('optical_flow','disparity')
35
+ d0_path = '%s/%s.pfm'%(d0_path.rsplit('/',1)[0],idd)
36
+ dc_path = flow_path.replace('optical_flow','disparity_change')
37
+ dc_path = '%s/%s.pfm'%(dc_path.rsplit('/',1)[0],idd)
38
+ im_path = flow_path.replace('/into_future/','/').replace('/into_past/','/').replace('optical_flow','frames_cleanpass')
39
+ im0_path = '%s/%s.png'%(im_path.rsplit('/',1)[0],idd)
40
+ im1_path = '%s/%s.png'%(im_path.rsplit('/',1)[0],idd_p1)
41
+ #with open('%s/camera_data.txt'%(im0_path.replace('frames_cleanpass','camera_data').rsplit('/',2)[0]),'r') as f:
42
+ # if 'FlyingThings' in flow_path and len(f.readlines())!=40:
43
+ # print(flow_path)
44
+ # continue
45
+ iml0.append(im0_path)
46
+ iml1.append(im1_path)
47
+ flowl0.append(flow_path)
48
+ disp0.append(d0_path)
49
+ dispc.append(dc_path)
50
+ calib.append('%s/camera_data.txt'%(im0_path.replace('frames_cleanpass','camera_data').rsplit('/',2)[0]))
51
+ return iml0, iml1, flowl0, disp0, dispc, calib
expansion/dataloader/seqlist.py ADDED
@@ -0,0 +1,26 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+ import glob
8
+
9
+ IMG_EXTENSIONS = [
10
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
11
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
12
+ ]
13
+
14
+
15
+ def is_image_file(filename):
16
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
17
+
18
+ def dataloader(filepath):
19
+
20
+ train = [img for img in sorted(glob.glob('%s/*'%filepath))]
21
+
22
+ l0_train = train[:-1]
23
+ l1_train = train[1:]
24
+
25
+
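+ # raw sequences have no ground-truth flow, so the l0 paths are returned
+ # again as placeholders in the flow slot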
26
+ return sorted(l0_train), sorted(l1_train), sorted(l0_train)
expansion/dataloader/sintellist.py ADDED
@@ -0,0 +1,32 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+ import pdb
8
+
9
+ IMG_EXTENSIONS = [
10
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
11
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
12
+ ]
13
+
14
+
15
+ def is_image_file(filename):
16
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
17
+
18
+ def dataloader(filepath):
19
+
20
+ left_fold = 'image_2/'
21
+ train = [img for img in os.listdir(filepath+left_fold) if img.find('Sintel') > -1]
22
+
23
+ l0_train = [filepath+left_fold+img for img in train]
24
+ l0_train = [img for img in l0_train if '%s_%s.png'%(img.rsplit('_',1)[0],'%02d'%(1+int(img.split('.')[0].split('_')[-1])) ) in l0_train ]
25
+
26
+ #l0_train = [i for i in l0_train if not '10.png' in i] # remove 10 as val
27
+
28
+ l1_train = ['%s_%s.png'%(img.rsplit('_',1)[0],'%02d'%(1+int(img.split('.')[0].split('_')[-1])) ) for img in l0_train]
29
+ flow_train = [img.replace('image_2','flow_occ') for img in l0_train]
30
+
31
+
32
+ return l0_train, l1_train, flow_train
expansion/dataloader/sintellist_clean.py ADDED
@@ -0,0 +1,31 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+ import pdb
8
+
9
+ IMG_EXTENSIONS = [
10
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
11
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
12
+ ]
13
+
14
+
15
+ def is_image_file(filename):
16
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
17
+
18
+ def dataloader(filepath):
19
+
20
+ left_fold = 'image_2/'
21
+ train = [img for img in os.listdir(filepath+left_fold) if img.find('Sintel_clean') > -1]
22
+
23
+ l0_train = [filepath+left_fold+img for img in train]
24
+ l0_train = [img for img in l0_train if '%s_%s.png'%(img.rsplit('_',1)[0],'%02d'%(1+int(img.split('.')[0].split('_')[-1])) ) in l0_train ]
25
+
26
+ #l0_train = [i for i in l0_train if not '10.png' in i] # remove 10 as val
27
+
28
+ l1_train = ['%s_%s.png'%(img.rsplit('_',1)[0],'%02d'%(1+int(img.split('.')[0].split('_')[-1])) ) for img in l0_train]
29
+ flow_train = [img.replace('image_2','flow_occ') for img in l0_train]
30
+
31
+ return l0_train, l1_train, flow_train
expansion/dataloader/sintellist_final.py ADDED
@@ -0,0 +1,32 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+ import pdb
8
+
9
+ IMG_EXTENSIONS = [
10
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
11
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
12
+ ]
13
+
14
+
15
+ def is_image_file(filename):
16
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
17
+
18
+ def dataloader(filepath):
19
+
20
+ left_fold = 'image_2/'
21
+ train = [img for img in os.listdir(filepath+left_fold) if img.find('Sintel_final') > -1]
22
+
23
+ l0_train = [filepath+left_fold+img for img in train]
24
+ l0_train = [img for img in l0_train if '%s_%s.png'%(img.rsplit('_',1)[0],'%02d'%(1+int(img.split('.')[0].split('_')[-1])) ) in l0_train ]
25
+
26
+ #l0_train = [i for i in l0_train if not '10.png' in i] # remove 10 as val
27
+
28
+ l1_train = ['%s_%s.png'%(img.rsplit('_',1)[0],'%02d'%(1+int(img.split('.')[0].split('_')[-1])) ) for img in l0_train]
29
+ flow_train = [img.replace('image_2','flow_occ') for img in l0_train]
30
+
31
+ # pdb.set_trace() # leftover breakpoint disabled; it would halt the dataloader
32
+ return l0_train, l1_train, flow_train
expansion/dataloader/sintellist_train.py ADDED
@@ -0,0 +1,32 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+ import pdb
8
+
9
+ IMG_EXTENSIONS = [
10
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
11
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
12
+ ]
13
+
14
+
15
+ def is_image_file(filename):
16
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
17
+
18
+ def dataloader(filepath):
19
+
20
+ left_fold = 'image_2/'
21
+ train = [img for img in os.listdir(filepath+left_fold) if img.find('Sintel') > -1]
22
+
23
+ l0_train = [filepath+left_fold+img for img in train]
24
+ l0_train = [img for img in l0_train if '%s_%s.png'%(img.rsplit('_',1)[0],'%02d'%(1+int(img.split('.')[0].split('_')[-1])) ) in l0_train ]
25
+
26
+ l0_train = [i for i in l0_train if not(('_2_' in i) and ('alley' not in i) and ('bandage' not in i) and ('sleeping' not in i))] # training split: drop the '_2_' scenes (except alley/bandage/sleeping) held out for validation
27
+
28
+ l1_train = ['%s_%s.png'%(img.rsplit('_',1)[0],'%02d'%(1+int(img.split('.')[0].split('_')[-1])) ) for img in l0_train]
29
+ flow_train = [img.replace('image_2','flow_occ') for img in l0_train]
30
+
31
+
32
+ return l0_train, l1_train, flow_train
expansion/dataloader/sintellist_val.py ADDED
@@ -0,0 +1,34 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+ import pdb
8
+
9
+ IMG_EXTENSIONS = [
10
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
11
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
12
+ ]
13
+
14
+
15
+ def is_image_file(filename):
16
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
17
+
18
+ def dataloader(filepath):
19
+
20
+ left_fold = 'image_2/'
21
+ train = [img for img in os.listdir(filepath+left_fold) if img.find('Sintel') > -1]
22
+
23
+ l0_train = [filepath+left_fold+img for img in train]
24
+ l0_train = [img for img in l0_train if '%s_%s.png'%(img.rsplit('_',1)[0],'%02d'%(1+int(img.split('.')[0].split('_')[-1])) ) in l0_train ]
25
+
26
+ l0_train = [i for i in l0_train if ('_2_' in i) and ('alley' not in i) and ('bandage' not in i) and ('sleeping' not in i)] # remove 10 as val
27
+ #l0_train = [i for i in l0_train if not(('_2_' in i) and ('alley' not in i) and ('bandage' not in i) and ('sleeping' not in i))] # remove 10 as val
28
+
29
+ l1_train = ['%s_%s.png'%(img.rsplit('_',1)[0],'%02d'%(1+int(img.split('.')[0].split('_')[-1])) ) for img in l0_train]
30
+ flow_train = [img.replace('image_2','flow_occ') for img in l0_train]
31
+
32
+
33
+ return sorted(l0_train)[::3], sorted(l1_train)[::3], sorted(flow_train)[::3]
34
+ # return sorted(l0_train)[::10], sorted(l1_train)[::10], sorted(flow_train)[::10]
expansion/dataloader/thingslist.py ADDED
@@ -0,0 +1,122 @@
1
+ import torch.utils.data as data
2
+
3
+ from PIL import Image
4
+ import os
5
+ import os.path
6
+ import numpy as np
7
+
8
+ IMG_EXTENSIONS = [
9
+ '.jpg', '.JPG', '.jpeg', '.JPEG',
10
+ '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
11
+ ]
12
+
13
+
14
+ def is_image_file(filename):
15
+ return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
16
+
17
+ def dataloader(filepath):
18
+ exc_list = [
19
+ '0004117.flo',
20
+ '0003149.flo',
21
+ '0001203.flo',
22
+ '0003147.flo',
23
+ '0003666.flo',
24
+ '0006337.flo',
25
+ '0006336.flo',
26
+ '0007126.flo',
27
+ '0004118.flo',
28
+ ]
29
+
30
+ left_fold = 'image_clean/left/'
31
+ flow_noc = 'flow/left/into_future/'
32
+ train = [img for img in os.listdir(filepath+flow_noc) if np.sum([(k in img) for k in exc_list])==0]
33
+
34
+ l0_trainlf = [filepath+left_fold+img.replace('flo','png') for img in train]
35
+ l1_trainlf = ['%s/%s.png'%(img.rsplit('/',1)[0],'%07d'%(1+int(img.split('.')[0].split('/')[-1])) ) for img in l0_trainlf]
36
+ flow_trainlf = [filepath+flow_noc+img for img in train]
37
+
38
+
39
+ exc_list = [
40
+ '0003148.flo',
41
+ '0004117.flo',
42
+ '0002890.flo',
43
+ '0003149.flo',
44
+ '0001203.flo',
45
+ '0003666.flo',
46
+ '0006337.flo',
47
+ '0006336.flo',
48
+ '0004118.flo',
49
+ ]
50
+
51
+ left_fold = 'image_clean/right/'
52
+ flow_noc = 'flow/right/into_future/'
53
+ train = [img for img in os.listdir(filepath+flow_noc) if np.sum([(k in img) for k in exc_list])==0]
54
+
55
+ l0_trainrf = [filepath+left_fold+img.replace('flo','png') for img in train]
56
+ l1_trainrf = ['%s/%s.png'%(img.rsplit('/',1)[0],'%07d'%(1+int(img.split('.')[0].split('/')[-1])) ) for img in l0_trainrf]
57
+ flow_trainrf = [filepath+flow_noc+img for img in train]
58
+
59
+
60
+ exc_list = [
61
+ '0004237.flo',
62
+ '0004705.flo',
63
+ '0004045.flo',
64
+ '0004346.flo',
65
+ '0000161.flo',
66
+ '0000931.flo',
67
+ '0000121.flo',
68
+ '0010822.flo',
69
+ '0004117.flo',
70
+ '0006023.flo',
71
+ '0005034.flo',
72
+ '0005054.flo',
73
+ '0000162.flo',
74
+ '0000053.flo',
75
+ '0005055.flo',
76
+ '0003147.flo',
77
+ '0004876.flo',
78
+ '0000163.flo',
79
+ '0006878.flo',
80
+ ]
81
+
82
+ left_fold = 'image_clean/left/'
83
+ flow_noc = 'flow/left/into_past/'
84
+ train = [img for img in os.listdir(filepath+flow_noc) if np.sum([(k in img) for k in exc_list])==0]
85
+
86
+ l0_trainlp = [filepath+left_fold+img.replace('flo','png') for img in train]
87
+ l1_trainlp = ['%s/%s.png'%(img.rsplit('/',1)[0],'%07d'%(-1+int(img.split('.')[0].split('/')[-1])) ) for img in l0_trainlp]
88
+ flow_trainlp = [filepath+flow_noc+img for img in train]
89
+
90
+ exc_list = [
91
+ '0003148.flo',
92
+ '0004705.flo',
93
+ '0000161.flo',
94
+ '0000121.flo',
95
+ '0004117.flo',
96
+ '0000160.flo',
97
+ '0005034.flo',
98
+ '0005054.flo',
99
+ '0000162.flo',
100
+ '0000053.flo',
101
+ '0005055.flo',
102
+ '0003147.flo',
103
+ '0001549.flo',
104
+ '0000163.flo',
105
+ '0006336.flo',
106
+ '0001648.flo',
107
+ '0006878.flo',
108
+ ]
109
+
110
+ left_fold = 'image_clean/right/'
111
+ flow_noc = 'flow/right/into_past/'
112
+ train = [img for img in os.listdir(filepath+flow_noc) if np.sum([(k in img) for k in exc_list])==0]
113
+
114
+ l0_trainrp = [filepath+left_fold+img.replace('flo','png') for img in train]
115
+ l1_trainrp = ['%s/%s.png'%(img.rsplit('/',1)[0],'%07d'%(-1+int(img.split('.')[0].split('/')[-1])) ) for img in l0_trainrp]
116
+ flow_trainrp = [filepath+flow_noc+img for img in train]
117
+
118
+
119
+ l0_train = l0_trainlf + l0_trainrf + l0_trainlp + l0_trainrp
120
+ l1_train = l1_trainlf + l1_trainrf + l1_trainlp + l1_trainrp
121
+ flow_train = flow_trainlf + flow_trainrf + flow_trainlp + flow_trainrp
122
+ return l0_train, l1_train, flow_train
expansion/models/VCN_exp.py ADDED
@@ -0,0 +1,561 @@
1
+ import torch
2
+ import torch.nn as nn
3
+ import torch.nn.functional as F
4
+ from torch.autograd import Variable
5
+ import os
6
+ os.environ['PYTHON_EGG_CACHE'] = 'tmp/' # a writable directory
7
+ import numpy as np
8
+ import math
9
+ import pdb
10
+ import time
11
+
12
+ from .submodule import pspnet, bfmodule, conv
13
+ from .conv4d import sepConv4d, sepConv4dBlock, butterfly4D
14
+
15
+ class flow_reg(nn.Module):
16
+ """
17
+ Soft winner-take-all that selects the most likely displacement.
18
+ Set ent=True to enable entropy output.
19
+ Set maxdisp to adjust maximum allowed displacement towards one side.
20
+ maxdisp=4 searches for a 9x9 region.
21
+ Set fac to squeeze search window.
22
+ maxdisp=4 and fac=2 gives search window of 9x5
23
+ """
24
+ def __init__(self, size, ent=False, maxdisp = int(4), fac=1):
25
+ B,W,H = size
26
+ super(flow_reg, self).__init__()
27
+ self.ent = ent
28
+ self.md = maxdisp
29
+ self.fac = fac
30
+ self.truncated = True
31
+ self.wsize = 3 # by default using truncation 7x7
32
+
33
+ flowrangey = range(-maxdisp,maxdisp+1)
34
+ flowrangex = range(-int(maxdisp//self.fac),int(maxdisp//self.fac)+1)
35
+ meshgrid = np.meshgrid(flowrangex,flowrangey)
36
+ flowy = np.tile( np.reshape(meshgrid[0],[1,2*maxdisp+1,2*int(maxdisp//self.fac)+1,1,1]), (B,1,1,H,W) )
37
+ flowx = np.tile( np.reshape(meshgrid[1],[1,2*maxdisp+1,2*int(maxdisp//self.fac)+1,1,1]), (B,1,1,H,W) )
38
+ self.register_buffer('flowx',torch.Tensor(flowx))
39
+ self.register_buffer('flowy',torch.Tensor(flowy))
40
+
41
+ self.pool3d = nn.MaxPool3d((self.wsize*2+1,self.wsize*2+1,1),stride=1,padding=(self.wsize,self.wsize,0))
42
+
43
+ def forward(self, x):
44
+ b,u,v,h,w = x.shape
45
+ oldx = x
46
+
47
+ if self.truncated:
48
+ # truncated softmax
49
+ x = x.view(b,u*v,h,w)
50
+
51
+ idx = x.argmax(1)[:,np.newaxis]
52
+ if x.is_cuda:
53
+ mask = Variable(torch.cuda.HalfTensor(b,u*v,h,w)).fill_(0)
54
+ else:
55
+ mask = Variable(torch.FloatTensor(b,u*v,h,w)).fill_(0)
56
+ mask.scatter_(1,idx,1)
57
+ mask = mask.view(b,1,u,v,-1)
58
+ mask = self.pool3d(mask)[:,0].view(b,u,v,h,w)
59
+
60
+ ninf = x.clone().fill_(-np.inf).view(b,u,v,h,w)
61
+ x = torch.where(mask.byte(),oldx,ninf)
62
+ else:
63
+ self.wsize = (np.sqrt(u*v)-1)/2
64
+
65
+ b,u,v,h,w = x.shape
66
+ x = F.softmax(x.view(b,-1,h,w),1).view(b,u,v,h,w)
67
+ outx = torch.sum(torch.sum(x*self.flowx,1),1,keepdim=True)
68
+ outy = torch.sum(torch.sum(x*self.flowy,1),1,keepdim=True)
69
+
70
+ if self.ent:
71
+ # local
72
+ local_entropy = (-x*torch.clamp(x,1e-9,1-1e-9).log()).sum(1).sum(1)[:,np.newaxis]
73
+ if self.wsize == 0:
74
+ local_entropy[:] = 1.
75
+ else:
76
+ local_entropy /= np.log((self.wsize*2+1)**2)
77
+
78
+ # global
79
+ x = F.softmax(oldx.view(b,-1,h,w),1).view(b,u,v,h,w)
80
+ global_entropy = (-x*torch.clamp(x,1e-9,1-1e-9).log()).sum(1).sum(1)[:,np.newaxis]
81
+ global_entropy /= np.log(x.shape[1]*x.shape[2])
82
+ return torch.cat([outx,outy],1),torch.cat([local_entropy, global_entropy],1)
83
+ else:
84
+ return torch.cat([outx,outy],1),None
85
+
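+ # In short, flow_reg is a soft argmax: each pixel's (u, v) cost slice
+ # is softmax-normalized and the output flow is the expectation
+ # sum_{u,v} p(u,v) * (dx, dy). The truncated branch keeps only a
+ # (2*wsize+1)^2 window around the hard argmax before the softmax to
+ # sharpen the distribution; the local/global entropies are normalized
+ # to [0, 1] by the log of the number of candidates considered.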
86
+
87
+ class WarpModule(nn.Module):
88
+ """
89
+ taken from https://github.com/NVlabs/PWC-Net/blob/master/PyTorch/models/PWCNet.py
90
+ """
91
+ def __init__(self, size):
92
+ super(WarpModule, self).__init__()
93
+ B,W,H = size
94
+ # mesh grid
95
+ xx = torch.arange(0, W).view(1,-1).repeat(H,1)
96
+ yy = torch.arange(0, H).view(-1,1).repeat(1,W)
97
+ xx = xx.view(1,1,H,W).repeat(B,1,1,1)
98
+ yy = yy.view(1,1,H,W).repeat(B,1,1,1)
99
+ self.register_buffer('grid',torch.cat((xx,yy),1).float())
100
+
101
+ def forward(self, x, flo):
102
+ """
103
+ warp an image/tensor (im2) back to im1, according to the optical flow
104
+
105
+ x: [B, C, H, W] (im2)
106
+ flo: [B, 2, H, W] flow
107
+
108
+ """
109
+ B, C, H, W = x.size()
110
+ vgrid = self.grid + flo
111
+
112
+ # scale grid to [-1,1]
113
+ vgrid[:,0,:,:] = 2.0*vgrid[:,0,:,:]/max(W-1,1)-1.0
114
+ vgrid[:,1,:,:] = 2.0*vgrid[:,1,:,:]/max(H-1,1)-1.0
115
+
116
+ vgrid = vgrid.permute(0,2,3,1)
117
+ output = nn.functional.grid_sample(x, vgrid,align_corners=True)
118
+ mask = ((vgrid[:,:,:,0].abs()<1) * (vgrid[:,:,:,1].abs()<1)) >0
119
+ return output*mask.unsqueeze(1).float(), mask
120
+
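+ # usage sketch (shapes assumed): warp = WarpModule([B, W, H]);
+ # warped, valid = warp(im2, flow) resamples im2 at grid + flow, and
+ # valid is True wherever the sampling location stays inside the image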
121
+
122
+ def get_grid(B,H,W):
123
+ meshgrid_base = np.meshgrid(range(0,W), range(0,H))[::-1]
124
+ basey = np.reshape(meshgrid_base[0],[1,1,1,H,W])
125
+ basex = np.reshape(meshgrid_base[1],[1,1,1,H,W])
126
+ grid = torch.tensor(np.concatenate((basex.reshape((-1,H,W,1)),basey.reshape((-1,H,W,1))),-1)).cuda().float()
127
+ return grid.view(1,1,H,W,2)
128
+
129
+
130
+ class VCN(nn.Module):
131
+ """
132
+ VCN.
133
+ md defines maximum displacement for each level, following a coarse-to-fine-warping scheme
134
+ fac defines squeeze parameter for the coarsest level
135
+ """
136
+ def __init__(self, size, md=[4,4,4,4,4], fac=1.,exp_unc=False): # exp_uncertainty
137
+ super(VCN,self).__init__()
138
+ self.md = md
139
+ self.fac = fac
140
+ use_entropy = True
141
+ withbn = True
142
+
143
+ ## pspnet
144
+ self.pspnet = pspnet(is_proj=False)
145
+
146
+ ### Volumetric-UNet
147
+ fdima1 = 128 # 6/5/4
148
+ fdima2 = 64 # 3/2
149
+ fdimb1 = 16 # 6/5/4/3
150
+ fdimb2 = 12 # 2
151
+
152
+ full=False
153
+ self.f6 = butterfly4D(fdima1, fdimb1,withbn=withbn,full=full)
154
+ self.p6 = sepConv4d(fdimb1,fdimb1, with_bn=False, full=full)
155
+
156
+ self.f5 = butterfly4D(fdima1, fdimb1,withbn=withbn, full=full)
157
+ self.p5 = sepConv4d(fdimb1,fdimb1, with_bn=False,full=full)
158
+
159
+ self.f4 = butterfly4D(fdima1, fdimb1,withbn=withbn,full=full)
160
+ self.p4 = sepConv4d(fdimb1,fdimb1, with_bn=False,full=full)
161
+
162
+ self.f3 = butterfly4D(fdima2, fdimb1,withbn=withbn,full=full)
163
+ self.p3 = sepConv4d(fdimb1,fdimb1, with_bn=False,full=full)
164
+
165
+ full=True
166
+ self.f2 = butterfly4D(fdima2, fdimb2,withbn=withbn,full=full)
167
+ self.p2 = sepConv4d(fdimb2,fdimb2, with_bn=False,full=full)
168
+
169
+ self.flow_reg64 = flow_reg([fdimb1*size[0],size[1]//64,size[2]//64], ent=use_entropy, maxdisp=self.md[0], fac=self.fac)
170
+ self.flow_reg32 = flow_reg([fdimb1*size[0],size[1]//32,size[2]//32], ent=use_entropy, maxdisp=self.md[1])
171
+ self.flow_reg16 = flow_reg([fdimb1*size[0],size[1]//16,size[2]//16], ent=use_entropy, maxdisp=self.md[2])
172
+ self.flow_reg8 = flow_reg([fdimb1*size[0],size[1]//8,size[2]//8] , ent=use_entropy, maxdisp=self.md[3])
173
+ self.flow_reg4 = flow_reg([fdimb2*size[0],size[1]//4,size[2]//4] , ent=use_entropy, maxdisp=self.md[4])
174
+
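+ # one soft-argmax regressor per pyramid level (1/64 ... 1/4 of the
+ # input resolution); only the coarsest level passes fac to squeeze
+ # its search window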
175
+ self.warp5 = WarpModule([size[0],size[1]//32,size[2]//32])
176
+ self.warp4 = WarpModule([size[0],size[1]//16,size[2]//16])
177
+ self.warp3 = WarpModule([size[0],size[1]//8,size[2]//8])
178
+ self.warp2 = WarpModule([size[0],size[1]//4,size[2]//4])
179
+
180
+ ## hypotheses fusion modules, adopted from the refinement module of PWCNet
181
+ # https://github.com/NVlabs/PWC-Net/blob/master/PyTorch/models/PWCNet.py
182
+ # c6
183
+ self.dc6_conv1 = conv(128+4*fdimb1, 128, kernel_size=3, stride=1, padding=1, dilation=1)
184
+ self.dc6_conv2 = conv(128, 128, kernel_size=3, stride=1, padding=2, dilation=2)
185
+ self.dc6_conv3 = conv(128, 128, kernel_size=3, stride=1, padding=4, dilation=4)
186
+ self.dc6_conv4 = conv(128, 96, kernel_size=3, stride=1, padding=8, dilation=8)
187
+ self.dc6_conv5 = conv(96, 64, kernel_size=3, stride=1, padding=16, dilation=16)
188
+ self.dc6_conv6 = conv(64, 32, kernel_size=3, stride=1, padding=1, dilation=1)
189
+ self.dc6_conv7 = nn.Conv2d(32,2*fdimb1,kernel_size=3,stride=1,padding=1,bias=True)
190
+
191
+ # c5
192
+ self.dc5_conv1 = conv(128+4*fdimb1*2, 128, kernel_size=3, stride=1, padding=1, dilation=1)
193
+ self.dc5_conv2 = conv(128, 128, kernel_size=3, stride=1, padding=2, dilation=2)
194
+ self.dc5_conv3 = conv(128, 128, kernel_size=3, stride=1, padding=4, dilation=4)
195
+ self.dc5_conv4 = conv(128, 96, kernel_size=3, stride=1, padding=8, dilation=8)
196
+ self.dc5_conv5 = conv(96, 64, kernel_size=3, stride=1, padding=16, dilation=16)
197
+ self.dc5_conv6 = conv(64, 32, kernel_size=3, stride=1, padding=1, dilation=1)
198
+ self.dc5_conv7 = nn.Conv2d(32,2*fdimb1*2,kernel_size=3,stride=1,padding=1,bias=True)
199
+
200
+ # c4
201
+ self.dc4_conv1 = conv(128+4*fdimb1*3, 128, kernel_size=3, stride=1, padding=1, dilation=1)
202
+ self.dc4_conv2 = conv(128, 128, kernel_size=3, stride=1, padding=2, dilation=2)
203
+ self.dc4_conv3 = conv(128, 128, kernel_size=3, stride=1, padding=4, dilation=4)
204
+ self.dc4_conv4 = conv(128, 96, kernel_size=3, stride=1, padding=8, dilation=8)
205
+ self.dc4_conv5 = conv(96, 64, kernel_size=3, stride=1, padding=16, dilation=16)
206
+ self.dc4_conv6 = conv(64, 32, kernel_size=3, stride=1, padding=1, dilation=1)
207
+ self.dc4_conv7 = nn.Conv2d(32,2*fdimb1*3,kernel_size=3,stride=1,padding=1,bias=True)
208
+
209
+ # c3
210
+ self.dc3_conv1 = conv(64+16*fdimb1, 128, kernel_size=3, stride=1, padding=1, dilation=1)
211
+ self.dc3_conv2 = conv(128, 128, kernel_size=3, stride=1, padding=2, dilation=2)
212
+ self.dc3_conv3 = conv(128, 128, kernel_size=3, stride=1, padding=4, dilation=4)
213
+ self.dc3_conv4 = conv(128, 96, kernel_size=3, stride=1, padding=8, dilation=8)
214
+ self.dc3_conv5 = conv(96, 64, kernel_size=3, stride=1, padding=16, dilation=16)
215
+ self.dc3_conv6 = conv(64, 32, kernel_size=3, stride=1, padding=1, dilation=1)
216
+ self.dc3_conv7 = nn.Conv2d(32,8*fdimb1,kernel_size=3,stride=1,padding=1,bias=True)
217
+
218
+ # c2
219
+ self.dc2_conv1 = conv(64+16*fdimb1+4*fdimb2, 128, kernel_size=3, stride=1, padding=1, dilation=1)
220
+ self.dc2_conv2 = conv(128, 128, kernel_size=3, stride=1, padding=2, dilation=2)
221
+ self.dc2_conv3 = conv(128, 128, kernel_size=3, stride=1, padding=4, dilation=4)
222
+ self.dc2_conv4 = conv(128, 96, kernel_size=3, stride=1, padding=8, dilation=8)
223
+ self.dc2_conv5 = conv(96, 64, kernel_size=3, stride=1, padding=16, dilation=16)
224
+ self.dc2_conv6 = conv(64, 32, kernel_size=3, stride=1, padding=1, dilation=1)
225
+ self.dc2_conv7 = nn.Conv2d(32,4*2*fdimb1 + 2*fdimb2,kernel_size=3,stride=1,padding=1,bias=True)
226
+
227
+ self.dc6_conv = nn.Sequential( self.dc6_conv1,
228
+ self.dc6_conv2,
229
+ self.dc6_conv3,
230
+ self.dc6_conv4,
231
+ self.dc6_conv5,
232
+ self.dc6_conv6,
233
+ self.dc6_conv7)
234
+ self.dc5_conv = nn.Sequential( self.dc5_conv1,
235
+ self.dc5_conv2,
236
+ self.dc5_conv3,
237
+ self.dc5_conv4,
238
+ self.dc5_conv5,
239
+ self.dc5_conv6,
240
+ self.dc5_conv7)
241
+ self.dc4_conv = nn.Sequential( self.dc4_conv1,
242
+ self.dc4_conv2,
243
+ self.dc4_conv3,
244
+ self.dc4_conv4,
245
+ self.dc4_conv5,
246
+ self.dc4_conv6,
247
+ self.dc4_conv7)
248
+ self.dc3_conv = nn.Sequential( self.dc3_conv1,
249
+ self.dc3_conv2,
250
+ self.dc3_conv3,
251
+ self.dc3_conv4,
252
+ self.dc3_conv5,
253
+ self.dc3_conv6,
254
+ self.dc3_conv7)
255
+ self.dc2_conv = nn.Sequential( self.dc2_conv1,
256
+ self.dc2_conv2,
257
+ self.dc2_conv3,
258
+ self.dc2_conv4,
259
+ self.dc2_conv5,
260
+ self.dc2_conv6,
261
+ self.dc2_conv7)
262
+
263
+ ## Out-of-range detection
264
+ self.dc6_convo = nn.Sequential(conv(128+4*fdimb1, 128, kernel_size=3, stride=1, padding=1, dilation=1),
265
+ conv(128, 128, kernel_size=3, stride=1, padding=2, dilation=2),
266
+ conv(128, 128, kernel_size=3, stride=1, padding=4, dilation=4),
267
+ conv(128, 96, kernel_size=3, stride=1, padding=8, dilation=8),
268
+ conv(96, 64, kernel_size=3, stride=1, padding=16, dilation=16),
269
+ conv(64, 32, kernel_size=3, stride=1, padding=1, dilation=1),
270
+ nn.Conv2d(32,1,kernel_size=3,stride=1,padding=1,bias=True))
271
+
272
+ self.dc5_convo = nn.Sequential(conv(128+2*4*fdimb1, 128, kernel_size=3, stride=1, padding=1, dilation=1),
273
+ conv(128, 128, kernel_size=3, stride=1, padding=2, dilation=2),
274
+ conv(128, 128, kernel_size=3, stride=1, padding=4, dilation=4),
275
+ conv(128, 96, kernel_size=3, stride=1, padding=8, dilation=8),
276
+ conv(96, 64, kernel_size=3, stride=1, padding=16, dilation=16),
277
+ conv(64, 32, kernel_size=3, stride=1, padding=1, dilation=1),
278
+ nn.Conv2d(32,1,kernel_size=3,stride=1,padding=1,bias=True))
279
+
280
+ self.dc4_convo = nn.Sequential(conv(128+3*4*fdimb1, 128, kernel_size=3, stride=1, padding=1, dilation=1),
281
+ conv(128, 128, kernel_size=3, stride=1, padding=2, dilation=2),
282
+ conv(128, 128, kernel_size=3, stride=1, padding=4, dilation=4),
283
+ conv(128, 96, kernel_size=3, stride=1, padding=8, dilation=8),
284
+ conv(96, 64, kernel_size=3, stride=1, padding=16, dilation=16),
285
+ conv(64, 32, kernel_size=3, stride=1, padding=1, dilation=1),
286
+ nn.Conv2d(32,1,kernel_size=3,stride=1,padding=1,bias=True))
287
+
288
+ self.dc3_convo = nn.Sequential(conv(64+16*fdimb1, 128, kernel_size=3, stride=1, padding=1, dilation=1),
289
+ conv(128, 128, kernel_size=3, stride=1, padding=2, dilation=2),
290
+ conv(128, 128, kernel_size=3, stride=1, padding=4, dilation=4),
291
+ conv(128, 96, kernel_size=3, stride=1, padding=8, dilation=8),
292
+ conv(96, 64, kernel_size=3, stride=1, padding=16, dilation=16),
293
+ conv(64, 32, kernel_size=3, stride=1, padding=1, dilation=1),
294
+ nn.Conv2d(32,1,kernel_size=3,stride=1,padding=1,bias=True))
295
+
296
+ self.dc2_convo = nn.Sequential(conv(64+16*fdimb1+4*fdimb2, 128, kernel_size=3, stride=1, padding=1, dilation=1),
297
+ conv(128, 128, kernel_size=3, stride=1, padding=2, dilation=2),
298
+ conv(128, 128, kernel_size=3, stride=1, padding=4, dilation=4),
299
+ conv(128, 96, kernel_size=3, stride=1, padding=8, dilation=8),
300
+ conv(96, 64, kernel_size=3, stride=1, padding=16, dilation=16),
301
+ conv(64, 32, kernel_size=3, stride=1, padding=1, dilation=1),
302
+ nn.Conv2d(32,1,kernel_size=3,stride=1,padding=1,bias=True))
303
+
304
+ # affine-exp
305
+ self.f3d2v1 = conv(64, 32, kernel_size=3, stride=1, padding=1,dilation=1) #
306
+ self.f3d2v2 = conv(1, 32, kernel_size=3, stride=1, padding=1,dilation=1) #
307
+ self.f3d2v3 = conv(1, 32, kernel_size=3, stride=1, padding=1,dilation=1) #
308
+ self.f3d2v4 = conv(1, 32, kernel_size=3, stride=1, padding=1,dilation=1) #
309
+ self.f3d2v5 = conv(64, 32, kernel_size=3, stride=1, padding=1,dilation=1) #
310
+ self.f3d2v6 = conv(12*81, 32, kernel_size=3, stride=1, padding=1,dilation=1) #
311
+ self.f3d2 = bfmodule(128-64,1)
312
+
313
+ # depth change net
314
+ self.dcnetv1 = conv(64, 32, kernel_size=3, stride=1, padding=1,dilation=1) #
315
+ self.dcnetv2 = conv(1, 32, kernel_size=3, stride=1, padding=1,dilation=1) #
316
+ self.dcnetv3 = conv(1, 32, kernel_size=3, stride=1, padding=1,dilation=1) #
317
+ self.dcnetv4 = conv(1, 32, kernel_size=3, stride=1, padding=1,dilation=1) #
318
+ self.dcnetv5 = conv(12*81, 32, kernel_size=3, stride=1, padding=1,dilation=1) #
319
+ self.dcnetv6 = conv(4, 32, kernel_size=3, stride=1, padding=1,dilation=1) #
320
+ if exp_unc:
321
+ self.dcnet = bfmodule(128,2)
322
+ else:
323
+ self.dcnet = bfmodule(128,1)
324
+
325
+ for m in self.modules():
326
+ if isinstance(m, nn.Conv3d):
327
+ n = m.kernel_size[0] * m.kernel_size[1]*m.kernel_size[2] * m.out_channels
328
+ m.weight.data.normal_(0, math.sqrt(2. / n))
329
+ if hasattr(m.bias,'data'):
330
+ m.bias.data.zero_()
331
+
332
+ self.facs = [self.fac,1,1,1,1]
333
+ self.warp_modules = nn.ModuleList([None, self.warp5, self.warp4, self.warp3, self.warp2])
334
+ self.f_modules = nn.ModuleList([self.f6, self.f5, self.f4, self.f3, self.f2])
335
+ self.p_modules = nn.ModuleList([self.p6, self.p5, self.p4, self.p3, self.p2])
336
+ self.reg_modules = nn.ModuleList([self.flow_reg64, self.flow_reg32, self.flow_reg16, self.flow_reg8, self.flow_reg4])
337
+ self.oor_modules = nn.ModuleList([self.dc6_convo, self.dc5_convo, self.dc4_convo, self.dc3_convo, self.dc2_convo])
338
+ self.fuse_modules = nn.ModuleList([self.dc6_conv, self.dc5_conv, self.dc4_conv, self.dc3_conv, self.dc2_conv])
339
+
340
+ def corrf(self, refimg_fea, targetimg_fea,maxdisp, fac=1):
341
+ """
342
+ Slow correlation function: builds a (b, c, u, v, h, w) cost volume from products of shifted feature maps.
343
+ """
344
+ b,c,height,width = refimg_fea.shape
345
+ if refimg_fea.is_cuda:
346
+ cost = Variable(torch.cuda.FloatTensor(b,c,2*maxdisp+1,2*int(maxdisp//fac)+1,height,width)).fill_(0.) # b,c,u,v,h,w
347
+ else:
348
+ cost = Variable(torch.FloatTensor(b,c,2*maxdisp+1,2*int(maxdisp//fac)+1,height,width)).fill_(0.) # b,c,u,v,h,w
349
+ for i in range(2*maxdisp+1):
350
+ ind = i-maxdisp
351
+ for j in range(2*int(maxdisp//fac)+1):
352
+ indd = j-int(maxdisp//fac)
353
+ feata = refimg_fea[:,:,max(0,-indd):height-indd,max(0,-ind):width-ind]
354
+ featb = targetimg_fea[:,:,max(0,+indd):height+indd,max(0,ind):width+ind]
355
+ diff = (feata*featb)
356
+ cost[:, :, i,j,max(0,-indd):height-indd,max(0,-ind):width-ind] = diff # standard
357
+ cost = F.leaky_relu(cost, 0.1,inplace=True)
358
+ return cost
359
+
360
+ def cost_matching(self,up_flow, c1, c2, flowh, enth, level):
361
+ """
362
+ up_flow: upsampled coarse flow
363
+ c1: normalized feature of image 1
364
+ c2: normalized feature of image 2
365
+ flowh: flow hypotheses
366
+ enth: entropy of the flow hypotheses; level: pyramid level (0 = coarsest)
367
+ """
368
+
369
+ # normalize
370
+ c1n = c1 / (c1.norm(dim=1, keepdim=True)+1e-9)
371
+ c2n = c2 / (c2.norm(dim=1, keepdim=True)+1e-9)
372
+
373
+ # cost volume
374
+ if level == 0:
375
+ warp = c2n
376
+ else:
377
+ warp,_ = self.warp_modules[level](c2n, up_flow)
378
+
379
+ feat = self.corrf(c1n,warp,self.md[level],fac=self.facs[level])
380
+ feat = self.f_modules[level](feat)
381
+ cost = self.p_modules[level](feat) # b, 16, u,v,h,w
382
+
383
+ # soft WTA
384
+ b,c,u,v,h,w = cost.shape
385
+ cost = cost.view(-1,u,v,h,w) # bx16, 9,9,h,w, also predict uncertainty from here
386
+ flowhh,enthh = self.reg_modules[level](cost) # bx16, 2, h, w
387
+ flowhh = flowhh.view(b,c,2,h,w)
388
+ if level > 0:
389
+ flowhh = flowhh + up_flow[:,np.newaxis]
390
+ flowhh = flowhh.view(b,-1,h,w) # b, 16*2, h, w
391
+ enthh = enthh.view(b,-1,h,w) # b, 16*1, h, w
392
+
393
+ # append coarse hypotheses
394
+ if level == 0:
395
+ flowh = flowhh
396
+ enth = enthh
397
+ else:
398
+ flowh = torch.cat((flowhh, F.upsample(flowh.detach()*2, [flowhh.shape[2],flowhh.shape[3]], mode='bilinear')),1) # b, k2--k2, h, w
399
+ enth = torch.cat((enthh, F.upsample(enth, [flowhh.shape[2],flowhh.shape[3]], mode='bilinear')),1)
400
+
401
+ if self.training or level==4:
402
+ x = torch.cat((enth.detach(), flowh.detach(), c1),1)
403
+ oor = self.oor_modules[level](x)[:,0]
404
+ else: oor = None
405
+
406
+ # hypotheses fusion
407
+ x = torch.cat((enth.detach(), flowh.detach(), c1),1)
408
+ va = self.fuse_modules[level](x)
409
+ va = va.view(b,-1,2,h,w)
410
+ flow = ( flowh.view(b,-1,2,h,w) * F.softmax(va,1) ).sum(1) # b, 2k, 2, h, w
411
+
412
+ return flow, flowh, enth, oor
413
+
414
+ def affine(self,pref,flow, pw=1):
415
+ b,_,lh,lw=flow.shape
416
+ ptar = pref + flow
417
+ pw = 1 # patch half-width; fixed to 1 here, overriding the argument
418
+ pref = F.unfold(pref, (pw*2+1,pw*2+1), padding=(pw)).view(b,2,(pw*2+1)**2,lh,lw)-pref[:,:,np.newaxis]
419
+ ptar = F.unfold(ptar, (pw*2+1,pw*2+1), padding=(pw)).view(b,2,(pw*2+1)**2,lh,lw)-ptar[:,:,np.newaxis] # b, 2,9,h,w
420
+ pref = pref.permute(0,3,4,1,2).reshape(b*lh*lw,2,(pw*2+1)**2)
421
+ ptar = ptar.permute(0,3,4,1,2).reshape(b*lh*lw,2,(pw*2+1)**2)
422
+
423
+ prefprefT = pref.matmul(pref.permute(0,2,1))
424
+ ppdet = prefprefT[:,0,0]*prefprefT[:,1,1]-prefprefT[:,1,0]*prefprefT[:,0,1]
425
+ ppinv = torch.cat((prefprefT[:,1,1:],-prefprefT[:,0,1:], -prefprefT[:,1:,0], prefprefT[:,0:1,0]),1).view(-1,2,2)/ppdet.clamp(1e-10,np.inf)[:,np.newaxis,np.newaxis]
426
+
427
+ Affine = ptar.matmul(pref.permute(0,2,1)).matmul(ppinv)
428
+ Error = (Affine.matmul(pref)-ptar).norm(2,1).mean(1).view(b,1,lh,lw)
429
+
430
+ Avol = (Affine[:,0,0]*Affine[:,1,1]-Affine[:,1,0]*Affine[:,0,1]).view(b,1,lh,lw).abs().clamp(1e-10,np.inf)
431
+ exp = Avol.sqrt()
432
+ mask = (exp>0.5) & (exp<2) & (Error<0.1)
433
+ mask = mask[:,0]
434
+
435
+ exp = exp.clamp(0.5,2)
436
+ exp[Error>0.1]=1
437
+ return exp, Error, mask
438
+
439
+ def affine_mask(self,pref,flow, pw=3):
440
+ """
441
+ pref: reference coordinates
442
+ pw: patch width
443
+ """
444
+ flmask = flow[:,2:]
445
+ flow = flow[:,:2]
446
+ b,_,lh,lw=flow.shape
447
+ ptar = pref + flow
448
+ pref = F.unfold(pref, (pw*2+1,pw*2+1), padding=(pw)).view(b,2,(pw*2+1)**2,lh,lw)-pref[:,:,np.newaxis]
449
+ ptar = F.unfold(ptar, (pw*2+1,pw*2+1), padding=(pw)).view(b,2,(pw*2+1)**2,lh,lw)-ptar[:,:,np.newaxis] # b, 2,9,h,w
450
+
451
+ conf_flow = flmask
452
+ conf_flow = F.unfold(conf_flow,(pw*2+1,pw*2+1), padding=(pw)).view(b,1,(pw*2+1)**2,lh,lw)
453
+ count = conf_flow.sum(2,keepdim=True)
454
+ conf_flow = ((pw*2+1)**2)*conf_flow / count
455
+ pref = pref * conf_flow
456
+ ptar = ptar * conf_flow
457
+
458
+ pref = pref.permute(0,3,4,1,2).reshape(b*lh*lw,2,(pw*2+1)**2)
459
+ ptar = ptar.permute(0,3,4,1,2).reshape(b*lh*lw,2,(pw*2+1)**2)
460
+
461
+ prefprefT = pref.matmul(pref.permute(0,2,1))
462
+ ppdet = prefprefT[:,0,0]*prefprefT[:,1,1]-prefprefT[:,1,0]*prefprefT[:,0,1]
463
+ ppinv = torch.cat((prefprefT[:,1,1:],-prefprefT[:,0,1:], -prefprefT[:,1:,0], prefprefT[:,0:1,0]),1).view(-1,2,2)/ppdet.clamp(1e-10,np.inf)[:,np.newaxis,np.newaxis]
464
+
465
+ Affine = ptar.matmul(pref.permute(0,2,1)).matmul(ppinv)
466
+ Error = (Affine.matmul(pref)-ptar).norm(2,1).mean(1).view(b,1,lh,lw)
467
+
468
+ Avol = (Affine[:,0,0]*Affine[:,1,1]-Affine[:,1,0]*Affine[:,0,1]).view(b,1,lh,lw).abs().clamp(1e-10,np.inf)
469
+ exp = Avol.sqrt()
470
+ mask = (exp>0.5) & (exp<2) & (Error<0.2) & (flmask.bool()) & (count[:,0]>4)
471
+ mask = mask[:,0]
472
+
473
+ exp = exp.clamp(0.5,2)
474
+ exp[Error>0.2]=1
475
+ return exp, Error, mask
476
+
477
+ def weight_parameters(self):
478
+ return [param for name, param in self.named_parameters() if 'weight' in name]
479
+
480
+ def bias_parameters(self):
481
+ return [param for name, param in self.named_parameters() if 'bias' in name]
482
+
483
+ def forward(self,im,disc_aux=None):
484
+ bs = im.shape[0]//2
485
+
486
+ if self.training and disc_aux[-1]: # if only fine-tuning expansion
487
+ reset=True
488
+ self.eval()
489
+ torch.set_grad_enabled(False)
490
+ else: reset=False
491
+
492
+ c06,c05,c04,c03,c02 = self.pspnet(im)
493
+ c16 = c06[:bs]; c26 = c06[bs:]
494
+ c15 = c05[:bs]; c25 = c05[bs:]
495
+ c14 = c04[:bs]; c24 = c04[bs:]
496
+ c13 = c03[:bs]; c23 = c03[bs:]
497
+ c12 = c02[:bs]; c22 = c02[bs:]
498
+
499
+ ## matching 6
500
+ flow6, flow6h, ent6h, oor6 = self.cost_matching(None, c16, c26, None, None,level=0)
501
+
502
+ ## matching 5
503
+ up_flow6 = F.upsample(flow6, [im.size()[2]//32,im.size()[3]//32], mode='bilinear')*2
504
+ flow5, flow5h, ent5h, oor5 = self.cost_matching(up_flow6, c15, c25, flow6h, ent6h,level=1)
505
+
506
+ ## matching 4
507
+ up_flow5 = F.upsample(flow5, [im.size()[2]//16,im.size()[3]//16], mode='bilinear')*2
508
+ flow4, flow4h, ent4h, oor4 = self.cost_matching(up_flow5, c14, c24, flow5h, ent5h,level=2)
509
+
510
+ ## matching 3
511
+ up_flow4 = F.upsample(flow4, [im.size()[2]//8,im.size()[3]//8], mode='bilinear')*2
512
+ flow3, flow3h, ent3h, oor3 = self.cost_matching(up_flow4, c13, c23, flow4h, ent4h,level=3)
513
+
514
+ ## matching 2
515
+ up_flow3 = F.upsample(flow3, [im.size()[2]//4,im.size()[3]//4], mode='bilinear')*2
516
+ flow2, flow2h, ent2h, oor2 = self.cost_matching(up_flow3, c12, c22, flow3h, ent3h,level=4)
517
+
518
+ if reset:
519
+ torch.set_grad_enabled(True)
520
+ self.train()
521
+
522
+ # expansion
523
+ b,_,h,w = flow2.shape
524
+ exp2,err2,_ = self.affine(get_grid(b,h,w)[:,0].permute(0,3,1,2).repeat(b,1,1,1).clone(), flow2.detach(),pw=1)
525
+ x = torch.cat((
526
+ self.f3d2v2(-exp2.log()),
527
+ self.f3d2v3(err2),
528
+ ),1)
529
+ dchange2 = -exp2.log()+1./200*self.f3d2(x)[0]
530
+
531
+ # depth change net
532
+ iexp2 = F.upsample(dchange2.clone(), [im.size()[2],im.size()[3]], mode='bilinear')
533
+
534
+ x = torch.cat((self.dcnetv1(c12.detach()),
535
+ self.dcnetv2(dchange2.detach()),
536
+ self.dcnetv3(-exp2.log()),
537
+ self.dcnetv4(err2),
538
+ ),1)
539
+ dcneto = 1./200*self.dcnet(x)[0]
540
+ dchange2 = dchange2.detach() + dcneto[:,:1]
541
+
542
+ flow2 = F.upsample(flow2.detach(), [im.size()[2],im.size()[3]], mode='bilinear')*4
543
+ dchange2 = F.upsample(dchange2, [im.size()[2],im.size()[3]], mode='bilinear')
544
+
545
+ if self.training:
546
+ flowl0 = disc_aux[0].permute(0,3,1,2).clone()
547
+ gt_depth = disc_aux[2][:,:,:,0]
548
+ gt_f3d = disc_aux[2][:,:,:,4:7].permute(0,3,1,2).clone()
549
+ gt_dchange = (1+gt_f3d[:,2]/gt_depth)
550
+ maskdc = (gt_dchange < 2) & (gt_dchange > 0.5) & disc_aux[1]
551
+
552
+ gt_expi,gt_expi_err,maskoe = self.affine_mask(get_grid(b,4*h,4*w)[:,0].permute(0,3,1,2).repeat(b,1,1,1), flowl0,pw=3)
553
+ gt_exp = 1./gt_expi[:,0]
554
+
555
+ loss = 0.1* (dchange2[:,0]-gt_dchange.log()).abs()[maskdc].mean()
556
+ loss += 0.1* (iexp2[:,0]-gt_exp.log()).abs()[maskoe].mean()
557
+ return flow2*4, flow3*8,flow4*16,flow5*32,flow6*64,loss, dchange2[:,0], iexp2[:,0]
558
+
559
+ else:
560
+ return flow2, oor2, dchange2, iexp2
561
+
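The affine and affine_mask methods above estimate per-pixel scale change by fitting a 2x2 affine transform between matched patch coordinates in closed form, then taking the square root of its determinant as the expansion factor. Below is a minimal standalone sketch of that least-squares fit; the function name and tensor sizes are illustrative, not from this repo, and it uses torch.inverse where the model hand-rolls the 2x2 inverse with a clamped determinant to survive degenerate patches.

import torch

def local_affine_expansion(pref, ptar):
    # pref, ptar: (N, 2, K) centered patch coordinates before/after the flow,
    # as built with F.unfold in VCN_exp.affine (K = (2*pw+1)**2 samples)
    PPt = pref @ pref.transpose(1, 2)                     # (N, 2, 2)
    A = ptar @ pref.transpose(1, 2) @ torch.inverse(PPt)  # least-squares affine
    exp = torch.det(A).abs().clamp(min=1e-10).sqrt()      # isotropic expansion
    err = (A @ pref - ptar).norm(dim=1).mean(dim=1)       # fitting residual
    return exp, err

# A uniform 2x zoom yields exp close to 2 with near-zero residual:
pref = torch.randn(4, 2, 9)
exp, err = local_affine_expansion(pref, 2.0 * pref)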
expansion/models/__init__.py ADDED
File without changes
expansion/models/__pycache__/VCN_exp.cpython-38.pyc ADDED
Binary file (17.7 kB).
expansion/models/__pycache__/__init__.cpython-38.pyc ADDED
Binary file (158 Bytes).
expansion/models/__pycache__/conv4d.cpython-38.pyc ADDED
Binary file (8.26 kB).
expansion/models/__pycache__/submodule.cpython-38.pyc ADDED
Binary file (12 kB).
expansion/models/conv4d.py ADDED
@@ -0,0 +1,296 @@
1
+ import pdb
2
+ import torch.nn as nn
3
+ import math
4
+ import torch
5
+ from torch.nn.parameter import Parameter
6
+ import torch.nn.functional as F
7
+ from torch.nn import Module
8
+ from torch.nn.modules.conv import _ConvNd
9
+ from torch.nn.modules.utils import _quadruple
10
+ from torch.autograd import Variable
11
+ from torch.nn import Conv2d
12
+
13
+ def conv4d(data,filters,bias=None,permute_filters=True,use_half=False):
14
+ """
15
+ 4D convolution implemented by stacking the results of multiple 3D convolutions; very slow.
16
+ Taken from https://github.com/ignacio-rocco/ncnet
17
+ """
18
+ b,c,h,w,d,t=data.size()
19
+
20
+ data=data.permute(2,0,1,3,4,5).contiguous() # permute to avoid making contiguous inside loop
21
+
22
+ # The same permutation is applied to the filters, unless they were already permuted
23
+ if permute_filters:
24
+ filters=filters.permute(2,0,1,3,4,5).contiguous() # permute to avoid making contiguous inside loop
25
+
26
+ c_out=filters.size(1)
27
+ if use_half:
28
+ output = Variable(torch.HalfTensor(h,b,c_out,w,d,t),requires_grad=data.requires_grad)
29
+ else:
30
+ output = Variable(torch.zeros(h,b,c_out,w,d,t),requires_grad=data.requires_grad)
31
+
32
+ padding=filters.size(0)//2
33
+ if use_half:
34
+ Z=Variable(torch.zeros(padding,b,c,w,d,t).half())
35
+ else:
36
+ Z=Variable(torch.zeros(padding,b,c,w,d,t))
37
+
38
+ if data.is_cuda:
39
+ Z=Z.cuda(data.get_device())
40
+ output=output.cuda(data.get_device())
41
+
42
+ data_padded = torch.cat((Z,data,Z),0)
43
+
44
+
45
+ for i in range(output.size(0)): # loop on first feature dimension
46
+ # convolve with center channel of filter (at position=padding)
47
+ output[i,:,:,:,:,:]=F.conv3d(data_padded[i+padding,:,:,:,:,:],
48
+ filters[padding,:,:,:,:,:], bias=bias, stride=1, padding=padding)
49
+ # convolve with upper/lower channels of filter (at positions [:padding] [padding+1:])
50
+ for p in range(1,padding+1):
51
+ output[i,:,:,:,:,:]=output[i,:,:,:,:,:]+F.conv3d(data_padded[i+padding-p,:,:,:,:,:],
52
+ filters[padding-p,:,:,:,:,:], bias=None, stride=1, padding=padding)
53
+ output[i,:,:,:,:,:]=output[i,:,:,:,:,:]+F.conv3d(data_padded[i+padding+p,:,:,:,:,:],
54
+ filters[padding+p,:,:,:,:,:], bias=None, stride=1, padding=padding)
55
+
56
+ output=output.permute(1,2,0,3,4,5).contiguous()
57
+ return output
58
+
59
+ class Conv4d(_ConvNd):
60
+ """Applies a 4D convolution over an input signal composed of several input
61
+ planes.
62
+ """
63
+
64
+ def __init__(self, in_channels, out_channels, kernel_size, bias=True, pre_permuted_filters=True):
65
+ # stride, dilation and groups !=1 functionality not tested
66
+ stride=1
67
+ dilation=1
68
+ groups=1
69
+ # zero padding is added automatically in conv4d function to preserve tensor size
70
+ padding = 0
71
+ kernel_size = _quadruple(kernel_size)
72
+ stride = _quadruple(stride)
73
+ padding = _quadruple(padding)
74
+ dilation = _quadruple(dilation)
75
+ super(Conv4d, self).__init__(
76
+ in_channels, out_channels, kernel_size, stride, padding, dilation,
77
+ False, _quadruple(0), groups, bias)
78
+ # weights will be sliced along one dimension during convolution loop
79
+ # make the looping dimension to be the first one in the tensor,
80
+ # so that we don't need to call contiguous() inside the loop
81
+ self.pre_permuted_filters=pre_permuted_filters
82
+ if self.pre_permuted_filters:
83
+ self.weight.data=self.weight.data.permute(2,0,1,3,4,5).contiguous()
84
+ self.use_half=False
85
+ # self.isbias = bias
86
+ # if not self.isbias:
87
+ # self.bn = torch.nn.BatchNorm1d(out_channels)
88
+
89
+
90
+ def forward(self, input):
91
+ out = conv4d(input, self.weight, bias=self.bias,permute_filters=not self.pre_permuted_filters,use_half=self.use_half) # filters pre-permuted in constructor
92
+ # if not self.isbias:
93
+ # b,c,u,v,h,w = out.shape
94
+ # out = self.bn(out.view(b,c,-1)).view(b,c,u,v,h,w)
95
+ return out
96
+
97
+ class fullConv4d(torch.nn.Module):
98
+ def __init__(self, in_channels, out_channels, kernel_size, bias=True, pre_permuted_filters=True):
99
+ super(fullConv4d, self).__init__()
100
+ self.conv = Conv4d(in_channels, out_channels, kernel_size, bias=bias, pre_permuted_filters=pre_permuted_filters)
101
+ self.isbias = bias
102
+ if not self.isbias:
103
+ self.bn = torch.nn.BatchNorm1d(out_channels)
104
+
105
+ def forward(self, input):
106
+ out = self.conv(input)
107
+ if not self.isbias:
108
+ b,c,u,v,h,w = out.shape
109
+ out = self.bn(out.view(b,c,-1)).view(b,c,u,v,h,w)
110
+ return out
111
+
112
+ class butterfly4D(torch.nn.Module):
113
+ '''
114
+ butterfly 4d
115
+ '''
116
+ def __init__(self, fdima, fdimb, withbn=True, full=True,groups=1):
117
+ super(butterfly4D, self).__init__()
118
+ self.proj = nn.Sequential(projfeat4d(fdima, fdimb, 1, with_bn=withbn,groups=groups),
119
+ nn.ReLU(inplace=True),)
120
+ self.conva1 = sepConv4dBlock(fdimb,fdimb,with_bn=withbn, stride=(2,1,1),full=full,groups=groups)
121
+ self.conva2 = sepConv4dBlock(fdimb,fdimb,with_bn=withbn, stride=(2,1,1),full=full,groups=groups)
122
+ self.convb3 = sepConv4dBlock(fdimb,fdimb,with_bn=withbn, stride=(1,1,1),full=full,groups=groups)
123
+ self.convb2 = sepConv4dBlock(fdimb,fdimb,with_bn=withbn, stride=(1,1,1),full=full,groups=groups)
124
+ self.convb1 = sepConv4dBlock(fdimb,fdimb,with_bn=withbn, stride=(1,1,1),full=full,groups=groups)
125
+
126
+ #@profile
127
+ def forward(self,x):
128
+ out = self.proj(x)
129
+ b,c,u,v,h,w = out.shape # 9x9
130
+
131
+ out1 = self.conva1(out) # 5x5, 3
132
+ _,c1,u1,v1,h1,w1 = out1.shape
133
+
134
+ out2 = self.conva2(out1) # 3x3, 9
135
+ _,c2,u2,v2,h2,w2 = out2.shape
136
+
137
+ out2 = self.convb3(out2) # 3x3, 9
138
+
139
+ tout1 = F.upsample(out2.view(b,c,u2,v2,-1),(u1,v1,h2*w2),mode='trilinear').view(b,c,u1,v1,h2,w2) # 5x5
140
+ tout1 = F.upsample(tout1.view(b,c,-1,h2,w2),(u1*v1,h1,w1),mode='trilinear').view(b,c,u1,v1,h1,w1) # 5x5
141
+ out1 = tout1 + out1
142
+ out1 = self.convb2(out1)
143
+
144
+ tout = F.upsample(out1.view(b,c,u1,v1,-1),(u,v,h1*w1),mode='trilinear').view(b,c,u,v,h1,w1)
145
+ tout = F.upsample(tout.view(b,c,-1,h1,w1),(u*v,h,w),mode='trilinear').view(b,c,u,v,h,w)
146
+ out = tout + out
147
+ out = self.convb1(out)
148
+
149
+ return out
150
+
151
+
152
+
153
+ class projfeat4d(torch.nn.Module):
154
+ '''
155
+ 1x1 channel projection for 4D cost volumes, implemented as a 3D convolution over a flattened view
156
+ '''
157
+ def __init__(self, in_planes, out_planes, stride, with_bn=True,groups=1):
158
+ super(projfeat4d, self).__init__()
159
+ self.with_bn = with_bn
160
+ self.stride = stride
161
+ self.conv1 = nn.Conv3d(in_planes, out_planes, 1, (stride,stride,1), padding=0,bias=not with_bn,groups=groups)
162
+ self.bn = nn.BatchNorm3d(out_planes)
163
+
164
+ def forward(self,x):
165
+ b,c,u,v,h,w = x.size()
166
+ x = self.conv1(x.view(b,c,u,v,h*w))
167
+ if self.with_bn:
168
+ x = self.bn(x)
169
+ _,c,u,v,_ = x.shape
170
+ x = x.view(b,c,u,v,h,w)
171
+ return x
172
+
173
+ class sepConv4d(torch.nn.Module):
174
+ '''
175
+ Separable 4d convolution block as 2 3D convolutions
176
+ '''
177
+ def __init__(self, in_planes, out_planes, stride=(1,1,1), with_bn=True, ksize=3, full=True,groups=1):
178
+ super(sepConv4d, self).__init__()
179
+ bias = not with_bn
180
+ self.isproj = False
181
+ self.stride = stride[0]
182
+ expand = 1
183
+
184
+ if with_bn:
185
+ if in_planes != out_planes:
186
+ self.isproj = True
187
+ self.proj = nn.Sequential(nn.Conv2d(in_planes, out_planes, 1, bias=bias, padding=0,groups=groups),
188
+ nn.BatchNorm2d(out_planes))
189
+ if full:
190
+ self.conv1 = nn.Sequential(nn.Conv3d(in_planes*expand, in_planes, (1,ksize,ksize), stride=(1,self.stride,self.stride), bias=bias, padding=(0,ksize//2,ksize//2),groups=groups),
191
+ nn.BatchNorm3d(in_planes))
192
+ else:
193
+ self.conv1 = nn.Sequential(nn.Conv3d(in_planes*expand, in_planes, (1,ksize,ksize), stride=1, bias=bias, padding=(0,ksize//2,ksize//2),groups=groups),
194
+ nn.BatchNorm3d(in_planes))
195
+ self.conv2 = nn.Sequential(nn.Conv3d(in_planes, in_planes*expand, (ksize,ksize,1), stride=(self.stride,self.stride,1), bias=bias, padding=(ksize//2,ksize//2,0),groups=groups),
196
+ nn.BatchNorm3d(in_planes*expand))
197
+ else:
198
+ if in_planes != out_planes:
199
+ self.isproj = True
200
+ self.proj = nn.Conv2d(in_planes, out_planes, 1, bias=bias, padding=0,groups=groups)
201
+ if full:
202
+ self.conv1 = nn.Conv3d(in_planes*expand, in_planes, (1,ksize,ksize), stride=(1,self.stride,self.stride), bias=bias, padding=(0,ksize//2,ksize//2),groups=groups)
203
+ else:
204
+ self.conv1 = nn.Conv3d(in_planes*expand, in_planes, (1,ksize,ksize), stride=1, bias=bias, padding=(0,ksize//2,ksize//2),groups=groups)
205
+ self.conv2 = nn.Conv3d(in_planes, in_planes*expand, (ksize,ksize,1), stride=(self.stride,self.stride,1), bias=bias, padding=(ksize//2,ksize//2,0),groups=groups)
206
+ self.relu = nn.ReLU(inplace=True)
207
+
208
+ #@profile
209
+ def forward(self,x):
210
+ b,c,u,v,h,w = x.shape
211
+ x = self.conv2(x.view(b,c,u,v,-1))
212
+ b,c,u,v,_ = x.shape
213
+ x = self.relu(x)
214
+ x = self.conv1(x.view(b,c,-1,h,w))
215
+ b,c,_,h,w = x.shape
216
+
217
+ if self.isproj:
218
+ x = self.proj(x.view(b,c,-1,w))
219
+ x = x.view(b,-1,u,v,h,w)
220
+ return x
221
+
222
+
223
+ class sepConv4dBlock(torch.nn.Module):
224
+ '''
225
+ Residual block of two separable 4D convolutions with an optional
226
+ downsampling/projection shortcut
227
+ '''
228
+ def __init__(self, in_planes, out_planes, stride=(1,1,1), with_bn=True, full=True,groups=1):
229
+ super(sepConv4dBlock, self).__init__()
230
+ if in_planes == out_planes and stride==(1,1,1):
231
+ self.downsample = None
232
+ else:
233
+ if full:
234
+ self.downsample = sepConv4d(in_planes, out_planes, stride, with_bn=with_bn,ksize=1, full=full,groups=groups)
235
+ else:
236
+ self.downsample = projfeat4d(in_planes, out_planes,stride[0], with_bn=with_bn,groups=groups)
237
+ self.conv1 = sepConv4d(in_planes, out_planes, stride, with_bn=with_bn, full=full ,groups=groups)
238
+ self.conv2 = sepConv4d(out_planes, out_planes,(1,1,1), with_bn=with_bn, full=full,groups=groups)
239
+ self.relu1 = nn.ReLU(inplace=True)
240
+ self.relu2 = nn.ReLU(inplace=True)
241
+
242
+ #@profile
243
+ def forward(self,x):
244
+ out = self.relu1(self.conv1(x))
245
+ if self.downsample:
246
+ x = self.downsample(x)
247
+ out = self.relu2(x + self.conv2(out))
248
+ return out
249
+
250
+
251
+ ##import torch.backends.cudnn as cudnn
252
+ ##cudnn.benchmark = True
253
+ #import time
254
+ ##im = torch.randn(9,64,9,160,224).cuda()
255
+ ##net = torch.nn.Conv3d(64, 64, 3).cuda()
256
+ ##net = Conv4d(1,1,3,bias=True,pre_permuted_filters=True).cuda()
257
+ ##net = sepConv4dBlock(2,2,stride=(1,1,1)).cuda()
258
+ #
259
+ ##im = torch.randn(1,16,9,9,96,320).cuda()
260
+ ##net = sepConv4d(16,16,with_bn=False).cuda()
261
+ #
262
+ ##im = torch.randn(1,16,81,96,320).cuda()
263
+ ##net = torch.nn.Conv3d(16,16,(1,3,3),padding=(0,1,1)).cuda()
264
+ #
265
+ ##im = torch.randn(1,16,9,9,96*320).cuda()
266
+ ##net = torch.nn.Conv3d(16,16,(3,3,1),padding=(1,1,0)).cuda()
267
+ #
268
+ ##im = torch.randn(10000,10,9,9).cuda()
269
+ ##net = torch.nn.Conv2d(10,10,3,padding=1).cuda()
270
+ #
271
+ ##im = torch.randn(81,16,96,320).cuda()
272
+ ##net = torch.nn.Conv2d(16,16,3,padding=1).cuda()
273
+ #c= int(16 *1)
274
+ #cp = int(16 *1)
275
+ #h=int(96 *4)
276
+ #w=int(320 *4)
277
+ #k=3
278
+ #im = torch.randn(1,c,h,w).cuda()
279
+ #net = torch.nn.Conv2d(c,cp,k,padding=k//2).cuda()
280
+ #
281
+ #im2 = torch.randn(cp,k*k*c).cuda()
282
+ #im1 = F.unfold(im, (k,k), padding=k//2)[0]
283
+ #
284
+ #
285
+ #net(im)
286
+ #net(im)
287
+ #torch.mm(im2,im1)
288
+ #torch.mm(im2,im1)
289
+ #torch.cuda.synchronize()
290
+ #beg = time.time()
291
+ #for i in range(100):
292
+ # net(im)
293
+ # #im1 = F.unfold(im, (k,k), padding=k//2)[0]
294
+ # torch.mm(im2,im1)
295
+ #torch.cuda.synchronize()
296
+ #print('%f'%((time.time()-beg)*10.))
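The conv4d function above realizes a true 4D convolution by looping a bank of 3D convolutions over slices of the first spatial dimension, zero-padding that dimension so sizes are preserved. A quick shape check, assuming the made-up tensor sizes below:

import torch
from expansion.models.conv4d import conv4d

data = torch.randn(1, 2, 5, 5, 6, 6)      # (b, c_in, h, w, d, t)
filters = torch.randn(4, 2, 3, 3, 3, 3)   # (c_out, c_in, k, k, k, k)

out = conv4d(data, filters)               # padding handled internally
assert out.shape == (1, 4, 5, 5, 6, 6)    # spatial sizes preserved, c_out = 4

In practice the model avoids this full-cost path: butterfly4D runs the cheaper sepConv4d blocks, which factor the 4D kernel into a (1,k,k) and a (k,k,1) 3D convolution.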
expansion/models/submodule.py ADDED
@@ -0,0 +1,450 @@
1
+ from __future__ import print_function
2
+ import torch
3
+ import torch.nn as nn
4
+ import torch.utils.data
5
+ from torch.autograd import Variable
6
+ import torch.nn.functional as F
7
+ import math
8
+ import numpy as np
9
+ import pdb
10
+
11
+ class residualBlock(nn.Module):
12
+ expansion = 1
13
+
14
+ def __init__(self, in_channels, n_filters, stride=1, downsample=None,dilation=1,with_bn=True):
15
+ super(residualBlock, self).__init__()
16
+ if dilation > 1:
17
+ padding = dilation
18
+ else:
19
+ padding = 1
20
+
21
+ if with_bn:
22
+ self.convbnrelu1 = conv2DBatchNormRelu(in_channels, n_filters, 3, stride, padding, dilation=dilation)
23
+ self.convbn2 = conv2DBatchNorm(n_filters, n_filters, 3, 1, 1)
24
+ else:
25
+ self.convbnrelu1 = conv2DBatchNormRelu(in_channels, n_filters, 3, stride, padding, dilation=dilation,with_bn=False)
26
+ self.convbn2 = conv2DBatchNorm(n_filters, n_filters, 3, 1, 1, with_bn=False)
27
+ self.downsample = downsample
28
+ self.relu = nn.LeakyReLU(0.1, inplace=True)
29
+
30
+ def forward(self, x):
31
+ residual = x
32
+
33
+ out = self.convbnrelu1(x)
34
+ out = self.convbn2(out)
35
+
36
+ if self.downsample is not None:
37
+ residual = self.downsample(x)
38
+
39
+ out += residual
40
+ return self.relu(out)
41
+
42
+ def conv(in_planes, out_planes, kernel_size=3, stride=1, padding=1, dilation=1):
43
+ return nn.Sequential(
44
+ nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride,
45
+ padding=padding, dilation=dilation, bias=True),
46
+ nn.BatchNorm2d(out_planes),
47
+ nn.LeakyReLU(0.1,inplace=True))
48
+
49
+
50
+ class conv2DBatchNorm(nn.Module):
51
+ def __init__(self, in_channels, n_filters, k_size, stride, padding, dilation=1, with_bn=True):
52
+ super(conv2DBatchNorm, self).__init__()
53
+ bias = not with_bn
54
+
55
+ if dilation > 1:
56
+ conv_mod = nn.Conv2d(int(in_channels), int(n_filters), kernel_size=k_size,
57
+ padding=padding, stride=stride, bias=bias, dilation=dilation)
58
+
59
+ else:
60
+ conv_mod = nn.Conv2d(int(in_channels), int(n_filters), kernel_size=k_size,
61
+ padding=padding, stride=stride, bias=bias, dilation=1)
62
+
63
+
64
+ if with_bn:
65
+ self.cb_unit = nn.Sequential(conv_mod,
66
+ nn.BatchNorm2d(int(n_filters)),)
67
+ else:
68
+ self.cb_unit = nn.Sequential(conv_mod,)
69
+
70
+ def forward(self, inputs):
71
+ outputs = self.cb_unit(inputs)
72
+ return outputs
73
+
74
+ class conv2DBatchNormRelu(nn.Module):
75
+ def __init__(self, in_channels, n_filters, k_size, stride, padding, dilation=1, with_bn=True):
76
+ super(conv2DBatchNormRelu, self).__init__()
77
+ bias = not with_bn
78
+ if dilation > 1:
79
+ conv_mod = nn.Conv2d(int(in_channels), int(n_filters), kernel_size=k_size,
80
+ padding=padding, stride=stride, bias=bias, dilation=dilation)
81
+
82
+ else:
83
+ conv_mod = nn.Conv2d(int(in_channels), int(n_filters), kernel_size=k_size,
84
+ padding=padding, stride=stride, bias=bias, dilation=1)
85
+
86
+ if with_bn:
87
+ self.cbr_unit = nn.Sequential(conv_mod,
88
+ nn.BatchNorm2d(int(n_filters)),
89
+ nn.LeakyReLU(0.1, inplace=True),)
90
+ else:
91
+ self.cbr_unit = nn.Sequential(conv_mod,
92
+ nn.LeakyReLU(0.1, inplace=True),)
93
+
94
+ def forward(self, inputs):
95
+ outputs = self.cbr_unit(inputs)
96
+ return outputs
97
+
98
+ class pyramidPooling(nn.Module):
99
+
100
+ def __init__(self, in_channels, with_bn=True, levels=4):
101
+ super(pyramidPooling, self).__init__()
102
+ self.levels = levels
103
+
104
+ self.paths = []
105
+ for i in range(levels):
106
+ self.paths.append(conv2DBatchNormRelu(in_channels, in_channels, 1, 1, 0, with_bn=with_bn))
107
+ self.path_module_list = nn.ModuleList(self.paths)
108
+ self.relu = nn.LeakyReLU(0.1, inplace=True)
109
+
110
+ def forward(self, x):
111
+ h, w = x.shape[2:]
112
+
113
+ k_sizes = []
114
+ strides = []
115
+ for pool_size in np.linspace(1,min(h,w)//2,self.levels,dtype=int):
116
+ k_sizes.append((int(h/pool_size), int(w/pool_size)))
117
+ strides.append((int(h/pool_size), int(w/pool_size)))
118
+ k_sizes = k_sizes[::-1]
119
+ strides = strides[::-1]
120
+
121
+ pp_sum = x
122
+
123
+ for i, module in enumerate(self.path_module_list):
124
+ out = F.avg_pool2d(x, k_sizes[i], stride=strides[i], padding=0)
125
+ out = module(out)
126
+ out = F.upsample(out, size=(h,w), mode='bilinear')
127
+ pp_sum = pp_sum + 1./self.levels*out
128
+ pp_sum = self.relu(pp_sum/2.)
129
+
130
+ return pp_sum
131
+
132
+ class pspnet(nn.Module):
133
+ """
134
+ Modified PSPNet. https://github.com/meetshah1995/pytorch-semseg/blob/master/ptsemseg/models/pspnet.py
135
+ """
136
+ def __init__(self, is_proj=True,groups=1):
137
+ super(pspnet, self).__init__()
138
+ self.inplanes = 32
139
+ self.is_proj = is_proj
140
+
141
+ # Encoder
142
+ self.convbnrelu1_1 = conv2DBatchNormRelu(in_channels=3, k_size=3, n_filters=16,
143
+ padding=1, stride=2)
144
+ self.convbnrelu1_2 = conv2DBatchNormRelu(in_channels=16, k_size=3, n_filters=16,
145
+ padding=1, stride=1)
146
+ self.convbnrelu1_3 = conv2DBatchNormRelu(in_channels=16, k_size=3, n_filters=32,
147
+ padding=1, stride=1)
148
+ # Vanilla Residual Blocks
149
+ self.res_block3 = self._make_layer(residualBlock,64,1,stride=2)
150
+ self.res_block5 = self._make_layer(residualBlock,128,1,stride=2)
151
+ self.res_block6 = self._make_layer(residualBlock,128,1,stride=2)
152
+ self.res_block7 = self._make_layer(residualBlock,128,1,stride=2)
153
+ self.pyramid_pooling = pyramidPooling(128, levels=3)
154
+
155
+ # Iconvs
156
+ self.upconv6 = nn.Sequential(nn.Upsample(scale_factor=2),
157
+ conv2DBatchNormRelu(in_channels=128, k_size=3, n_filters=64,
158
+ padding=1, stride=1))
159
+ self.iconv5 = conv2DBatchNormRelu(in_channels=192, k_size=3, n_filters=128,
160
+ padding=1, stride=1)
161
+ self.upconv5 = nn.Sequential(nn.Upsample(scale_factor=2),
162
+ conv2DBatchNormRelu(in_channels=128, k_size=3, n_filters=64,
163
+ padding=1, stride=1))
164
+ self.iconv4 = conv2DBatchNormRelu(in_channels=192, k_size=3, n_filters=128,
165
+ padding=1, stride=1)
166
+ self.upconv4 = nn.Sequential(nn.Upsample(scale_factor=2),
167
+ conv2DBatchNormRelu(in_channels=128, k_size=3, n_filters=64,
168
+ padding=1, stride=1))
169
+ self.iconv3 = conv2DBatchNormRelu(in_channels=128, k_size=3, n_filters=64,
170
+ padding=1, stride=1)
171
+ self.upconv3 = nn.Sequential(nn.Upsample(scale_factor=2),
172
+ conv2DBatchNormRelu(in_channels=64, k_size=3, n_filters=32,
173
+ padding=1, stride=1))
174
+ self.iconv2 = conv2DBatchNormRelu(in_channels=64, k_size=3, n_filters=64,
175
+ padding=1, stride=1)
176
+
177
+ if self.is_proj:
178
+ self.proj6 = conv2DBatchNormRelu(in_channels=128,k_size=1,n_filters=128//groups, padding=0,stride=1)
179
+ self.proj5 = conv2DBatchNormRelu(in_channels=128,k_size=1,n_filters=128//groups, padding=0,stride=1)
180
+ self.proj4 = conv2DBatchNormRelu(in_channels=128,k_size=1,n_filters=128//groups, padding=0,stride=1)
181
+ self.proj3 = conv2DBatchNormRelu(in_channels=64, k_size=1,n_filters=64//groups, padding=0,stride=1)
182
+ self.proj2 = conv2DBatchNormRelu(in_channels=64, k_size=1,n_filters=64//groups, padding=0,stride=1)
183
+
184
+ for m in self.modules():
185
+ if isinstance(m, nn.Conv2d):
186
+ n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
187
+ m.weight.data.normal_(0, math.sqrt(2. / n))
188
+ if hasattr(m.bias,'data'):
189
+ m.bias.data.zero_()
190
+
191
+
192
+ def _make_layer(self, block, planes, blocks, stride=1):
193
+ downsample = None
194
+ if stride != 1 or self.inplanes != planes * block.expansion:
195
+ downsample = nn.Sequential(nn.Conv2d(self.inplanes, planes * block.expansion,
196
+ kernel_size=1, stride=stride, bias=False),
197
+ nn.BatchNorm2d(planes * block.expansion),)
198
+ layers = []
199
+ layers.append(block(self.inplanes, planes, stride, downsample))
200
+ self.inplanes = planes * block.expansion
201
+ for i in range(1, blocks):
202
+ layers.append(block(self.inplanes, planes))
203
+ return nn.Sequential(*layers)
204
+
205
+ def forward(self, x):
206
+ # H, W -> H/2, W/2
207
+ conv1 = self.convbnrelu1_1(x)
208
+ conv1 = self.convbnrelu1_2(conv1)
209
+ conv1 = self.convbnrelu1_3(conv1)
210
+
211
+ ## H/2, W/2 -> H/4, W/4
212
+ pool1 = F.max_pool2d(conv1, 3, 2, 1)
213
+
214
+ # H/4, W/4 -> H/8, ..., H/64 via the strided residual blocks
215
+ rconv3 = self.res_block3(pool1)
216
+ conv4 = self.res_block5(rconv3)
217
+ conv5 = self.res_block6(conv4)
218
+ conv6 = self.res_block7(conv5)
219
+ conv6 = self.pyramid_pooling(conv6)
220
+
221
+ conv6x = F.upsample(conv6, [conv5.size()[2],conv5.size()[3]],mode='bilinear')
222
+ concat5 = torch.cat((conv5,self.upconv6[1](conv6x)),dim=1)
223
+ conv5 = self.iconv5(concat5)
224
+
225
+ conv5x = F.upsample(conv5, [conv4.size()[2],conv4.size()[3]],mode='bilinear')
226
+ concat4 = torch.cat((conv4,self.upconv5[1](conv5x)),dim=1)
227
+ conv4 = self.iconv4(concat4)
228
+
229
+ conv4x = F.upsample(conv4, [rconv3.size()[2],rconv3.size()[3]],mode='bilinear')
230
+ concat3 = torch.cat((rconv3,self.upconv4[1](conv4x)),dim=1)
231
+ conv3 = self.iconv3(concat3)
232
+
233
+ conv3x = F.upsample(conv3, [pool1.size()[2],pool1.size()[3]],mode='bilinear')
234
+ concat2 = torch.cat((pool1,self.upconv3[1](conv3x)),dim=1)
235
+ conv2 = self.iconv2(concat2)
236
+
237
+ if self.is_proj:
238
+ proj6 = self.proj6(conv6)
239
+ proj5 = self.proj5(conv5)
240
+ proj4 = self.proj4(conv4)
241
+ proj3 = self.proj3(conv3)
242
+ proj2 = self.proj2(conv2)
243
+ return proj6,proj5,proj4,proj3,proj2
244
+ else:
245
+ return conv6, conv5, conv4, conv3, conv2
246
+
247
+
248
+ class pspnet_s(nn.Module):
249
+ """
250
+ Modified PSPNet. https://github.com/meetshah1995/pytorch-semseg/blob/master/ptsemseg/models/pspnet.py
251
+ """
252
+ def __init__(self, is_proj=True,groups=1):
253
+ super(pspnet_s, self).__init__()
254
+ self.inplanes = 32
255
+ self.is_proj = is_proj
256
+
257
+ # Encoder
258
+ self.convbnrelu1_1 = conv2DBatchNormRelu(in_channels=3, k_size=3, n_filters=16,
259
+ padding=1, stride=2)
260
+ self.convbnrelu1_2 = conv2DBatchNormRelu(in_channels=16, k_size=3, n_filters=16,
261
+ padding=1, stride=1)
262
+ self.convbnrelu1_3 = conv2DBatchNormRelu(in_channels=16, k_size=3, n_filters=32,
263
+ padding=1, stride=1)
264
+ # Vanilla Residual Blocks
265
+ self.res_block3 = self._make_layer(residualBlock,64,1,stride=2)
266
+ self.res_block5 = self._make_layer(residualBlock,128,1,stride=2)
267
+ self.res_block6 = self._make_layer(residualBlock,128,1,stride=2)
268
+ self.res_block7 = self._make_layer(residualBlock,128,1,stride=2)
269
+ self.pyramid_pooling = pyramidPooling(128, levels=3)
270
+
271
+ # Iconvs
272
+ self.upconv6 = nn.Sequential(nn.Upsample(scale_factor=2),
273
+ conv2DBatchNormRelu(in_channels=128, k_size=3, n_filters=64,
274
+ padding=1, stride=1))
275
+ self.iconv5 = conv2DBatchNormRelu(in_channels=192, k_size=3, n_filters=128,
276
+ padding=1, stride=1)
277
+ self.upconv5 = nn.Sequential(nn.Upsample(scale_factor=2),
278
+ conv2DBatchNormRelu(in_channels=128, k_size=3, n_filters=64,
279
+ padding=1, stride=1))
280
+ self.iconv4 = conv2DBatchNormRelu(in_channels=192, k_size=3, n_filters=128,
281
+ padding=1, stride=1)
282
+ self.upconv4 = nn.Sequential(nn.Upsample(scale_factor=2),
283
+ conv2DBatchNormRelu(in_channels=128, k_size=3, n_filters=64,
284
+ padding=1, stride=1))
285
+ self.iconv3 = conv2DBatchNormRelu(in_channels=128, k_size=3, n_filters=64,
286
+ padding=1, stride=1)
287
+ #self.upconv3 = nn.Sequential(nn.Upsample(scale_factor=2),
288
+ # conv2DBatchNormRelu(in_channels=64, k_size=3, n_filters=32,
289
+ # padding=1, stride=1))
290
+ #self.iconv2 = conv2DBatchNormRelu(in_channels=64, k_size=3, n_filters=64,
291
+ # padding=1, stride=1)
292
+
293
+ if self.is_proj:
294
+ self.proj6 = conv2DBatchNormRelu(in_channels=128,k_size=1,n_filters=128//groups, padding=0,stride=1)
295
+ self.proj5 = conv2DBatchNormRelu(in_channels=128,k_size=1,n_filters=128//groups, padding=0,stride=1)
296
+ self.proj4 = conv2DBatchNormRelu(in_channels=128,k_size=1,n_filters=128//groups, padding=0,stride=1)
297
+ self.proj3 = conv2DBatchNormRelu(in_channels=64, k_size=1,n_filters=64//groups, padding=0,stride=1)
298
+ #self.proj2 = conv2DBatchNormRelu(in_channels=64, k_size=1,n_filters=64//groups, padding=0,stride=1)
299
+
300
+ for m in self.modules():
301
+ if isinstance(m, nn.Conv2d):
302
+ n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
303
+ m.weight.data.normal_(0, math.sqrt(2. / n))
304
+ if hasattr(m.bias,'data'):
305
+ m.bias.data.zero_()
306
+
307
+
308
+ def _make_layer(self, block, planes, blocks, stride=1):
309
+ downsample = None
310
+ if stride != 1 or self.inplanes != planes * block.expansion:
311
+ downsample = nn.Sequential(nn.Conv2d(self.inplanes, planes * block.expansion,
312
+ kernel_size=1, stride=stride, bias=False),
313
+ nn.BatchNorm2d(planes * block.expansion),)
314
+ layers = []
315
+ layers.append(block(self.inplanes, planes, stride, downsample))
316
+ self.inplanes = planes * block.expansion
317
+ for i in range(1, blocks):
318
+ layers.append(block(self.inplanes, planes))
319
+ return nn.Sequential(*layers)
320
+
321
+ def forward(self, x):
322
+ # H, W -> H/2, W/2
323
+ conv1 = self.convbnrelu1_1(x)
324
+ conv1 = self.convbnrelu1_2(conv1)
325
+ conv1 = self.convbnrelu1_3(conv1)
326
+
327
+ ## H/2, W/2 -> H/4, W/4
328
+ pool1 = F.max_pool2d(conv1, 3, 2, 1)
329
+
330
+ # H/4, W/4 -> H/8, ..., H/64 via the strided residual blocks
331
+ rconv3 = self.res_block3(pool1)
332
+ conv4 = self.res_block5(rconv3)
333
+ conv5 = self.res_block6(conv4)
334
+ conv6 = self.res_block7(conv5)
335
+ conv6 = self.pyramid_pooling(conv6)
336
+
337
+ conv6x = F.upsample(conv6, [conv5.size()[2],conv5.size()[3]],mode='bilinear')
338
+ concat5 = torch.cat((conv5,self.upconv6[1](conv6x)),dim=1)
339
+ conv5 = self.iconv5(concat5)
340
+
341
+ conv5x = F.upsample(conv5, [conv4.size()[2],conv4.size()[3]],mode='bilinear')
342
+ concat4 = torch.cat((conv4,self.upconv5[1](conv5x)),dim=1)
343
+ conv4 = self.iconv4(concat4)
344
+
345
+ conv4x = F.upsample(conv4, [rconv3.size()[2],rconv3.size()[3]],mode='bilinear')
346
+ concat3 = torch.cat((rconv3,self.upconv4[1](conv4x)),dim=1)
347
+ conv3 = self.iconv3(concat3)
348
+
349
+ #conv3x = F.upsample(conv3, [pool1.size()[2],pool1.size()[3]],mode='bilinear')
350
+ #concat2 = torch.cat((pool1,self.upconv3[1](conv3x)),dim=1)
351
+ #conv2 = self.iconv2(concat2)
352
+
353
+ if self.is_proj:
354
+ proj6 = self.proj6(conv6)
355
+ proj5 = self.proj5(conv5)
356
+ proj4 = self.proj4(conv4)
357
+ proj3 = self.proj3(conv3)
358
+ # proj2 = self.proj2(conv2)
359
+ # return proj6,proj5,proj4,proj3,proj2
360
+ return proj6,proj5,proj4,proj3
361
+ else:
362
+ # return conv6, conv5, conv4, conv3, conv2
363
+ return conv6, conv5, conv4, conv3
364
+
365
+ class bfmodule(nn.Module):
366
+ def __init__(self, inplanes, outplanes):
367
+ super(bfmodule, self).__init__()
368
+ self.proj = conv2DBatchNormRelu(in_channels=inplanes,k_size=1,n_filters=64,padding=0,stride=1)
369
+ self.inplanes = 64
370
+ # Vanilla Residual Blocks
371
+ self.res_block3 = self._make_layer(residualBlock,64,1,stride=2)
372
+ self.res_block5 = self._make_layer(residualBlock,64,1,stride=2)
373
+ self.res_block6 = self._make_layer(residualBlock,64,1,stride=2)
374
+ self.res_block7 = self._make_layer(residualBlock,128,1,stride=2)
375
+ self.pyramid_pooling = pyramidPooling(128, levels=3)
376
+ # Iconvs
377
+ self.upconv6 = conv2DBatchNormRelu(in_channels=128, k_size=3, n_filters=64,
378
+ padding=1, stride=1)
379
+ self.upconv5 = conv2DBatchNormRelu(in_channels=64, k_size=3, n_filters=32,
380
+ padding=1, stride=1)
381
+ self.upconv4 = conv2DBatchNormRelu(in_channels=64, k_size=3, n_filters=32,
382
+ padding=1, stride=1)
383
+ self.upconv3 = conv2DBatchNormRelu(in_channels=64, k_size=3, n_filters=32,
384
+ padding=1, stride=1)
385
+ self.iconv5 = conv2DBatchNormRelu(in_channels=128, k_size=3, n_filters=64,
386
+ padding=1, stride=1)
387
+ self.iconv4 = conv2DBatchNormRelu(in_channels=96, k_size=3, n_filters=64,
388
+ padding=1, stride=1)
389
+ self.iconv3 = conv2DBatchNormRelu(in_channels=96, k_size=3, n_filters=64,
390
+ padding=1, stride=1)
391
+ self.iconv2 = nn.Sequential(conv2DBatchNormRelu(in_channels=96, k_size=3, n_filters=64,
392
+ padding=1, stride=1),
393
+ nn.Conv2d(64, outplanes,kernel_size=3, stride=1, padding=1, bias=True))
394
+
395
+ self.proj6 = nn.Conv2d(128, outplanes,kernel_size=3, stride=1, padding=1, bias=True)
396
+ self.proj5 = nn.Conv2d(64, outplanes,kernel_size=3, stride=1, padding=1, bias=True)
397
+ self.proj4 = nn.Conv2d(64, outplanes,kernel_size=3, stride=1, padding=1, bias=True)
398
+ self.proj3 = nn.Conv2d(64, outplanes,kernel_size=3, stride=1, padding=1, bias=True)
399
+
400
+ for m in self.modules():
401
+ if isinstance(m, nn.Conv2d):
402
+ n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
403
+ m.weight.data.normal_(0, math.sqrt(2. / n))
404
+ if hasattr(m.bias,'data'):
405
+ m.bias.data.zero_()
406
+
407
+
408
+ def _make_layer(self, block, planes, blocks, stride=1):
409
+ downsample = None
410
+ if stride != 1 or self.inplanes != planes * block.expansion:
411
+ downsample = nn.Sequential(nn.Conv2d(self.inplanes, planes * block.expansion,
412
+ kernel_size=1, stride=stride, bias=False),
413
+ nn.BatchNorm2d(planes * block.expansion),)
414
+ layers = []
415
+ layers.append(block(self.inplanes, planes, stride, downsample))
416
+ self.inplanes = planes * block.expansion
417
+ for i in range(1, blocks):
418
+ layers.append(block(self.inplanes, planes))
419
+ return nn.Sequential(*layers)
420
+
421
+ def forward(self, x):
422
+ proj = self.proj(x) # 4x
423
+ rconv3 = self.res_block3(proj) #8x
424
+ conv4 = self.res_block5(rconv3) #16x
425
+ conv5 = self.res_block6(conv4) #32x
426
+ conv6 = self.res_block7(conv5) #64x
427
+ conv6 = self.pyramid_pooling(conv6) #64x
428
+ pred6 = self.proj6(conv6)
429
+
430
+ conv6u = F.upsample(conv6, [conv5.size()[2],conv5.size()[3]], mode='bilinear')
431
+ concat5 = torch.cat((conv5,self.upconv6(conv6u)),dim=1)
432
+ conv5 = self.iconv5(concat5) #32x
433
+ pred5 = self.proj5(conv5)
434
+
435
+ conv5u = F.upsample(conv5, [conv4.size()[2],conv4.size()[3]], mode='bilinear')
436
+ concat4 = torch.cat((conv4,self.upconv5(conv5u)),dim=1)
437
+ conv4 = self.iconv4(concat4) #16x
438
+ pred4 = self.proj4(conv4)
439
+
440
+ conv4u = F.upsample(conv4, [rconv3.size()[2],rconv3.size()[3]], mode='bilinear')
441
+ concat3 = torch.cat((rconv3,self.upconv4(conv4u)),dim=1)
442
+ conv3 = self.iconv3(concat3) # 8x
443
+ pred3 = self.proj3(conv3)
444
+
445
+ conv3u = F.upsample(conv3, [x.size()[2],x.size()[3]], mode='bilinear')
446
+ concat2 = torch.cat((proj,self.upconv3(conv3u)),dim=1)
447
+ pred2 = self.iconv2(concat2) # 4x
448
+
449
+ return pred2, pred3, pred4, pred5, pred6
450
+
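pspnet above is the shared feature extractor: a small encoder-decoder that returns a five-level feature pyramid consumed by the matching stages. A rough usage sketch follows (the input size and batch are arbitrary; with the default groups=1 the projected channel counts are 128/128/128/64/64):

import torch
from expansion.models.submodule import pspnet

net = pspnet().eval()
x = torch.randn(2, 3, 256, 256)     # e.g., two frames stacked on the batch axis
with torch.no_grad():
    p6, p5, p4, p3, p2 = net(x)     # strides 64, 32, 16, 8, 4
print([p.shape[-1] for p in (p6, p5, p4, p3, p2)])   # [4, 8, 16, 32, 64]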
expansion/submission.py ADDED
@@ -0,0 +1,95 @@
1
+ from __future__ import print_function
2
+ import sys
3
+ import cv2
4
+ import argparse
5
+ import numpy as np
6
+ import torch
7
+ import torch.nn as nn
8
+ import torch.backends.cudnn as cudnn
9
+ import torch.optim as optim
10
+ import torch.nn.functional as F
11
+ cudnn.benchmark = False
12
+
13
+ class Expansion():
14
+
15
+ def __init__(self, loadmodel = 'pretrained_models/optical_expansion/robust.pth', testres = 1, maxdisp = 256, fac = 1):
16
+
17
+ maxw,maxh = [int(testres*1280), int(testres*384)]
18
+
19
+ max_h = int(maxh // 64 * 64)
20
+ max_w = int(maxw // 64 * 64)
21
+ if max_h < maxh: max_h += 64
22
+ if max_w < maxw: max_w += 64
23
+ maxh = max_h
24
+ maxw = max_w
25
+
26
+ mean_L = [[0.33,0.33,0.33]]
27
+ mean_R = [[0.33,0.33,0.33]]
28
+
29
+ # construct model, VCN-expansion
30
+ from expansion.models.VCN_exp import VCN
31
+ model = VCN([1, maxw, maxh], md=[int(4*(maxdisp/256)),4,4,4,4], fac=fac,
32
+ exp_unc=('robust' in loadmodel)) # expansion uncertainty only in the new model
33
+ model = nn.DataParallel(model, device_ids=[0])
34
+ model.cuda()
35
+
36
+ if loadmodel is not None:
37
+ pretrained_dict = torch.load(loadmodel)
38
+ mean_L=pretrained_dict['mean_L']
39
+ mean_R=pretrained_dict['mean_R']
40
+ pretrained_dict['state_dict'] = {k:v for k,v in pretrained_dict['state_dict'].items()}
41
+ model.load_state_dict(pretrained_dict['state_dict'],strict=False)
42
+ else:
43
+ print('dry run')
44
+
45
+ model.eval()
46
+ # resize
47
+ maxh = 256
48
+ maxw = 256
49
+ max_h = int(maxh // 64 * 64)
50
+ max_w = int(maxw // 64 * 64)
51
+ if max_h < maxh: max_h += 64
52
+ if max_w < maxw: max_w += 64
53
+
54
+ # modify module according to inputs
55
+ from expansion.models.VCN_exp import WarpModule, flow_reg
56
+ for i in range(len(model.module.reg_modules)):
57
+ model.module.reg_modules[i] = flow_reg([1,max_w//(2**(6-i)), max_h//(2**(6-i))],
58
+ ent=getattr(model.module, 'flow_reg%d'%2**(6-i)).ent,\
59
+ maxdisp=getattr(model.module, 'flow_reg%d'%2**(6-i)).md,\
60
+ fac=getattr(model.module, 'flow_reg%d'%2**(6-i)).fac).cuda()
61
+ for i in range(len(model.module.warp_modules)):
62
+ model.module.warp_modules[i] = WarpModule([1,max_w//(2**(6-i)), max_h//(2**(6-i))]).cuda()
63
+
64
+ mean_L = torch.from_numpy(np.asarray(mean_L).astype(np.float32).mean(0)[np.newaxis,:,np.newaxis,np.newaxis]).cuda()
65
+ mean_R = torch.from_numpy(np.asarray(mean_R).astype(np.float32).mean(0)[np.newaxis,:,np.newaxis,np.newaxis]).cuda()
66
+
67
+ self.max_h = max_h
68
+ self.max_w = max_w
69
+ self.model = model
70
+ self.mean_L = mean_L
71
+ self.mean_R = mean_R
72
+
73
+ def run(self, imgL_o, imgR_o):
74
+ model = self.model
75
+ mean_L = self.mean_L
76
+ mean_R = self.mean_R
77
+
78
+ imgL_o[imgL_o<-1] = -1
79
+ imgL_o[imgL_o>1] = 1
80
+ imgR_o[imgR_o<-1] = -1
81
+ imgR_o[imgR_o>1] = 1
82
+ imgL = (imgL_o+1.)*0.5-mean_L # map [-1,1] to [0,1] before mean subtraction
83
+ imgR = (imgR_o+1.)*0.5-mean_R
84
+
85
+ with torch.no_grad():
86
+ imgLR = torch.cat([imgL,imgR],0)
87
+ model.eval()
88
+ torch.cuda.synchronize()
89
+ rts = model(imgLR)
90
+ torch.cuda.synchronize()
91
+ flow, occ, logmid, logexp = rts
92
+
93
+ torch.cuda.empty_cache()
94
+
95
+ return flow, logexp
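Putting the pieces together, the Expansion wrapper above builds the VCN model for fixed 256x256 inputs, loads the robust checkpoint, and exposes run. A hypothetical call, assuming a CUDA device and the pretrained weights on disk (the random inputs are placeholders for real frames normalized to [-1, 1]):

import torch
from expansion.submission import Expansion

exp_net = Expansion()    # loads pretrained_models/optical_expansion/robust.pth
imgL = torch.rand(1, 3, 256, 256).cuda() * 2 - 1    # frame t
imgR = torch.rand(1, 3, 256, 256).cuda() * 2 - 1    # frame t+1
flow, logexp = exp_net.run(imgL, imgR)   # 2D optical flow and log expansion map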
expansion/utils/__init__.py ADDED
File without changes
expansion/utils/__pycache__/__init__.cpython-38.pyc ADDED
Binary file (157 Bytes).
expansion/utils/__pycache__/flowlib.cpython-38.pyc ADDED
Binary file (16 kB).
expansion/utils/__pycache__/io.cpython-38.pyc ADDED
Binary file (3.97 kB).
expansion/utils/__pycache__/pfm.cpython-38.pyc ADDED
Binary file (1.65 kB).