---
datasets:
- imagenet-1k
metrics:
- accuracy
library_name: timm
license: apache-2.0
pipeline_tag: image-classification
---

# Model Card for QLNet

Based on **quasi-linear hyperbolic systems of PDEs** [[Liu et al, 2023](https://github.com/liuyao12/ConvNets-PDE-perspective)], QLNet ventures into uncharted waters of the ConvNet model space, using (element-wise) multiplication in lieu of ReLU as the primary nonlinearity. It achieves accuracy comparable to ResNet50 on ImageNet-1k (acc=**78.4**), which suggests that it has a similar level of capacity/expressivity and deserves more analysis and study (hyper-parameter tuning, optimizer, etc.) by the academic community.


![](https://huggingface.co/liuyao/QLNet/resolve/main/PDE_perspective.jpeg)

One notable feature is that the architecture (trained or not) admits a *continuous* symmetry in its parameters. Check out the [notebook](https://colab.research.google.com/#fileId=https://huggingface.co/liuyao/QLNet/blob/main/QLNet_symmetry.ipynb) for a demo that makes a particular transformation on the weights while leaving the output *unchanged*.
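As a toy illustration of what a continuous symmetry in the parameters can look like, the sketch below uses only the split-and-multiply step at a single pixel. It is not necessarily the transformation demonstrated in the notebook; the function names `multiply_halves` and `transform` are hypothetical, and biases and normalization are assumed absent. Scaling the weights that produce the first half by `exp(t)` and those that produce the second half by `exp(-t)` changes the weights but not the product.

```python
import numpy as np

rng = np.random.default_rng(0)


def multiply_halves(weight, x):
    """Apply a 1x1 conv at one pixel (a matrix product), split the result
    into two equal halves, and multiply them element-wise."""
    u, v = np.split(weight @ x, 2)
    return u * v


def transform(weight, t):
    """One-parameter family of weight transformations (illustrative only):
    rows feeding the first half are scaled by exp(t), rows feeding the
    second half by exp(-t)."""
    half = weight.shape[0] // 2
    scale = np.concatenate([np.full(half, np.exp(t)), np.full(half, np.exp(-t))])
    return scale[:, None] * weight


weight = rng.normal(size=(8, 6))  # 6 input channels -> 2 halves of 4 channels
x = rng.normal(size=6)            # channel vector at a single pixel

for t in (0.0, 0.5, -1.3):
    # The weights differ for t != 0, yet the product of the halves does not.
    assert np.allclose(multiply_halves(transform(weight, t), x),
                       multiply_halves(weight, x))
```

Because `t` ranges over all real numbers, this is a continuous family rather than a discrete one such as a permutation of channels.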



## Model Details

### Model Description

Instead of the `bottleneck` block of ResNet50, which consists of 1x1, 3x3, and 1x1 convolutions in succession, this simplest version of QLNet applies a 1x1 convolution, splits the result into two equal halves and **multiplies** them element-wise, then applies a 3x3 (depthwise) convolution and a 1x1 convolution, all *without* activation functions except at the end of the block, where a *radial activation function* that we call `hardball` is applied.
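The sequence of operations described above can be sketched in PyTorch as follows. This is a minimal illustration, not the released implementation: the class name `QLBlockSketch`, the channel widths, the residual connection, and the absence of normalization layers are all assumptions, and `radial_placeholder` is a hypothetical stand-in for `hardball`, whose exact definition is given in the paper and repository rather than here.

```python
import torch
import torch.nn as nn


def radial_placeholder(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for `hardball` (NOT the author's definition).

    "Radial" here means each pixel's channel vector is rescaled by a
    function of its norm only: vectors whose RMS norm exceeds 1 are pulled
    back to RMS norm 1, and all others pass through unchanged.
    """
    rms = x.pow(2).mean(dim=1, keepdim=True).sqrt()
    return x / torch.clamp(rms, min=1.0)


class QLBlockSketch(nn.Module):
    """1x1 -> split and multiply -> 3x3 depthwise -> 1x1 -> radial activation."""

    def __init__(self, channels: int, width: int):
        super().__init__()
        # The first 1x1 conv produces 2 * width channels so they can be split in two.
        self.expand = nn.Conv2d(channels, 2 * width, kernel_size=1, bias=False)
        # groups=width makes the 3x3 conv depthwise (one filter per channel).
        self.depthwise = nn.Conv2d(width, width, kernel_size=3, padding=1,
                                   groups=width, bias=False)
        self.project = nn.Conv2d(width, channels, kernel_size=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        u, v = self.expand(x).chunk(2, dim=1)  # two equal halves
        y = u * v                              # multiplication replaces ReLU
        y = self.project(self.depthwise(y))    # no activation in between
        # Assumption: a ResNet-style skip connection before the activation.
        return radial_placeholder(x + y)


block = QLBlockSketch(channels=8, width=4)
out = block(torch.randn(2, 8, 5, 5))
print(out.shape)
```

Apart from the final activation, the only nonlinearity in the block is the product `u * v`, which is the point the description makes about replacing ReLU.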



- **Developed by:** Yao Liu 刘杳
- **Model type:** Convolutional Neural Network (ConvNet)
- **License:** Apache-2.0
- **Finetuned from model:** N/A (*trained from scratch*)

### Model Sources


- **Repository:** [ConvNet from the PDE perspective](https://github.com/liuyao12/ConvNets-PDE-perspective)
- **Paper:** [A Novel ConvNet Architecture with a Continuous Symmetry](https://arxiv.org/abs/2308.01621)
- **Demo:** [More Information Needed]

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

## Training Details

### Training and Testing Data

ImageNet-1k

[More Information Needed]

### Training Procedure 

We use the training script in `timm`:

```
python3 train.py ../datasets/imagenet/ --model resnet50 --num-classes 1000 --lr 0.1 --warmup-epochs 5 --epochs 240 --weight-decay 1e-4 --sched cosine --reprob 0.4 --recount 3 --remode pixel --aa rand-m7-mstd0.5-inc1 -b 192 -j 6 --amp --dist-bn reduce 
```
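To make the schedule flags in the command above concrete (`--lr 0.1 --warmup-epochs 5 --epochs 240 --sched cosine`), here is a simplified sketch of a learning-rate curve with linear warmup followed by cosine decay. It is an approximation under stated assumptions, not `timm`'s exact scheduler: the warmup is assumed to start at 0, the decay is assumed to end at 0, and options such as the warmup starting rate, minimum rate, and cooldown epochs are ignored. The function name `lr_at_epoch` is hypothetical.

```python
import math


def lr_at_epoch(epoch, base_lr=0.1, warmup_epochs=5, total_epochs=240):
    """Simplified per-epoch learning rate: linear warmup, then cosine decay."""
    if epoch < warmup_epochs:
        # Assumption: ramp linearly from 0 up to base_lr during warmup.
        return base_lr * epoch / warmup_epochs
    # Fraction of the post-warmup training that has elapsed, in [0, 1].
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    # Half-cosine from base_lr down to 0 (assumed minimum learning rate).
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


for epoch in (0, 5, 60, 120, 180, 240):
    print(epoch, lr_at_epoch(epoch))
```

The per-step rate actually used in training also depends on the number of GPUs and on `timm`'s scheduler defaults, neither of which is stated here.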

### Results

qlnet-50-v0: acc=78.40