Abhilashvj committed on
Commit
5b2fcab
·
1 Parent(s): 1346345

Upload 250 files

Browse files
This view is limited to 50 files because it contains too many changes. See raw diff
.gitattributes CHANGED
@@ -25,3 +25,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
25
  *.zip filter=lfs diff=lfs merge=lfs -text
26
  *.zstandard filter=lfs diff=lfs merge=lfs -text
27
  *tfevents* filter=lfs diff=lfs merge=lfs -text
28
+ sample_master_planogram.jpeg filter=lfs diff=lfs merge=lfs -text
29
+ tmp.png filter=lfs diff=lfs merge=lfs -text
30
+ tmp/master_tmp.png filter=lfs diff=lfs merge=lfs -text
31
+ tmp/to_score_planogram_tmp.png filter=lfs diff=lfs merge=lfs -text
CONTRIBUTING.md ADDED
@@ -0,0 +1,94 @@
1
+ ## Contributing to YOLOv5 🚀
2
+
3
+ We love your input! We want to make contributing to YOLOv5 as easy and transparent as possible, whether it's:
4
+
5
+ - Reporting a bug
6
+ - Discussing the current state of the code
7
+ - Submitting a fix
8
+ - Proposing a new feature
9
+ - Becoming a maintainer
10
+
11
+ YOLOv5 works so well due to our combined community effort, and for every small improvement you contribute you will be
12
+ helping push the frontiers of what's possible in AI 😃!
13
+
14
+ ## Submitting a Pull Request (PR) 🛠️
15
+
16
+ Submitting a PR is easy! This example shows how to submit a PR for updating `requirements.txt` in 4 steps:
17
+
18
+ ### 1. Select File to Update
19
+
20
+ Select `requirements.txt` to update by clicking on it in GitHub.
21
+ <p align="center"><img width="800" alt="PR_step1" src="https://user-images.githubusercontent.com/26833433/122260847-08be2600-ced4-11eb-828b-8287ace4136c.png"></p>
22
+
23
+ ### 2. Click 'Edit this file'
24
+
25
+ The button is in the top-right corner.
26
+ <p align="center"><img width="800" alt="PR_step2" src="https://user-images.githubusercontent.com/26833433/122260844-06f46280-ced4-11eb-9eec-b8a24be519ca.png"></p>
27
+
28
+ ### 3. Make Changes
29
+
30
+ Change `matplotlib` version from `3.2.2` to `3.3`.
31
+ <p align="center"><img width="800" alt="PR_step3" src="https://user-images.githubusercontent.com/26833433/122260853-0a87e980-ced4-11eb-9fd2-3650fb6e0842.png"></p>
32
+
33
+ ### 4. Preview Changes and Submit PR
34
+
35
+ Click on the **Preview changes** tab to verify your updates. At the bottom of the screen select 'Create a **new branch**
36
+ for this commit', assign your branch a descriptive name such as `fix/matplotlib_version` and click the green **Propose
37
+ changes** button. All done, your PR is now submitted to YOLOv5 for review and approval 😃!
38
+ <p align="center"><img width="800" alt="PR_step4" src="https://user-images.githubusercontent.com/26833433/122260856-0b208000-ced4-11eb-8e8e-77b6151cbcc3.png"></p>
39
+
40
+ ### PR recommendations
41
+
42
+ To allow your work to be integrated as seamlessly as possible, we advise you to:
43
+
44
+ - ✅ Verify your PR is **up-to-date with origin/master.** If your PR is behind origin/master an
45
+ automatic [GitHub actions](https://github.com/ultralytics/yolov5/blob/master/.github/workflows/rebase.yml) rebase may
46
+ be attempted by including the /rebase command in a comment body, or by running the following code, replacing 'feature'
47
+ with the name of your local branch:
48
+
49
+ ```bash
50
+ git remote add upstream https://github.com/ultralytics/yolov5.git
51
+ git fetch upstream
52
+ git checkout feature # <----- replace 'feature' with local branch name
53
+ git merge upstream/master
54
+ git push -u origin -f
55
+ ```
56
+
57
+ - ✅ Verify all Continuous Integration (CI) **checks are passing**.
58
+ - ✅ Reduce changes to the absolute **minimum** required for your bug fix or feature addition. _"It is not daily increase
59
+ but daily decrease, hack away the unessential. The closer to the source, the less wastage there is."_ -Bruce Lee
60
+
61
+ ## Submitting a Bug Report 🐛
62
+
63
+ If you spot a problem with YOLOv5 please submit a Bug Report!
64
+
65
+ For us to start investigating a possible problem we need to be able to reproduce it ourselves first. We've created a few
66
+ short guidelines below to help users provide what we need in order to get started.
67
+
68
+ When asking a question, people will be better able to provide help if you provide **code** that they can easily
69
+ understand and use to **reproduce** the problem. This is referred to by community members as creating
70
+ a [minimum reproducible example](https://stackoverflow.com/help/minimal-reproducible-example). Your code that reproduces
71
+ the problem should be:
72
+
73
+ * ✅ **Minimal** – Use as little code as possible that still produces the same problem
74
+ * ✅ **Complete** – Provide **all** parts someone else needs to reproduce your problem in the question itself
75
+ * ✅ **Reproducible** – Test the code you're about to provide to make sure it reproduces the problem (a minimal sketch of such a script follows below)
76
+
77
+ In addition to the above requirements, for [Ultralytics](https://ultralytics.com/) to provide assistance your code
78
+ should be:
79
+
80
+ * ✅ **Current** – Verify that your code is up-to-date with current
81
+ GitHub [master](https://github.com/ultralytics/yolov5/tree/master), and if necessary `git pull` or `git clone` a new
82
+ copy to ensure your problem has not already been resolved by previous commits.
83
+ * ✅ **Unmodified** – Your problem must be reproducible without any modifications to the codebase in this
84
+ repository. [Ultralytics](https://ultralytics.com/) does not provide support for custom code ⚠️.
85
+
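Putting the points above together, a minimal reproducible script for a YOLOv5 issue might look like the sketch below. It only uses the published PyTorch Hub interface shown in the README; the image URL is simply an example of a publicly accessible input.

```python
# Minimal, complete, reproducible example (illustrative sketch)
import torch

# Published model, no local modifications
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')

# Publicly accessible input so anyone can rerun the script
img = 'https://ultralytics.com/images/zidane.jpg'

# The call that demonstrates the behaviour you are reporting
results = model(img)
results.print()
```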
86
+ If you believe your problem meets all of the above criteria, please close this issue and raise a new one using the 🐛 **
87
+ Bug Report** [template](https://github.com/ultralytics/yolov5/issues/new/choose) and providing
88
+ a [minimum reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) to help us better
89
+ understand and diagnose your problem.
90
+
91
+ ## License
92
+
93
+ By contributing, you agree that your contributions will be licensed under
94
+ the [GPL-3.0 license](https://choosealicense.com/licenses/gpl-3.0/)
Dockerfile ADDED
@@ -0,0 +1,60 @@
1
+ # # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+
3
+ # # Start FROM Nvidia PyTorch image https://ngc.nvidia.com/catalog/containers/nvidia:pytorch
4
+ # FROM nvcr.io/nvidia/pytorch:21.05-py3
5
+
6
+ # # Install linux packages
7
+ # RUN apt update && apt install -y zip htop screen libgl1-mesa-glx
8
+
9
+ # # Install python dependencies
10
+ # COPY requirements.txt .
11
+ # RUN python -m pip install --upgrade pip
12
+ # RUN pip uninstall -y nvidia-tensorboard nvidia-tensorboard-plugin-dlprof
13
+ # RUN pip install --no-cache -r requirements.txt coremltools onnx gsutil notebook
14
+ # RUN pip install --no-cache -U torch torchvision numpy
15
+ # # RUN pip install --no-cache torch==1.9.0+cu111 torchvision==0.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
16
+
17
+ # # Create working directory
18
+ # RUN mkdir -p /usr/src/app
19
+ # WORKDIR /usr/src/app
20
+
21
+ # # Copy contents
22
+ # COPY . /usr/src/app
23
+
24
+ # # Set environment variables
25
+ # ENV HOME=/usr/src/app
26
+
27
+
28
+ # Usage Examples -------------------------------------------------------------------------------------------------------
29
+
30
+ # Build and Push
31
+ # t=ultralytics/yolov5:latest && sudo docker build -t $t . && sudo docker push $t
32
+
33
+ # Pull and Run
34
+ # t=ultralytics/yolov5:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus all $t
35
+
36
+ # Pull and Run with local directory access
37
+ # t=ultralytics/yolov5:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus all -v "$(pwd)"/datasets:/usr/src/datasets $t
38
+
39
+ # Kill all
40
+ # sudo docker kill $(sudo docker ps -q)
41
+
42
+ # Kill all image-based
43
+ # sudo docker kill $(sudo docker ps -qa --filter ancestor=ultralytics/yolov5:latest)
44
+
45
+ # Bash into running container
46
+ # sudo docker exec -it 5a9b5863d93d bash
47
+
48
+ # Bash into stopped container
49
+ # id=$(sudo docker ps -qa) && sudo docker start $id && sudo docker exec -it $id bash
50
+
51
+ # Clean up
52
+ # docker system prune -a --volumes
53
+ FROM python:3.9
54
+ EXPOSE 8501
55
+ WORKDIR /app
56
+ COPY requirements.txt ./requirements.txt
57
+ RUN pip3 install -r requirements.txt
58
+ COPY . .
59
+ # CMD streamlit run app.py
60
+ CMD streamlit run --server.port $PORT app.py
LICENSE ADDED
@@ -0,0 +1,674 @@
1
+ GNU GENERAL PUBLIC LICENSE
2
+ Version 3, 29 June 2007
3
+
4
+ Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
5
+ Everyone is permitted to copy and distribute verbatim copies
6
+ of this license document, but changing it is not allowed.
7
+
8
+ Preamble
9
+
10
+ The GNU General Public License is a free, copyleft license for
11
+ software and other kinds of works.
12
+
13
+ The licenses for most software and other practical works are designed
14
+ to take away your freedom to share and change the works. By contrast,
15
+ the GNU General Public License is intended to guarantee your freedom to
16
+ share and change all versions of a program--to make sure it remains free
17
+ software for all its users. We, the Free Software Foundation, use the
18
+ GNU General Public License for most of our software; it applies also to
19
+ any other work released this way by its authors. You can apply it to
20
+ your programs, too.
21
+
22
+ When we speak of free software, we are referring to freedom, not
23
+ price. Our General Public Licenses are designed to make sure that you
24
+ have the freedom to distribute copies of free software (and charge for
25
+ them if you wish), that you receive source code or can get it if you
26
+ want it, that you can change the software or use pieces of it in new
27
+ free programs, and that you know you can do these things.
28
+
29
+ To protect your rights, we need to prevent others from denying you
30
+ these rights or asking you to surrender the rights. Therefore, you have
31
+ certain responsibilities if you distribute copies of the software, or if
32
+ you modify it: responsibilities to respect the freedom of others.
33
+
34
+ For example, if you distribute copies of such a program, whether
35
+ gratis or for a fee, you must pass on to the recipients the same
36
+ freedoms that you received. You must make sure that they, too, receive
37
+ or can get the source code. And you must show them these terms so they
38
+ know their rights.
39
+
40
+ Developers that use the GNU GPL protect your rights with two steps:
41
+ (1) assert copyright on the software, and (2) offer you this License
42
+ giving you legal permission to copy, distribute and/or modify it.
43
+
44
+ For the developers' and authors' protection, the GPL clearly explains
45
+ that there is no warranty for this free software. For both users' and
46
+ authors' sake, the GPL requires that modified versions be marked as
47
+ changed, so that their problems will not be attributed erroneously to
48
+ authors of previous versions.
49
+
50
+ Some devices are designed to deny users access to install or run
51
+ modified versions of the software inside them, although the manufacturer
52
+ can do so. This is fundamentally incompatible with the aim of
53
+ protecting users' freedom to change the software. The systematic
54
+ pattern of such abuse occurs in the area of products for individuals to
55
+ use, which is precisely where it is most unacceptable. Therefore, we
56
+ have designed this version of the GPL to prohibit the practice for those
57
+ products. If such problems arise substantially in other domains, we
58
+ stand ready to extend this provision to those domains in future versions
59
+ of the GPL, as needed to protect the freedom of users.
60
+
61
+ Finally, every program is threatened constantly by software patents.
62
+ States should not allow patents to restrict development and use of
63
+ software on general-purpose computers, but in those that do, we wish to
64
+ avoid the special danger that patents applied to a free program could
65
+ make it effectively proprietary. To prevent this, the GPL assures that
66
+ patents cannot be used to render the program non-free.
67
+
68
+ The precise terms and conditions for copying, distribution and
69
+ modification follow.
70
+
71
+ TERMS AND CONDITIONS
72
+
73
+ 0. Definitions.
74
+
75
+ "This License" refers to version 3 of the GNU General Public License.
76
+
77
+ "Copyright" also means copyright-like laws that apply to other kinds of
78
+ works, such as semiconductor masks.
79
+
80
+ "The Program" refers to any copyrightable work licensed under this
81
+ License. Each licensee is addressed as "you". "Licensees" and
82
+ "recipients" may be individuals or organizations.
83
+
84
+ To "modify" a work means to copy from or adapt all or part of the work
85
+ in a fashion requiring copyright permission, other than the making of an
86
+ exact copy. The resulting work is called a "modified version" of the
87
+ earlier work or a work "based on" the earlier work.
88
+
89
+ A "covered work" means either the unmodified Program or a work based
90
+ on the Program.
91
+
92
+ To "propagate" a work means to do anything with it that, without
93
+ permission, would make you directly or secondarily liable for
94
+ infringement under applicable copyright law, except executing it on a
95
+ computer or modifying a private copy. Propagation includes copying,
96
+ distribution (with or without modification), making available to the
97
+ public, and in some countries other activities as well.
98
+
99
+ To "convey" a work means any kind of propagation that enables other
100
+ parties to make or receive copies. Mere interaction with a user through
101
+ a computer network, with no transfer of a copy, is not conveying.
102
+
103
+ An interactive user interface displays "Appropriate Legal Notices"
104
+ to the extent that it includes a convenient and prominently visible
105
+ feature that (1) displays an appropriate copyright notice, and (2)
106
+ tells the user that there is no warranty for the work (except to the
107
+ extent that warranties are provided), that licensees may convey the
108
+ work under this License, and how to view a copy of this License. If
109
+ the interface presents a list of user commands or options, such as a
110
+ menu, a prominent item in the list meets this criterion.
111
+
112
+ 1. Source Code.
113
+
114
+ The "source code" for a work means the preferred form of the work
115
+ for making modifications to it. "Object code" means any non-source
116
+ form of a work.
117
+
118
+ A "Standard Interface" means an interface that either is an official
119
+ standard defined by a recognized standards body, or, in the case of
120
+ interfaces specified for a particular programming language, one that
121
+ is widely used among developers working in that language.
122
+
123
+ The "System Libraries" of an executable work include anything, other
124
+ than the work as a whole, that (a) is included in the normal form of
125
+ packaging a Major Component, but which is not part of that Major
126
+ Component, and (b) serves only to enable use of the work with that
127
+ Major Component, or to implement a Standard Interface for which an
128
+ implementation is available to the public in source code form. A
129
+ "Major Component", in this context, means a major essential component
130
+ (kernel, window system, and so on) of the specific operating system
131
+ (if any) on which the executable work runs, or a compiler used to
132
+ produce the work, or an object code interpreter used to run it.
133
+
134
+ The "Corresponding Source" for a work in object code form means all
135
+ the source code needed to generate, install, and (for an executable
136
+ work) run the object code and to modify the work, including scripts to
137
+ control those activities. However, it does not include the work's
138
+ System Libraries, or general-purpose tools or generally available free
139
+ programs which are used unmodified in performing those activities but
140
+ which are not part of the work. For example, Corresponding Source
141
+ includes interface definition files associated with source files for
142
+ the work, and the source code for shared libraries and dynamically
143
+ linked subprograms that the work is specifically designed to require,
144
+ such as by intimate data communication or control flow between those
145
+ subprograms and other parts of the work.
146
+
147
+ The Corresponding Source need not include anything that users
148
+ can regenerate automatically from other parts of the Corresponding
149
+ Source.
150
+
151
+ The Corresponding Source for a work in source code form is that
152
+ same work.
153
+
154
+ 2. Basic Permissions.
155
+
156
+ All rights granted under this License are granted for the term of
157
+ copyright on the Program, and are irrevocable provided the stated
158
+ conditions are met. This License explicitly affirms your unlimited
159
+ permission to run the unmodified Program. The output from running a
160
+ covered work is covered by this License only if the output, given its
161
+ content, constitutes a covered work. This License acknowledges your
162
+ rights of fair use or other equivalent, as provided by copyright law.
163
+
164
+ You may make, run and propagate covered works that you do not
165
+ convey, without conditions so long as your license otherwise remains
166
+ in force. You may convey covered works to others for the sole purpose
167
+ of having them make modifications exclusively for you, or provide you
168
+ with facilities for running those works, provided that you comply with
169
+ the terms of this License in conveying all material for which you do
170
+ not control copyright. Those thus making or running the covered works
171
+ for you must do so exclusively on your behalf, under your direction
172
+ and control, on terms that prohibit them from making any copies of
173
+ your copyrighted material outside their relationship with you.
174
+
175
+ Conveying under any other circumstances is permitted solely under
176
+ the conditions stated below. Sublicensing is not allowed; section 10
177
+ makes it unnecessary.
178
+
179
+ 3. Protecting Users' Legal Rights From Anti-Circumvention Law.
180
+
181
+ No covered work shall be deemed part of an effective technological
182
+ measure under any applicable law fulfilling obligations under article
183
+ 11 of the WIPO copyright treaty adopted on 20 December 1996, or
184
+ similar laws prohibiting or restricting circumvention of such
185
+ measures.
186
+
187
+ When you convey a covered work, you waive any legal power to forbid
188
+ circumvention of technological measures to the extent such circumvention
189
+ is effected by exercising rights under this License with respect to
190
+ the covered work, and you disclaim any intention to limit operation or
191
+ modification of the work as a means of enforcing, against the work's
192
+ users, your or third parties' legal rights to forbid circumvention of
193
+ technological measures.
194
+
195
+ 4. Conveying Verbatim Copies.
196
+
197
+ You may convey verbatim copies of the Program's source code as you
198
+ receive it, in any medium, provided that you conspicuously and
199
+ appropriately publish on each copy an appropriate copyright notice;
200
+ keep intact all notices stating that this License and any
201
+ non-permissive terms added in accord with section 7 apply to the code;
202
+ keep intact all notices of the absence of any warranty; and give all
203
+ recipients a copy of this License along with the Program.
204
+
205
+ You may charge any price or no price for each copy that you convey,
206
+ and you may offer support or warranty protection for a fee.
207
+
208
+ 5. Conveying Modified Source Versions.
209
+
210
+ You may convey a work based on the Program, or the modifications to
211
+ produce it from the Program, in the form of source code under the
212
+ terms of section 4, provided that you also meet all of these conditions:
213
+
214
+ a) The work must carry prominent notices stating that you modified
215
+ it, and giving a relevant date.
216
+
217
+ b) The work must carry prominent notices stating that it is
218
+ released under this License and any conditions added under section
219
+ 7. This requirement modifies the requirement in section 4 to
220
+ "keep intact all notices".
221
+
222
+ c) You must license the entire work, as a whole, under this
223
+ License to anyone who comes into possession of a copy. This
224
+ License will therefore apply, along with any applicable section 7
225
+ additional terms, to the whole of the work, and all its parts,
226
+ regardless of how they are packaged. This License gives no
227
+ permission to license the work in any other way, but it does not
228
+ invalidate such permission if you have separately received it.
229
+
230
+ d) If the work has interactive user interfaces, each must display
231
+ Appropriate Legal Notices; however, if the Program has interactive
232
+ interfaces that do not display Appropriate Legal Notices, your
233
+ work need not make them do so.
234
+
235
+ A compilation of a covered work with other separate and independent
236
+ works, which are not by their nature extensions of the covered work,
237
+ and which are not combined with it such as to form a larger program,
238
+ in or on a volume of a storage or distribution medium, is called an
239
+ "aggregate" if the compilation and its resulting copyright are not
240
+ used to limit the access or legal rights of the compilation's users
241
+ beyond what the individual works permit. Inclusion of a covered work
242
+ in an aggregate does not cause this License to apply to the other
243
+ parts of the aggregate.
244
+
245
+ 6. Conveying Non-Source Forms.
246
+
247
+ You may convey a covered work in object code form under the terms
248
+ of sections 4 and 5, provided that you also convey the
249
+ machine-readable Corresponding Source under the terms of this License,
250
+ in one of these ways:
251
+
252
+ a) Convey the object code in, or embodied in, a physical product
253
+ (including a physical distribution medium), accompanied by the
254
+ Corresponding Source fixed on a durable physical medium
255
+ customarily used for software interchange.
256
+
257
+ b) Convey the object code in, or embodied in, a physical product
258
+ (including a physical distribution medium), accompanied by a
259
+ written offer, valid for at least three years and valid for as
260
+ long as you offer spare parts or customer support for that product
261
+ model, to give anyone who possesses the object code either (1) a
262
+ copy of the Corresponding Source for all the software in the
263
+ product that is covered by this License, on a durable physical
264
+ medium customarily used for software interchange, for a price no
265
+ more than your reasonable cost of physically performing this
266
+ conveying of source, or (2) access to copy the
267
+ Corresponding Source from a network server at no charge.
268
+
269
+ c) Convey individual copies of the object code with a copy of the
270
+ written offer to provide the Corresponding Source. This
271
+ alternative is allowed only occasionally and noncommercially, and
272
+ only if you received the object code with such an offer, in accord
273
+ with subsection 6b.
274
+
275
+ d) Convey the object code by offering access from a designated
276
+ place (gratis or for a charge), and offer equivalent access to the
277
+ Corresponding Source in the same way through the same place at no
278
+ further charge. You need not require recipients to copy the
279
+ Corresponding Source along with the object code. If the place to
280
+ copy the object code is a network server, the Corresponding Source
281
+ may be on a different server (operated by you or a third party)
282
+ that supports equivalent copying facilities, provided you maintain
283
+ clear directions next to the object code saying where to find the
284
+ Corresponding Source. Regardless of what server hosts the
285
+ Corresponding Source, you remain obligated to ensure that it is
286
+ available for as long as needed to satisfy these requirements.
287
+
288
+ e) Convey the object code using peer-to-peer transmission, provided
289
+ you inform other peers where the object code and Corresponding
290
+ Source of the work are being offered to the general public at no
291
+ charge under subsection 6d.
292
+
293
+ A separable portion of the object code, whose source code is excluded
294
+ from the Corresponding Source as a System Library, need not be
295
+ included in conveying the object code work.
296
+
297
+ A "User Product" is either (1) a "consumer product", which means any
298
+ tangible personal property which is normally used for personal, family,
299
+ or household purposes, or (2) anything designed or sold for incorporation
300
+ into a dwelling. In determining whether a product is a consumer product,
301
+ doubtful cases shall be resolved in favor of coverage. For a particular
302
+ product received by a particular user, "normally used" refers to a
303
+ typical or common use of that class of product, regardless of the status
304
+ of the particular user or of the way in which the particular user
305
+ actually uses, or expects or is expected to use, the product. A product
306
+ is a consumer product regardless of whether the product has substantial
307
+ commercial, industrial or non-consumer uses, unless such uses represent
308
+ the only significant mode of use of the product.
309
+
310
+ "Installation Information" for a User Product means any methods,
311
+ procedures, authorization keys, or other information required to install
312
+ and execute modified versions of a covered work in that User Product from
313
+ a modified version of its Corresponding Source. The information must
314
+ suffice to ensure that the continued functioning of the modified object
315
+ code is in no case prevented or interfered with solely because
316
+ modification has been made.
317
+
318
+ If you convey an object code work under this section in, or with, or
319
+ specifically for use in, a User Product, and the conveying occurs as
320
+ part of a transaction in which the right of possession and use of the
321
+ User Product is transferred to the recipient in perpetuity or for a
322
+ fixed term (regardless of how the transaction is characterized), the
323
+ Corresponding Source conveyed under this section must be accompanied
324
+ by the Installation Information. But this requirement does not apply
325
+ if neither you nor any third party retains the ability to install
326
+ modified object code on the User Product (for example, the work has
327
+ been installed in ROM).
328
+
329
+ The requirement to provide Installation Information does not include a
330
+ requirement to continue to provide support service, warranty, or updates
331
+ for a work that has been modified or installed by the recipient, or for
332
+ the User Product in which it has been modified or installed. Access to a
333
+ network may be denied when the modification itself materially and
334
+ adversely affects the operation of the network or violates the rules and
335
+ protocols for communication across the network.
336
+
337
+ Corresponding Source conveyed, and Installation Information provided,
338
+ in accord with this section must be in a format that is publicly
339
+ documented (and with an implementation available to the public in
340
+ source code form), and must require no special password or key for
341
+ unpacking, reading or copying.
342
+
343
+ 7. Additional Terms.
344
+
345
+ "Additional permissions" are terms that supplement the terms of this
346
+ License by making exceptions from one or more of its conditions.
347
+ Additional permissions that are applicable to the entire Program shall
348
+ be treated as though they were included in this License, to the extent
349
+ that they are valid under applicable law. If additional permissions
350
+ apply only to part of the Program, that part may be used separately
351
+ under those permissions, but the entire Program remains governed by
352
+ this License without regard to the additional permissions.
353
+
354
+ When you convey a copy of a covered work, you may at your option
355
+ remove any additional permissions from that copy, or from any part of
356
+ it. (Additional permissions may be written to require their own
357
+ removal in certain cases when you modify the work.) You may place
358
+ additional permissions on material, added by you to a covered work,
359
+ for which you have or can give appropriate copyright permission.
360
+
361
+ Notwithstanding any other provision of this License, for material you
362
+ add to a covered work, you may (if authorized by the copyright holders of
363
+ that material) supplement the terms of this License with terms:
364
+
365
+ a) Disclaiming warranty or limiting liability differently from the
366
+ terms of sections 15 and 16 of this License; or
367
+
368
+ b) Requiring preservation of specified reasonable legal notices or
369
+ author attributions in that material or in the Appropriate Legal
370
+ Notices displayed by works containing it; or
371
+
372
+ c) Prohibiting misrepresentation of the origin of that material, or
373
+ requiring that modified versions of such material be marked in
374
+ reasonable ways as different from the original version; or
375
+
376
+ d) Limiting the use for publicity purposes of names of licensors or
377
+ authors of the material; or
378
+
379
+ e) Declining to grant rights under trademark law for use of some
380
+ trade names, trademarks, or service marks; or
381
+
382
+ f) Requiring indemnification of licensors and authors of that
383
+ material by anyone who conveys the material (or modified versions of
384
+ it) with contractual assumptions of liability to the recipient, for
385
+ any liability that these contractual assumptions directly impose on
386
+ those licensors and authors.
387
+
388
+ All other non-permissive additional terms are considered "further
389
+ restrictions" within the meaning of section 10. If the Program as you
390
+ received it, or any part of it, contains a notice stating that it is
391
+ governed by this License along with a term that is a further
392
+ restriction, you may remove that term. If a license document contains
393
+ a further restriction but permits relicensing or conveying under this
394
+ License, you may add to a covered work material governed by the terms
395
+ of that license document, provided that the further restriction does
396
+ not survive such relicensing or conveying.
397
+
398
+ If you add terms to a covered work in accord with this section, you
399
+ must place, in the relevant source files, a statement of the
400
+ additional terms that apply to those files, or a notice indicating
401
+ where to find the applicable terms.
402
+
403
+ Additional terms, permissive or non-permissive, may be stated in the
404
+ form of a separately written license, or stated as exceptions;
405
+ the above requirements apply either way.
406
+
407
+ 8. Termination.
408
+
409
+ You may not propagate or modify a covered work except as expressly
410
+ provided under this License. Any attempt otherwise to propagate or
411
+ modify it is void, and will automatically terminate your rights under
412
+ this License (including any patent licenses granted under the third
413
+ paragraph of section 11).
414
+
415
+ However, if you cease all violation of this License, then your
416
+ license from a particular copyright holder is reinstated (a)
417
+ provisionally, unless and until the copyright holder explicitly and
418
+ finally terminates your license, and (b) permanently, if the copyright
419
+ holder fails to notify you of the violation by some reasonable means
420
+ prior to 60 days after the cessation.
421
+
422
+ Moreover, your license from a particular copyright holder is
423
+ reinstated permanently if the copyright holder notifies you of the
424
+ violation by some reasonable means, this is the first time you have
425
+ received notice of violation of this License (for any work) from that
426
+ copyright holder, and you cure the violation prior to 30 days after
427
+ your receipt of the notice.
428
+
429
+ Termination of your rights under this section does not terminate the
430
+ licenses of parties who have received copies or rights from you under
431
+ this License. If your rights have been terminated and not permanently
432
+ reinstated, you do not qualify to receive new licenses for the same
433
+ material under section 10.
434
+
435
+ 9. Acceptance Not Required for Having Copies.
436
+
437
+ You are not required to accept this License in order to receive or
438
+ run a copy of the Program. Ancillary propagation of a covered work
439
+ occurring solely as a consequence of using peer-to-peer transmission
440
+ to receive a copy likewise does not require acceptance. However,
441
+ nothing other than this License grants you permission to propagate or
442
+ modify any covered work. These actions infringe copyright if you do
443
+ not accept this License. Therefore, by modifying or propagating a
444
+ covered work, you indicate your acceptance of this License to do so.
445
+
446
+ 10. Automatic Licensing of Downstream Recipients.
447
+
448
+ Each time you convey a covered work, the recipient automatically
449
+ receives a license from the original licensors, to run, modify and
450
+ propagate that work, subject to this License. You are not responsible
451
+ for enforcing compliance by third parties with this License.
452
+
453
+ An "entity transaction" is a transaction transferring control of an
454
+ organization, or substantially all assets of one, or subdividing an
455
+ organization, or merging organizations. If propagation of a covered
456
+ work results from an entity transaction, each party to that
457
+ transaction who receives a copy of the work also receives whatever
458
+ licenses to the work the party's predecessor in interest had or could
459
+ give under the previous paragraph, plus a right to possession of the
460
+ Corresponding Source of the work from the predecessor in interest, if
461
+ the predecessor has it or can get it with reasonable efforts.
462
+
463
+ You may not impose any further restrictions on the exercise of the
464
+ rights granted or affirmed under this License. For example, you may
465
+ not impose a license fee, royalty, or other charge for exercise of
466
+ rights granted under this License, and you may not initiate litigation
467
+ (including a cross-claim or counterclaim in a lawsuit) alleging that
468
+ any patent claim is infringed by making, using, selling, offering for
469
+ sale, or importing the Program or any portion of it.
470
+
471
+ 11. Patents.
472
+
473
+ A "contributor" is a copyright holder who authorizes use under this
474
+ License of the Program or a work on which the Program is based. The
475
+ work thus licensed is called the contributor's "contributor version".
476
+
477
+ A contributor's "essential patent claims" are all patent claims
478
+ owned or controlled by the contributor, whether already acquired or
479
+ hereafter acquired, that would be infringed by some manner, permitted
480
+ by this License, of making, using, or selling its contributor version,
481
+ but do not include claims that would be infringed only as a
482
+ consequence of further modification of the contributor version. For
483
+ purposes of this definition, "control" includes the right to grant
484
+ patent sublicenses in a manner consistent with the requirements of
485
+ this License.
486
+
487
+ Each contributor grants you a non-exclusive, worldwide, royalty-free
488
+ patent license under the contributor's essential patent claims, to
489
+ make, use, sell, offer for sale, import and otherwise run, modify and
490
+ propagate the contents of its contributor version.
491
+
492
+ In the following three paragraphs, a "patent license" is any express
493
+ agreement or commitment, however denominated, not to enforce a patent
494
+ (such as an express permission to practice a patent or covenant not to
495
+ sue for patent infringement). To "grant" such a patent license to a
496
+ party means to make such an agreement or commitment not to enforce a
497
+ patent against the party.
498
+
499
+ If you convey a covered work, knowingly relying on a patent license,
500
+ and the Corresponding Source of the work is not available for anyone
501
+ to copy, free of charge and under the terms of this License, through a
502
+ publicly available network server or other readily accessible means,
503
+ then you must either (1) cause the Corresponding Source to be so
504
+ available, or (2) arrange to deprive yourself of the benefit of the
505
+ patent license for this particular work, or (3) arrange, in a manner
506
+ consistent with the requirements of this License, to extend the patent
507
+ license to downstream recipients. "Knowingly relying" means you have
508
+ actual knowledge that, but for the patent license, your conveying the
509
+ covered work in a country, or your recipient's use of the covered work
510
+ in a country, would infringe one or more identifiable patents in that
511
+ country that you have reason to believe are valid.
512
+
513
+ If, pursuant to or in connection with a single transaction or
514
+ arrangement, you convey, or propagate by procuring conveyance of, a
515
+ covered work, and grant a patent license to some of the parties
516
+ receiving the covered work authorizing them to use, propagate, modify
517
+ or convey a specific copy of the covered work, then the patent license
518
+ you grant is automatically extended to all recipients of the covered
519
+ work and works based on it.
520
+
521
+ A patent license is "discriminatory" if it does not include within
522
+ the scope of its coverage, prohibits the exercise of, or is
523
+ conditioned on the non-exercise of one or more of the rights that are
524
+ specifically granted under this License. You may not convey a covered
525
+ work if you are a party to an arrangement with a third party that is
526
+ in the business of distributing software, under which you make payment
527
+ to the third party based on the extent of your activity of conveying
528
+ the work, and under which the third party grants, to any of the
529
+ parties who would receive the covered work from you, a discriminatory
530
+ patent license (a) in connection with copies of the covered work
531
+ conveyed by you (or copies made from those copies), or (b) primarily
532
+ for and in connection with specific products or compilations that
533
+ contain the covered work, unless you entered into that arrangement,
534
+ or that patent license was granted, prior to 28 March 2007.
535
+
536
+ Nothing in this License shall be construed as excluding or limiting
537
+ any implied license or other defenses to infringement that may
538
+ otherwise be available to you under applicable patent law.
539
+
540
+ 12. No Surrender of Others' Freedom.
541
+
542
+ If conditions are imposed on you (whether by court order, agreement or
543
+ otherwise) that contradict the conditions of this License, they do not
544
+ excuse you from the conditions of this License. If you cannot convey a
545
+ covered work so as to satisfy simultaneously your obligations under this
546
+ License and any other pertinent obligations, then as a consequence you may
547
+ not convey it at all. For example, if you agree to terms that obligate you
548
+ to collect a royalty for further conveying from those to whom you convey
549
+ the Program, the only way you could satisfy both those terms and this
550
+ License would be to refrain entirely from conveying the Program.
551
+
552
+ 13. Use with the GNU Affero General Public License.
553
+
554
+ Notwithstanding any other provision of this License, you have
555
+ permission to link or combine any covered work with a work licensed
556
+ under version 3 of the GNU Affero General Public License into a single
557
+ combined work, and to convey the resulting work. The terms of this
558
+ License will continue to apply to the part which is the covered work,
559
+ but the special requirements of the GNU Affero General Public License,
560
+ section 13, concerning interaction through a network will apply to the
561
+ combination as such.
562
+
563
+ 14. Revised Versions of this License.
564
+
565
+ The Free Software Foundation may publish revised and/or new versions of
566
+ the GNU General Public License from time to time. Such new versions will
567
+ be similar in spirit to the present version, but may differ in detail to
568
+ address new problems or concerns.
569
+
570
+ Each version is given a distinguishing version number. If the
571
+ Program specifies that a certain numbered version of the GNU General
572
+ Public License "or any later version" applies to it, you have the
573
+ option of following the terms and conditions either of that numbered
574
+ version or of any later version published by the Free Software
575
+ Foundation. If the Program does not specify a version number of the
576
+ GNU General Public License, you may choose any version ever published
577
+ by the Free Software Foundation.
578
+
579
+ If the Program specifies that a proxy can decide which future
580
+ versions of the GNU General Public License can be used, that proxy's
581
+ public statement of acceptance of a version permanently authorizes you
582
+ to choose that version for the Program.
583
+
584
+ Later license versions may give you additional or different
585
+ permissions. However, no additional obligations are imposed on any
586
+ author or copyright holder as a result of your choosing to follow a
587
+ later version.
588
+
589
+ 15. Disclaimer of Warranty.
590
+
591
+ THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
592
+ APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
593
+ HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
594
+ OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
595
+ THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
596
+ PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
597
+ IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
598
+ ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
599
+
600
+ 16. Limitation of Liability.
601
+
602
+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
603
+ WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
604
+ THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
605
+ GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
606
+ USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
607
+ DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
608
+ PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
609
+ EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
610
+ SUCH DAMAGES.
611
+
612
+ 17. Interpretation of Sections 15 and 16.
613
+
614
+ If the disclaimer of warranty and limitation of liability provided
615
+ above cannot be given local legal effect according to their terms,
616
+ reviewing courts shall apply local law that most closely approximates
617
+ an absolute waiver of all civil liability in connection with the
618
+ Program, unless a warranty or assumption of liability accompanies a
619
+ copy of the Program in return for a fee.
620
+
621
+ END OF TERMS AND CONDITIONS
622
+
623
+ How to Apply These Terms to Your New Programs
624
+
625
+ If you develop a new program, and you want it to be of the greatest
626
+ possible use to the public, the best way to achieve this is to make it
627
+ free software which everyone can redistribute and change under these terms.
628
+
629
+ To do so, attach the following notices to the program. It is safest
630
+ to attach them to the start of each source file to most effectively
631
+ state the exclusion of warranty; and each file should have at least
632
+ the "copyright" line and a pointer to where the full notice is found.
633
+
634
+ <one line to give the program's name and a brief idea of what it does.>
635
+ Copyright (C) <year> <name of author>
636
+
637
+ This program is free software: you can redistribute it and/or modify
638
+ it under the terms of the GNU General Public License as published by
639
+ the Free Software Foundation, either version 3 of the License, or
640
+ (at your option) any later version.
641
+
642
+ This program is distributed in the hope that it will be useful,
643
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
644
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
645
+ GNU General Public License for more details.
646
+
647
+ You should have received a copy of the GNU General Public License
648
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
649
+
650
+ Also add information on how to contact you by electronic and paper mail.
651
+
652
+ If the program does terminal interaction, make it output a short
653
+ notice like this when it starts in an interactive mode:
654
+
655
+ <program> Copyright (C) <year> <name of author>
656
+ This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
657
+ This is free software, and you are welcome to redistribute it
658
+ under certain conditions; type `show c' for details.
659
+
660
+ The hypothetical commands `show w' and `show c' should show the appropriate
661
+ parts of the General Public License. Of course, your program's commands
662
+ might be different; for a GUI interface, you would use an "about box".
663
+
664
+ You should also get your employer (if you work as a programmer) or school,
665
+ if any, to sign a "copyright disclaimer" for the program, if necessary.
666
+ For more information on this, and how to apply and follow the GNU GPL, see
667
+ <http://www.gnu.org/licenses/>.
668
+
669
+ The GNU General Public License does not permit incorporating your program
670
+ into proprietary programs. If your program is a subroutine library, you
671
+ may consider it more useful to permit linking proprietary applications with
672
+ the library. If this is what you want to do, use the GNU Lesser General
673
+ Public License instead of this License. But first, please read
674
+ <http://www.gnu.org/philosophy/why-not-lgpl.html>.
Planogram_compliance_inference.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
Procfile ADDED
@@ -0,0 +1 @@
1
+ web: sh setup.sh && streamlit run app.py
README.md CHANGED
@@ -1,46 +1,159 @@
1
- ---
2
- title: Planogram Compliance
3
- emoji: 👁
4
- colorFrom: gray
5
- colorTo: pink
6
- sdk: streamlit
7
- app_file: app.py
8
- pinned: false
9
- license: apache-2.0
10
- ---
11
 
12
- # Configuration
13
 
14
- `title`: _string_
15
- Display title for the Space
16
 
17
- `emoji`: _string_
18
19
 
20
- `colorFrom`: _string_
21
- Color for Thumbnail gradient (red, yellow, green, blue, indigo, purple, pink, gray)
22
 
23
- `colorTo`: _string_
24
- Color for Thumbnail gradient (red, yellow, green, blue, indigo, purple, pink, gray)
25
 
26
- `sdk`: _string_
27
- Can be either `gradio`, `streamlit`, or `static`
28
 
29
- `sdk_version` : _string_
30
- Only applicable for `streamlit` SDK.
31
- See [doc](https://hf.co/docs/hub/spaces) for more info on supported versions.
32
 
33
- `app_file`: _string_
34
- Path to your main application file (which contains either `gradio` or `streamlit` Python code, or `static` html code).
35
- Path is relative to the root of the repository.
36
 
37
- `models`: _List[string]_
38
- HF model IDs (like "gpt2" or "deepset/roberta-base-squad2") used in the Space.
39
- Will be parsed automatically from your code if not specified here.
 
40
 
41
- `datasets`: _List[string]_
42
- HF dataset IDs (like "common_voice" or "oscar-corpus/OSCAR-2109") used in the Space.
43
- Will be parsed automatically from your code if not specified here.
 
 
44
 
45
- `pinned`: _boolean_
46
- Whether the Space stays on top of your list.
1
+ ## <div align="center">Planogram Scoring</div>
2
+ <p>
 
 
 
 
 
 
 
 
3
 
4
+ </p>
5
+ - Train a YOLO model on the products available in our database so they can be detected on a shelf
6
+ - Training run: https://wandb.ai/abhilash001vj/YOLOv5/runs/1v6yh7nk?workspace=user-abhilash001vj
7
+ - Capture the master planogram as a matrix of products encoded as numbers (label encoding by looking up each product name in a list of all available product names)
8
+ - Detect the products on real images taken in stores.
9
+ - Arrange the detected products in the captured photograph into rows and columns
10
+ - Compare the product arrangement in the captured photograph with the existing master planogram and produce a compliance score for correctly placed products (a minimal scoring sketch follows below)
11
 
12
+ </div>
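A minimal sketch of the last two steps is shown below. It assumes detections are available as `(x_center, y_center, box_height, class_id)` tuples and the master planogram as a label-encoded matrix; the function names, the row-grouping tolerance and the sample data are illustrative, not this app's actual implementation.

```python
# Illustrative sketch: arrange detections into shelf rows, then compare with the master planogram.
def detections_to_grid(detections, row_tol=0.5):
    """detections: list of (x_center, y_center, box_height, class_id) from the detector."""
    dets = sorted(detections, key=lambda d: d[1])           # top-to-bottom by y
    rows, current = [], [dets[0]]
    for d in dets[1:]:
        # start a new shelf row when the vertical gap exceeds a fraction of the box height
        if d[1] - current[-1][1] > row_tol * d[2]:
            rows.append(current)
            current = [d]
        else:
            current.append(d)
    rows.append(current)
    # left-to-right within each row, keeping only the label-encoded class ids
    return [[d[3] for d in sorted(row, key=lambda d: d[0])] for row in rows]

def compliance_score(master, detected):
    """Fraction of master-planogram slots holding the correct product."""
    correct = total = 0
    for m_row, d_row in zip(master, detected):
        for i, m in enumerate(m_row):
            total += 1
            if i < len(d_row) and d_row[i] == m:
                correct += 1
    return correct / max(total, 1)

# Toy example: two shelf rows, one misplaced product -> score 5/6
master = [[0, 0, 1], [2, 2, 2]]
detections = [(10, 5, 4, 0), (20, 5, 4, 0), (30, 5, 4, 1),
              (10, 20, 4, 2), (20, 20, 4, 2), (30, 20, 4, 1)]
print(compliance_score(master, detections_to_grid(detections)))  # ~0.83
```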
 
13
 
14
+ ## <div align="center">YOLOv5</div>
15
+ <p>
16
+ YOLOv5 🚀 is a family of object detection architectures and models pretrained on the COCO dataset, and represents <a href="https://ultralytics.com">Ultralytics</a>
17
+ open-source research into future vision AI methods, incorporating lessons learned and best practices evolved over thousands of hours of research and development.
18
+ </p>
19
 
20
+ </div>
 
21
 
22
+ ## <div align="center">Documentation</div>
 
23
 
24
+ See the [YOLOv5 Docs](https://docs.ultralytics.com) for full documentation on training, testing and deployment.
 
25
 
26
+ ## <div align="center">Quick Start Examples</div>
 
 
27
 
28
+ <details open>
29
+ <summary>Install</summary>
 
30
 
31
+ [**Python>=3.6.0**](https://www.python.org/) is required with all
32
+ [requirements.txt](https://github.com/ultralytics/yolov5/blob/master/requirements.txt) installed including
33
+ [**PyTorch>=1.7**](https://pytorch.org/get-started/locally/):
34
+ <!-- $ sudo apt update && apt install -y libgl1-mesa-glx libsm6 libxext6 libxrender-dev -->
35
 
36
+ ```bash
37
+ $ git clone https://github.com/ultralytics/yolov5
38
+ $ cd yolov5
39
+ $ pip install -r requirements.txt
40
+ ```
41
 
42
+ </details>
43
+
44
+ <details open>
45
+ <summary>Inference</summary>
46
+
47
+ Inference with YOLOv5 and [PyTorch Hub](https://github.com/ultralytics/yolov5/issues/36). Models automatically download
48
+ from the [latest YOLOv5 release](https://github.com/ultralytics/yolov5/releases).
49
+
50
+ ```python
51
+ import torch
52
+
53
+ # Model
54
+ model = torch.hub.load('ultralytics/yolov5', 'yolov5s') # or yolov5m, yolov5l, yolov5x, custom
55
+
56
+ # Images
57
+ img = 'https://ultralytics.com/images/zidane.jpg' # or file, Path, PIL, OpenCV, numpy, list
58
+
59
+ # Inference
60
+ results = model(img)
61
+
62
+ # Results
63
+ results.print() # or .show(), .save(), .crop(), .pandas(), etc.
64
+ ```
65
+
66
+ </details>
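For a custom-trained checkpoint such as the planogram product detector, the same PyTorch Hub entry point can load local weights via the `custom` model name. The sketch below is illustrative rather than the exact call this Space uses; the checkpoint and image paths are placeholders.

```python
import torch

# Load locally trained weights through the PyTorch Hub 'custom' entry point
# ('weights/best.pt' and 'shelf_photo.jpg' are placeholder paths).
model = torch.hub.load('ultralytics/yolov5', 'custom', 'weights/best.pt')

results = model('shelf_photo.jpg')  # run detection on a store-shelf image
results.print()
```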
67
+
68
+
69
+ ## <div align="center">Why YOLOv5</div>
70
+
71
+ <p align="center"><img width="800" src="https://user-images.githubusercontent.com/26833433/114313216-f0a5e100-9af5-11eb-8445-c682b60da2e3.png"></p>
72
+ <details>
73
+ <summary>YOLOv5-P5 640 Figure (click to expand)</summary>
74
+
75
+ <p align="center"><img width="800" src="https://user-images.githubusercontent.com/26833433/114313219-f1d70e00-9af5-11eb-9973-52b1f98d321a.png"></p>
76
+ </details>
77
+ <details>
78
+ <summary>Figure Notes (click to expand)</summary>
79
+
80
+ * GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size
81
+ 32, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS.
82
+ * EfficientDet data from [google/automl](https://github.com/google/automl) at batch size 8.
83
+ * **Reproduce** by
84
+ `python val.py --task study --data coco.yaml --iou 0.7 --weights yolov5s6.pt yolov5m6.pt yolov5l6.pt yolov5x6.pt`
85
+
86
+ </details>
87
+
88
+ ### Pretrained Checkpoints
89
+
90
+ [assets]: https://github.com/ultralytics/yolov5/releases
91
+
92
+ |Model |size<br><sup>(pixels) |mAP<sup>val<br>0.5:0.95 |mAP<sup>test<br>0.5:0.95 |mAP<sup>val<br>0.5 |Speed<br><sup>V100 (ms) | |params<br><sup>(M) |FLOPs<br><sup>640 (B)
93
+ |--- |--- |--- |--- |--- |--- |---|--- |---
94
+ |[YOLOv5s][assets] |640 |36.7 |36.7 |55.4 |**2.0** | |7.3 |17.0
95
+ |[YOLOv5m][assets] |640 |44.5 |44.5 |63.1 |2.7 | |21.4 |51.3
96
+ |[YOLOv5l][assets] |640 |48.2 |48.2 |66.9 |3.8 | |47.0 |115.4
97
+ |[YOLOv5x][assets] |640 |**50.4** |**50.4** |**68.8** |6.1 | |87.7 |218.8
98
+ | | | | | | | | |
99
+ |[YOLOv5s6][assets] |1280 |43.3 |43.3 |61.9 |**4.3** | |12.7 |17.4
100
+ |[YOLOv5m6][assets] |1280 |50.5 |50.5 |68.7 |8.4 | |35.9 |52.4
101
+ |[YOLOv5l6][assets] |1280 |53.4 |53.4 |71.1 |12.3 | |77.2 |117.7
102
+ |[YOLOv5x6][assets] |1280 |**54.4** |**54.4** |**72.0** |22.4 | |141.8 |222.9
103
+ | | | | | | | | |
104
+ |[YOLOv5x6][assets] TTA |1280 |**55.0** |**55.0** |**72.0** |70.8 | |- |-
105
+
106
+ <details>
107
+ <summary>Table Notes (click to expand)</summary>
108
+
109
+ * AP<sup>test</sup> denotes COCO [test-dev2017](http://cocodataset.org/#upload) server results, all other AP results
110
+ denote val2017 accuracy.
111
+ * AP values are for single-model single-scale unless otherwise noted. **Reproduce mAP**
112
+ by `python val.py --data coco.yaml --img 640 --conf 0.001 --iou 0.65`
113
+ * Speed<sub>GPU</sub> averaged over 5000 COCO val2017 images using a
114
+ GCP [n1-standard-16](https://cloud.google.com/compute/docs/machine-types#n1_standard_machine_types) V100 instance, and
115
+ includes FP16 inference, postprocessing and NMS. **Reproduce speed**
116
+ by `python val.py --data coco.yaml --img 640 --conf 0.25 --iou 0.45 --half`
117
+ * All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).
118
+ * Test Time Augmentation ([TTA](https://github.com/ultralytics/yolov5/issues/303)) includes reflection and scale
119
+ augmentation. **Reproduce TTA** by `python val.py --data coco.yaml --img 1536 --iou 0.7 --augment`
120
+
121
+ </details>
122
+
123
+ ## <div align="center">Contribute</div>
124
+
125
+ We love your input! We want to make contributing to YOLOv5 as easy and transparent as possible. Please see
126
+ our [Contributing Guide](CONTRIBUTING.md) to get started.
127
+
128
+ ## <div align="center">Contact</div>
129
+
130
+ For issues running YOLOv5 please visit [GitHub Issues](https://github.com/ultralytics/yolov5/issues). For business or
131
+ professional support requests please visit [https://ultralytics.com/contact](https://ultralytics.com/contact).
132
+
133
+ <br>
134
+
135
+ <div align="center">
136
+ <a href="https://github.com/ultralytics">
137
+ <img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-github.png" width="3%"/>
138
+ </a>
139
+ <img width="3%" />
140
+ <a href="https://www.linkedin.com/company/ultralytics">
141
+ <img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-linkedin.png" width="3%"/>
142
+ </a>
143
+ <img width="3%" />
144
+ <a href="https://twitter.com/ultralytics">
145
+ <img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-twitter.png" width="3%"/>
146
+ </a>
147
+ <img width="3%" />
148
+ <a href="https://youtube.com/ultralytics">
149
+ <img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-youtube.png" width="3%"/>
150
+ </a>
151
+ <img width="3%" />
152
+ <a href="https://www.facebook.com/ultralytics">
153
+ <img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-facebook.png" width="3%"/>
154
+ </a>
155
+ <img width="3%" />
156
+ <a href="https://www.instagram.com/ultralytics/">
157
+ <img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-instagram.png" width="3%"/>
158
+ </a>
159
+ </div>
_requirements.txt ADDED
@@ -0,0 +1,36 @@
1
+ # pip install -r requirements.txt
2
+ streamlit
3
+ # base ----------------------------------------
4
+ # matplotlib>=3.2.2
5
+ numpy>=1.18.5
6
+ # opencv-python>=4.1.2
7
+ # http://download.pytorch.org/whl/cpu/torch-1.7.1%2Bcpu-cp39-cp39-linux_x86_64.whl
8
+ # gunicorn == 19.9.0
9
+ # torchvision==0.2.2
10
+ opencv-python-headless>=4.1.2
11
+ Pillow>=8.0.0
12
+ PyYAML>=5.3.1
13
+ scipy>=1.4.1
14
+ torch>=1.7.0
15
+ torchvision>=0.8.1
16
+ tqdm>=4.41.0
17
+
18
+ # logging -------------------------------------
19
+ # tensorboard>=2.4.1
20
+ # wandb
21
+
22
+ # plotting ------------------------------------
23
+ # seaborn>=0.11.0
24
+ pandas
25
+
26
+ # export --------------------------------------
27
+ # coremltools>=4.1
28
+ # onnx>=1.9.0
29
+ # scikit-learn==0.19.2 # for coreml quantization
30
+ # tensorflow==2.4.1 # for TFLite export
31
+
32
+ # extras --------------------------------------
33
+ # Cython # for pycocotools https://github.com/cocodataset/cocoapi/issues/172
34
+ # pycocotools>=2.0 # COCO mAP
35
+ # albumentations>=1.0.3
36
+ # thop # FLOPs computation
app.py ADDED
@@ -0,0 +1,296 @@
1
+ # https://planogram-compliance.herokuapp.com/
2
+ # https://dashboard.heroku.com/apps/planogram-compliance/deploy/heroku-git
3
+
4
+ # https://medium.com/@mohcufe/how-to-deploy-your-trained-pytorch-model-on-heroku-ff4b73085ddd\
5
+ # https://stackoverflow.com/questions/51730880/where-do-i-get-a-cpu-only-version-of-pytorch
6
+ # https://blog.jcharistech.com/2020/02/26/how-to-deploy-a-face-detection-streamlit-app-on-heroku/
7
+ # https://towardsdatascience.com/a-quick-tutorial-on-how-to-deploy-your-streamlit-app-to-heroku-
8
+ # https://www.analyticsvidhya.com/blog/2021/06/deploy-your-ml-dl-streamlit-application-on-heroku/
9
+ # https://gist.github.com/jeremyjordan/6b506257509e8ba673f145baa568a1ea
10
+
11
+ import json
12
+
13
+ # https://www.r-bloggers.com/2020/12/creating-a-streamlit-web-app-building-with-docker-github-actions-and-hosting-on-heroku/
14
+ # https://devcenter.heroku.com/articles/container-registry-and-runtime
15
+ # from yolo_inference_util import run_yolo_v5
16
+ import os
17
+ from tempfile import NamedTemporaryFile
18
+
19
+ import cv2
20
+ import numpy as np
21
+ import pandas as pd
22
+ import streamlit as st
23
+
24
+ # import matplotlib.pyplot as plt
25
+ from app_utils import annotate_planogram_compliance, bucket_sort, do_sorting, xml_to_csv
26
+ from inference import run
27
+
28
+ # from utils.plots import Annotator, colors
29
+ # from utils.general import scale_coords
30
+
31
+ app_formal_name = "Planogram Compliance"
32
+
33
+ FILE_UPLOAD_DIR = "tmp"
34
+
35
+ os.makedirs(FILE_UPLOAD_DIR, exist_ok=True)
36
+ # Start the app in wide-mode
37
+ st.set_page_config(
38
+ layout="wide",
39
+ page_title=app_formal_name,
40
+ )
41
+ # https://github.com/streamlit/streamlit/issues/1361
42
+ uploaded_file = st.file_uploader(
43
+ "Choose a planogram image to score",
44
+ type=["jpg", "JPEG", "PNG", "JPG", "jpeg"],
45
+ )
46
+ uploaded_master_planogram_file = st.file_uploader(
47
+ "Upload a master planogram", type=["jpg", "JPEG", "PNG", "JPG", "jpeg"]
48
+ )
49
+ annotation_file = st.file_uploader("Upload the master planogram annotation", type=["xml"])
50
+ temp_file = NamedTemporaryFile(delete=False)
51
+
52
+ target_names = [
53
+ "Bottle,100PLUS ACTIVE 1.5L",
54
+ "Bottle,100PLUS ACTIVE 500ML",
55
+ "Bottle,100PLUS LEMON LIME 1.5L",
56
+ "Bottle,100PLUS ORANGE 500ML",
57
+ "Bottle,100PLUS ORIGINAL 1.5L",
58
+ "Bottle,100PLUS TANGY ORANGE 1.5L",
59
+ "Bottle,100PLUS ZERO 1.5L",
60
+ "Bottle,100PLUS ZERO 500ML",
61
+ "Packet,F:M MAGNOLIA CHOC 1L",
62
+ "Bottle,F&N GINGER ADE 1.5L",
63
+ "Bottle,F&N GRAPE 1.5L",
64
+ "Bottle,F&N ICE CREAM SODA 1.5L",
65
+ "Bottle,F&N LYCHEE PEAR 1.5L",
66
+ "Bottle,F&N ORANGE 1.5L",
67
+ "Bottle,F&N PINEAPPLE PET 1.5L",
68
+ "Bottle,F&N SARSI 1.5L",
69
+ "Bottle,F&N SS ICE LEM TEA RS 500ML",
70
+ "Bottle,F&N SS ICE LEMON TEA RS 1.5L",
71
+ "Bottle,F&N SS ICE LEMON TEA 1.5L",
72
+ "Bottle,F&N SS ICE LEMON TEA 500ML",
73
+ "Bottle,F&N SS ICE PEACH TEA 1.5L",
74
+ "Bottle,SS ICE LEMON GT 1.48L",
75
+ "Bottle,SS WHITE CHRYS TEA 1.48L",
76
+ "Packet,FARMHOUSE FRESH MILK 1L FNDM",
77
+ "Packet,FARMHOUSE PLAIN LF 1L",
78
+ "Packet,PURA FRESH MILK 1L FS",
79
+ "Packet,NUTRISOY REG NO SUGAR ADDED 1L",
80
+ "Packet,NUTRISOY PLAIN 475ML",
81
+ "Packet,NUTRISOY PLAIN 1L",
82
+ "Packet,NUTRISOY OMEGA RD SUGAR 1L",
83
+ "Packet,NUTRISOY OMEGA NSA 1L",
84
+ "Packet,NUTRISOY ALMOND 1L",
85
+ "Packet,MAGNOLIA FRESH MILK 1L FNDM",
86
+ "Packet,FM MAG FC PLAIN 200ML",
87
+ "Packet,MAG OMEGA PLUS PLAIN 200ML",
88
+ "Packet,MAG KURMA MILK 500ML",
89
+ "Packet,MAG KURMA MILK 1L",
90
+ "Packet,MAG CHOCOLATE FC 500ML",
91
+ "Packet,MAG BROWN SUGAR SS MILK 1L",
92
+ "Packet,FM MAG LFHC PLN 500ML",
93
+ "Packet,FM MAG LFHC OAT 500ML",
94
+ "Packet,FM MAG LFHC OAT 1L",
95
+ "Packet,FM MAG FC PLAIN 500ML",
96
+ "Void,PARTIAL VOID",
97
+ "Void,FULL VOID",
98
+ "Bottle,F&N SS ICE LEM TEA 500ML",
99
+ ]
100
+
101
+ run_app = st.button("Run the compliance check")
102
+ if run_app and uploaded_file is not None:
103
+ # Convert the file to an opencv image.
104
+ file_bytes = np.asarray(bytearray(uploaded_file.read()), dtype=np.uint8)
105
+ temp_file.write(uploaded_file.getvalue())
106
+ uploaded_img = cv2.imdecode(file_bytes, 1)
107
+ cv2.imwrite("tmp/to_score_planogram_tmp.png", uploaded_img)
108
+
109
+ # if uploaded_master_planogram_file is None:
110
+ # master = cv2.imread('./sample_master_planogram.jpeg')
111
+
112
+ names_dict = {name: id for id, name in enumerate(target_names)}
113
+
114
+ sorted_xml_df = None
115
+ # https://discuss.streamlit.io/t/unable-to-read-files-using-standard-file-uploader/2258/2
116
+ if uploaded_master_planogram_file and annotation_file:
117
+ file_bytes = np.asarray(
118
+ bytearray(uploaded_master_planogram_file.read()), dtype=np.uint8
119
+ )
120
+ master = cv2.imdecode(file_bytes, 1)
121
+ cv2.imwrite("tmp/master_tmp.png", master)
122
+ # cv2.imwrite("tmp_uploaded_master_planogram_img.png", master)
123
+ # xml = annotation_file.read()
124
+ # tmp_xml ="tmp_xml_annotation.xml"
125
+ # with open(tmp_xml ,'w',encoding='utf-8') as f:
126
+ # xml = f.write(xml)
127
+ xml_df = xml_to_csv(annotation_file)
128
+ xml_df["cls"] = xml_df["cls"].map(names_dict)
129
+ sorted_xml_df = do_sorting(xml_df)
130
+ sorted_xml_df.line_number.value_counts()
131
+
132
+ line_data = sorted_xml_df.line_number.value_counts()
133
+ n_rows = int(len(line_data))
134
+ n_cols = int(max(line_data))
135
+ master_table = np.zeros((n_rows, n_cols)) + 101
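+ # Note: the master planogram is modelled as an (n_rows x n_cols) grid of class ids,
+ # one row per shelf line. 101 acts as a sentinel for "no product in this cell";
+ # real class ids are just indexes into target_names, so they never reach 101.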
136
+ master_annotations = []
137
+ for i, row in sorted_xml_df.groupby("line_number"):
138
+ # print(f"Adding products in the row {i} to the detected planogram", row.cls.tolist())
139
+ products = row.cls.tolist()
140
+ master_table[int(i - 1), 0 : len(products)] = products
141
+ annotations = [
142
+ (int(k), int(v))
143
+ for k, v in list(
144
+ zip(row.cls.unique(), row.cls.value_counts().tolist())
145
+ )
146
+ ]
147
+ master_annotations.append(annotations)
148
+ master_table.shape
149
+ # print("Annoatated planogram")
150
+ # print(np.matrix(master_table))
151
+
152
+ elif uploaded_master_planogram_file:
153
+ print(
154
+ "Finding the amster annotations with the YOLOv5 model predictions"
155
+ )
156
+ file_bytes = np.asarray(
157
+ bytearray(uploaded_master_planogram_file.read()), dtype=np.uint8
158
+ )
159
+ master = cv2.imdecode(file_bytes, 1)
160
+ cv2.imwrite("tmp/master_tmp.png", master)
161
+ master_results = run(
162
+ weights="base_line_best_model_exp5.pt",
163
+ source="tmp/master_tmp.png",
164
+ imgsz=[640, 640],
165
+ conf_thres=0.6,
166
+ iou_thres=0.6,
167
+ )
168
+
169
+ bb_df = pd.DataFrame(
170
+ master_results[0][1].tolist(),
171
+ columns=["xmin", "ymin", "xmax", "ymax", "conf", "cls"],
172
+ )
173
+ sorted_df = do_sorting(bb_df)
174
+
175
+ n_rows = int(sorted_df.line_number.max())
176
+ n_cols = int(
177
+ sorted_df.groupby("line_number")
178
+ .size()
179
+ .reset_index(name="counts")["counts"]
180
+ .max()
181
+ )
182
+ non_null_product = 101
183
+ print("master size", n_rows, n_cols)
184
+ master_annotations = []
185
+ master_table = np.zeros((int(n_rows), int(n_cols))) + non_null_product
186
+ for i, row in sorted_df.groupby("line_number"):
187
+ # print(f"Adding products in the row {i} to the detected planogram", row.cls.tolist())
188
+ products = row.cls.tolist()
189
+ col_len = min(len(products), n_cols)
190
+ print("col size: ", col_len)
191
+ print("row size: ", i - 1)
192
+ if n_rows <= (i - 1):
193
+ print("more rows than expected in the predictions")
194
+ break
195
+ master_table[int(i - 1), 0:col_len] = products[:col_len]
196
+ annotations = [
197
+ (int(k), int(v))
198
+ for k, v in list(
199
+ zip(row.cls.unique(), row.cls.value_counts().tolist())
200
+ )
201
+ ]
202
+ master_annotations.append(annotations)
203
+ else:
204
+ master = cv2.imread("./sample_master_planogram.jpeg")
205
+ n_rows = 3
206
+ n_cols = 16
207
+ master_table = np.zeros((n_rows, n_cols)) + 101
208
+ master_annotations = [
209
+ [(32, 12), (8, 4)],
210
+ [(36, 1), (41, 6), (50, 4), (51, 3), (52, 2)],
211
+ [(23, 5), (24, 6), (54, 5)],
212
+ ]
213
+
214
+ for i, row in enumerate(master_annotations):
215
+ idx = 0
216
+ for product, count in row:
217
+ master_table[i, idx : idx + count] = product
218
+ idx = idx + count
219
+ # Now do something with the image! For example, let's display it:
220
+ # st.image(opencv_image, channels="BGR")
221
+
222
+ # uploaded_img = '/content/drive/My Drive/0.CV/0.Planogram_Compliance/planogram_data/images/test/IMG_5718.jpg'
223
+ result_list = run(
224
+ weights="base_line_best_model_exp5.pt",
225
+ source="tmp/to_score_planogram_tmp.png",
226
+ imgsz=[640, 640],
227
+ conf_thres=0.6,
228
+ iou_thres=0.6,
229
+ )
230
+
231
+ bb_df = pd.DataFrame(
232
+ result_list[0][1].tolist(),
233
+ columns=["xmin", "ymin", "xmax", "ymax", "conf", "cls"],
234
+ )
235
+ sorted_df = do_sorting(bb_df)
236
+
237
+ non_null_product = 101
238
+ print("master size", n_rows, n_cols)
239
+ detected_table = np.zeros((n_rows, n_cols)) + non_null_product
240
+ for i, row in sorted_df.groupby("line_number"):
241
+ # print(f"Adding products in the row {i} to the detected planogram", row.cls.tolist())
242
+ products = row.cls.tolist()
243
+ col_len = min(len(products), n_cols)
244
+ print("col size: ", col_len)
245
+ print("row size: ", i - 1)
246
+ if n_rows <= (i - 1):
247
+ print("more rows than expected in the predictions")
248
+ break
249
+ detected_table[int(i - 1), 0:col_len] = products[:col_len]
250
+
251
+ # score = (master_table == detected_table).sum() / (master_table != non_null_product).sum()
252
+ correct_matches = (
253
+ np.ma.masked_equal(master_table, non_null_product) == detected_table
254
+ ).sum()
255
+ total_products = (master_table != non_null_product).sum()
256
+ score = correct_matches / total_products
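+ # Worked example of the arithmetic above (hypothetical numbers): if the master
+ # grid has 40 real product cells (value != non_null_product) and 30 of them hold
+ # the same class id in the detected grid, the compliance score is 30 / 40 = 0.75.
+ # Masking the 101 filler cells keeps empty grid positions out of both counts.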
257
+ # if sorted_xml_df is not None:
258
+ # annotate_df = sorted_xml_df[["xmin","ymin", "xmax", "ymax", "line_number","cls"]].astype(int)
259
+ # else:
260
+ annotate_df = sorted_df[
261
+ ["xmin", "ymin", "xmax", "ymax", "line_number", "cls"]
262
+ ].astype(int)
263
+
264
+ mask = master_table != non_null_product
265
+ m_detected_table = np.ma.masked_array(master_table, mask=mask)
266
+ m_annotated_table = np.ma.masked_array(detected_table, mask=mask)
267
+
268
+ # wrong_indexes = np.ravel_multi_index(master_table*mask != detected_table*mask, master_table.shape)
269
+ wrong_indexes = np.where(master_table != detected_table)
270
+ correct_indexes = np.where(master_table == detected_table)
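+ # correct_indexes / wrong_indexes are (row, col) positions in the planogram grid;
+ # annotate_planogram_compliance draws the matching detections in green and the
+ # mismatching ones in red on the uploaded image.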
271
+ annotated_planogram = annotate_planogram_compliance(
272
+ uploaded_img, annotate_df, correct_indexes, wrong_indexes, target_names
273
+ )
274
+ st.title("Target Products")
275
+ st.write(json.dumps(target_names))
276
+ st.title("The master planogram annotation")
277
+ st.write(
278
+ "The annotations are based on the index of products from Target products list "
279
+ )
280
+ st.write(json.dumps(master_annotations))
281
+
282
+ # https://github.com/streamlit/streamlit/issues/888
283
+ st.image(
284
+ [master, annotated_planogram, result_list[0][0]],
285
+ width=512,
286
+ caption=[
287
+ "Master planogram",
288
+ "Planogram Compliance",
289
+ "Planogram Predictions",
290
+ ],
291
+ channels="BGR",
292
+ )
293
+ # st.image([master, annotated_planogram], width=512, caption=["Master planogram", "Planogram Compliance"], channels="BGR")
294
+ st.title("Planogram Compiance score")
295
+ # st.write(f"{correct_matches} / {total_products}")
296
+ st.write(score)
app_test.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
app_utils.py ADDED
@@ -0,0 +1,196 @@
1
+ import glob
2
+ import json
3
+ import os
4
+ import xml.etree.ElementTree as ET
5
+
6
+ import cv2
7
+
8
+ # from sklearn.externals import joblib
9
+ import joblib
10
+ import numpy as np
11
+ import pandas as pd
12
+
13
+ # from .variables import old_ocr_req_cols
14
+ # from .skew_correction import PageSkewWraper
15
+
16
+ const_HW = 1.294117647
17
+ const_W = 600
18
+ # https://www.forbes.com/sites/forbestechcouncil/2020/06/02/leveraging-technologies-to-align-realograms-and-planograms-for-grocery/?sh=506b8b78e86c
19
+
20
+
21
+ # https://stackoverflow.com/questions/39403183/python-opencv-sorting-contours
22
+ # http://devdoc.net/linux/OpenCV-3.2.0/da/d0c/tutorial_bounding_rects_circles.html
23
+ # https://stackoverflow.com/questions/10297713/find-contour-of-the-set-of-points-in-opencv
24
+ # https://stackoverflow.com/questions/16538774/dealing-with-contours-and-bounding-rectangle-in-opencv-2-4-python-2-7
25
+ # https://stackoverflow.com/questions/50308055/creating-bounding-boxes-for-contours
26
+ # https://stackoverflow.com/questions/57296398/how-can-i-get-better-results-of-bounding-box-using-find-contours-of-opencv
27
+ # http://amroamroamro.github.io/mexopencv/opencv/generalContours_demo1.html
28
+ # https://gist.github.com/bigsnarfdude/d811e31ee17495f82f10db12651ae82d
29
+ # http://man.hubwiz.com/docset/OpenCV.docset/Contents/Resources/Documents/da/d0c/tutorial_bounding_rects_circles.html
30
+ # https://www.analyticsvidhya.com/blog/2021/05/document-layout-detection-and-ocr-with-detectron2/
31
+ # https://colab.research.google.com/drive/1m6gaQF6Q4M0IaSjoo_4jWllKJjK-i6fw?usp=sharing#scrollTo=lEyl3wYKHAe1
32
+ # https://stackoverflow.com/questions/39403183/python-opencv-sorting-contours
33
+ # https://docs.opencv.org/2.4/doc/tutorials/imgproc/shapedescriptors/bounding_rects_circles/bounding_rects_circles.html
34
+ # https://www.pyimagesearch.com/2016/03/21/ordering-coordinates-clockwise-with-python-and-opencv/
35
+
36
+
37
+ def bucket_sort(df, colmn, ymax_col="ymax", ymin_col="ymin"):
38
+ df["line_number"] = 0
39
+ colmn.append("line_number")
40
+ array_value = df[colmn].values
41
+ start_index = Line_counter = counter = 0
42
+ ymax, ymin, line_no = (
43
+ colmn.index(ymax_col),
44
+ colmn.index(ymin_col),
45
+ colmn.index("line_number"),
46
+ )
47
+ while counter < len(array_value):
48
+ current_ymax = array_value[start_index][ymax]
49
+ for next_index in range(start_index, len(array_value)):
50
+ counter += 1
51
+
52
+ next_ymin = array_value[next_index][ymin]
53
+ next_ymax = array_value[next_index][ymax]
54
+ if current_ymax > next_ymin:
55
+
56
+ array_value[next_index][line_no] = Line_counter + 1
57
+ # if current_ymax < next_ymax:
58
+
59
+ # current_ymax = next_ymax
60
+ else:
61
+ counter -= 1
62
+ break
63
+ # print(counter, len(array_value), start_index)
64
+ start_index = counter
65
+ Line_counter += 1
66
+ return pd.DataFrame(array_value, columns=colmn)
67
+
68
+
69
+ def do_sorting(df):
70
+ df.sort_values(["ymin", "xmin"], ascending=True, inplace=True)
71
+ df["idx"] = df.index
72
+ if "line_number" in df.columns:
73
+ print("line number removed")
74
+ df.drop("line_number", axis=1, inplace=True)
75
+ req_colns = ["xmin", "ymin", "xmax", "ymax", "idx"]
76
+ temp_df = df.copy()
77
+ temp = bucket_sort(temp_df.copy(), req_colns)
78
+ df = df.merge(temp[["idx", "line_number"]], on="idx")
79
+ df.sort_values(["line_number", "xmin"], ascending=True, inplace=True)
80
+ df = df.reset_index(drop=True)
81
+ df = df.reset_index(drop=True)
82
+ return df
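+ # Rough usage sketch for do_sorting (hypothetical coordinates): given
+ # df = pd.DataFrame({"xmin": [10, 200, 15], "ymin": [5, 8, 120],
+ # "xmax": [90, 280, 95], "ymax": [100, 105, 210]})
+ # the two boxes whose y-ranges overlap on the top shelf get line_number 1 and the
+ # lower box gets line_number 2, with each shelf then ordered left to right by xmin.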
83
+
84
+
85
+ def xml_to_csv(xml_file):
86
+ # https://gist.github.com/rotemtam/88d9a4efae243fc77ed4a0f9917c8f6c
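+ # Assumes a Pascal VOC style annotation file, roughly of this shape (values are
+ # illustrative only):
+ # <annotation>
+ # <filename>master_shelf.jpg</filename>
+ # <size><width>1280</width><height>720</height></size>
+ # <object>
+ # <name>Bottle,100PLUS ACTIVE 1.5L</name>
+ # <bndbox><xmin>10</xmin><ymin>5</ymin><xmax>90</xmax><ymax>100</ymax></bndbox>
+ # </object>
+ # </annotation>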
87
+ xml_list = []
88
+ # for xml_file in glob.glob(path + '/*.xml'):
89
+ # https://discuss.streamlit.io/t/unable-to-read-files-using-standard-file-uploader/2258/2
90
+ tree = ET.parse(xml_file)
91
+ root = tree.getroot()
92
+ for member in root.findall("object"):
93
+ bbx = member.find("bndbox")
94
+ xmin = int(bbx.find("xmin").text)
95
+ ymin = int(bbx.find("ymin").text)
96
+ xmax = int(bbx.find("xmax").text)
97
+ ymax = int(bbx.find("ymax").text)
98
+ label = member.find("name").text
99
+
100
+ value = (
101
+ root.find("filename").text,
102
+ int(root.find("size")[0].text),
103
+ int(root.find("size")[1].text),
104
+ label,
105
+ xmin,
106
+ ymin,
107
+ xmax,
108
+ ymax,
109
+ )
110
+ xml_list.append(value)
111
+ column_name = [
112
+ "filename",
113
+ "width",
114
+ "height",
115
+ "cls",
116
+ "xmin",
117
+ "ymin",
118
+ "xmax",
119
+ "ymax",
120
+ ]
121
+ xml_df = pd.DataFrame(xml_list, columns=column_name)
122
+ return xml_df
123
+
124
+
125
+ # def annotate_planogram_compliance(img0, sorted_xml_df, wrong_indexes, target_names):
126
+ # # annotator = Annotator(img0, line_width=3, pil=True)
127
+ # det = sorted_xml_df[['xmin', 'ymin', 'xmax', 'ymax','cls']].values
128
+ # # det[:, :4] = scale_coords((640, 640), det[:, :4], img0.shape).round()
129
+ # for i, (*xyxy, cls) in enumerate(det):
130
+
131
+ # c = int(cls) # integer class
132
+
133
+ # if i in wrong_indexes:
134
+ # # print(xyxy, "Wrong detection", (255, 0, 0))
135
+ # label = "Wrong detection"
136
+ # color = (0,0,255)
137
+ # else:
138
+ # # print(xyxy, label, (0, 255, 0))
139
+ # label = f'{target_names[c]}'
140
+ # color = (0,255, 0)
141
+ # org = (int(xyxy[0]), int(xyxy[1]) )
142
+ # top_left = org
143
+ # bottom_right = (int(xyxy[2]), int(xyxy[3]))
144
+ # # print("#"*50)
145
+ # # print(f"Anooatting cv2 rectangle with shape: { img0.shape}, top left: { top_left}, bottom right: { bottom_right} , color : { color }, thickness: {3}, cv2.LINE_8")
146
+ # # print("#"*50)
147
+ # cv2.rectangle(img0, top_left, bottom_right , color, 3, cv2.LINE_8)
148
+
149
+ # cv2.putText(img0, label, tuple(org), cv2. FONT_HERSHEY_SIMPLEX , 0.5, color)
150
+
151
+ # return img0
152
+
153
+
154
+ def annotate_planogram_compliance(
155
+ img0, sorted_df, correct_indexes, wrong_indexes, target_names
156
+ ):
157
+ # annotator = Annotator(img0, line_width=3, pil=True)
158
+ det = sorted_df[["xmin", "ymin", "xmax", "ymax", "cls"]].values
159
+ # det[:, :4] = scale_coords((640, 640), det[:, :4], img0.shape).round()
160
+ for x, y in zip(*correct_indexes):
161
+ try:
162
+ row = sorted_df[sorted_df["line_number"] == x + 1].iloc[y]
163
+ xyxy = row[["xmin", "ymin", "xmax", "ymax"]].values
164
+ label = f'{target_names[row["cls"]]}'
165
+ color = (0, 255, 0)
166
+ # org = (int(xyxy[0]), int(xyxy[1]) )
167
+ top_left = (int(row["xmin"]), int(row["ymin"]))
168
+ bottom_right = (int(row["xmax"]), int(row["ymax"]))
169
+ cv2.rectangle(img0, top_left, bottom_right, color, 3, cv2.LINE_8)
170
+
171
+ cv2.putText(
172
+ img0, label, top_left, cv2.FONT_HERSHEY_SIMPLEX, 0.5, color
173
+ )
174
+ except Exception as e:
175
+ print("Error: " + str(e))
176
+ continue
177
+
178
+ for x, y in zip(*wrong_indexes):
179
+ try:
180
+ row = sorted_df[sorted_df["line_number"] == x + 1].iloc[y]
181
+ xyxy = row[["xmin", "ymin", "xmax", "ymax"]].values
182
+ label = f'{target_names[row["cls"]]}'
183
+ color = (0, 0, 255)
184
+ # org = (int(xyxy[0]), int(xyxy[1]) )
185
+ top_left = (row["xmin"], row["ymin"])
186
+ bottom_right = (row["xmax"], row["ymax"])
187
+ cv2.rectangle(img0, top_left, bottom_right, color, 3, cv2.LINE_8)
188
+
189
+ cv2.putText(
190
+ img0, label, top_left, cv2.FONT_HERSHEY_SIMPLEX, 0.5, color
191
+ )
192
+ except Exception as e:
193
+ print("Error: " + str(e))
194
+ continue
195
+
196
+ return img0
base_line_best_model_exp5.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c259d5e97010ee1c9775d6d8c3bc8bb73f52a5ad871ca920902f35563f2acb42
3
+ size 14621601
best_sku_model.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:46627e4923a4cbb695e2f1da5944ec7e2930acb640b822227aab334bddf1548b
3
+ size 14355573
classify/predict.py ADDED
@@ -0,0 +1,345 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ """
3
+ Run YOLOv5 classification inference on images, videos, directories, globs, YouTube, webcam, streams, etc.
4
+
5
+ Usage - sources:
6
+ $ python classify/predict.py --weights yolov5s-cls.pt --source 0 # webcam
7
+ img.jpg # image
8
+ vid.mp4 # video
9
+ screen # screenshot
10
+ path/ # directory
11
+ list.txt # list of images
12
+ list.streams # list of streams
13
+ 'path/*.jpg' # glob
14
+ 'https://youtu.be/Zgi9g1ksQHc' # YouTube
15
+ 'rtsp://example.com/media.mp4' # RTSP, RTMP, HTTP stream
16
+
17
+ Usage - formats:
18
+ $ python classify/predict.py --weights yolov5s-cls.pt # PyTorch
19
+ yolov5s-cls.torchscript # TorchScript
20
+ yolov5s-cls.onnx # ONNX Runtime or OpenCV DNN with --dnn
21
+ yolov5s-cls_openvino_model # OpenVINO
22
+ yolov5s-cls.engine # TensorRT
23
+ yolov5s-cls.mlmodel # CoreML (macOS-only)
24
+ yolov5s-cls_saved_model # TensorFlow SavedModel
25
+ yolov5s-cls.pb # TensorFlow GraphDef
26
+ yolov5s-cls.tflite # TensorFlow Lite
27
+ yolov5s-cls_edgetpu.tflite # TensorFlow Edge TPU
28
+ yolov5s-cls_paddle_model # PaddlePaddle
29
+ """
30
+
31
+ import argparse
32
+ import os
33
+ import platform
34
+ import sys
35
+ from pathlib import Path
36
+
37
+ import torch
38
+ import torch.nn.functional as F
39
+
40
+ FILE = Path(__file__).resolve()
41
+ ROOT = FILE.parents[1] # YOLOv5 root directory
42
+ if str(ROOT) not in sys.path:
43
+ sys.path.append(str(ROOT)) # add ROOT to PATH
44
+ ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
45
+
46
+ from models.common import DetectMultiBackend
47
+ from utils.augmentations import classify_transforms
48
+ from utils.dataloaders import (
49
+ IMG_FORMATS,
50
+ VID_FORMATS,
51
+ LoadImages,
52
+ LoadScreenshots,
53
+ LoadStreams,
54
+ )
55
+ from utils.general import (
56
+ LOGGER,
57
+ Profile,
58
+ check_file,
59
+ check_img_size,
60
+ check_imshow,
61
+ check_requirements,
62
+ colorstr,
63
+ cv2,
64
+ increment_path,
65
+ print_args,
66
+ strip_optimizer,
67
+ )
68
+ from utils.plots import Annotator
69
+ from utils.torch_utils import select_device, smart_inference_mode
70
+
71
+
72
+ @smart_inference_mode()
73
+ def run(
74
+ weights=ROOT / "yolov5s-cls.pt", # model.pt path(s)
75
+ source=ROOT / "data/images", # file/dir/URL/glob/screen/0(webcam)
76
+ data=ROOT / "data/coco128.yaml", # dataset.yaml path
77
+ imgsz=(224, 224), # inference size (height, width)
78
+ device="", # cuda device, i.e. 0 or 0,1,2,3 or cpu
79
+ view_img=False, # show results
80
+ save_txt=False, # save results to *.txt
81
+ nosave=False, # do not save images/videos
82
+ augment=False, # augmented inference
83
+ visualize=False, # visualize features
84
+ update=False, # update all models
85
+ project=ROOT / "runs/predict-cls", # save results to project/name
86
+ name="exp", # save results to project/name
87
+ exist_ok=False, # existing project/name ok, do not increment
88
+ half=False, # use FP16 half-precision inference
89
+ dnn=False, # use OpenCV DNN for ONNX inference
90
+ vid_stride=1, # video frame-rate stride
91
+ ):
92
+ source = str(source)
93
+ save_img = not nosave and not source.endswith(
94
+ ".txt"
95
+ ) # save inference images
96
+ is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
97
+ is_url = source.lower().startswith(
98
+ ("rtsp://", "rtmp://", "http://", "https://")
99
+ )
100
+ webcam = (
101
+ source.isnumeric()
102
+ or source.endswith(".streams")
103
+ or (is_url and not is_file)
104
+ )
105
+ screenshot = source.lower().startswith("screen")
106
+ if is_url and is_file:
107
+ source = check_file(source) # download
108
+
109
+ # Directories
110
+ save_dir = increment_path(
111
+ Path(project) / name, exist_ok=exist_ok
112
+ ) # increment run
113
+ (save_dir / "labels" if save_txt else save_dir).mkdir(
114
+ parents=True, exist_ok=True
115
+ ) # make dir
116
+
117
+ # Load model
118
+ device = select_device(device)
119
+ model = DetectMultiBackend(
120
+ weights, device=device, dnn=dnn, data=data, fp16=half
121
+ )
122
+ stride, names, pt = model.stride, model.names, model.pt
123
+ imgsz = check_img_size(imgsz, s=stride) # check image size
124
+
125
+ # Dataloader
126
+ bs = 1 # batch_size
127
+ if webcam:
128
+ view_img = check_imshow(warn=True)
129
+ dataset = LoadStreams(
130
+ source,
131
+ img_size=imgsz,
132
+ transforms=classify_transforms(imgsz[0]),
133
+ vid_stride=vid_stride,
134
+ )
135
+ bs = len(dataset)
136
+ elif screenshot:
137
+ dataset = LoadScreenshots(
138
+ source, img_size=imgsz, stride=stride, auto=pt
139
+ )
140
+ else:
141
+ dataset = LoadImages(
142
+ source,
143
+ img_size=imgsz,
144
+ transforms=classify_transforms(imgsz[0]),
145
+ vid_stride=vid_stride,
146
+ )
147
+ vid_path, vid_writer = [None] * bs, [None] * bs
148
+
149
+ # Run inference
150
+ model.warmup(imgsz=(1 if pt else bs, 3, *imgsz)) # warmup
151
+ seen, windows, dt = 0, [], (Profile(), Profile(), Profile())
152
+ for path, im, im0s, vid_cap, s in dataset:
153
+ with dt[0]:
154
+ im = torch.Tensor(im).to(model.device)
155
+ im = im.half() if model.fp16 else im.float() # uint8 to fp16/32
156
+ if len(im.shape) == 3:
157
+ im = im[None] # expand for batch dim
158
+
159
+ # Inference
160
+ with dt[1]:
161
+ results = model(im)
162
+
163
+ # Post-process
164
+ with dt[2]:
165
+ pred = F.softmax(results, dim=1) # probabilities
166
+
167
+ # Process predictions
168
+ for i, prob in enumerate(pred): # per image
169
+ seen += 1
170
+ if webcam: # batch_size >= 1
171
+ p, im0, frame = path[i], im0s[i].copy(), dataset.count
172
+ s += f"{i}: "
173
+ else:
174
+ p, im0, frame = path, im0s.copy(), getattr(dataset, "frame", 0)
175
+
176
+ p = Path(p) # to Path
177
+ save_path = str(save_dir / p.name) # im.jpg
178
+ txt_path = str(save_dir / "labels" / p.stem) + (
179
+ "" if dataset.mode == "image" else f"_{frame}"
180
+ ) # im.txt
181
+
182
+ s += "%gx%g " % im.shape[2:] # print string
183
+ annotator = Annotator(im0, example=str(names), pil=True)
184
+
185
+ # Print results
186
+ top5i = prob.argsort(0, descending=True)[
187
+ :5
188
+ ].tolist() # top 5 indices
189
+ s += f"{', '.join(f'{names[j]} {prob[j]:.2f}' for j in top5i)}, "
190
+
191
+ # Write results
192
+ text = "\n".join(f"{prob[j]:.2f} {names[j]}" for j in top5i)
193
+ if save_img or view_img: # Add bbox to image
194
+ annotator.text((32, 32), text, txt_color=(255, 255, 255))
195
+ if save_txt: # Write to file
196
+ with open(f"{txt_path}.txt", "a") as f:
197
+ f.write(text + "\n")
198
+
199
+ # Stream results
200
+ im0 = annotator.result()
201
+ if view_img:
202
+ if platform.system() == "Linux" and p not in windows:
203
+ windows.append(p)
204
+ cv2.namedWindow(
205
+ str(p), cv2.WINDOW_NORMAL | cv2.WINDOW_KEEPRATIO
206
+ ) # allow window resize (Linux)
207
+ cv2.resizeWindow(str(p), im0.shape[1], im0.shape[0])
208
+ cv2.imshow(str(p), im0)
209
+ cv2.waitKey(1) # 1 millisecond
210
+
211
+ # Save results (image with detections)
212
+ if save_img:
213
+ if dataset.mode == "image":
214
+ cv2.imwrite(save_path, im0)
215
+ else: # 'video' or 'stream'
216
+ if vid_path[i] != save_path: # new video
217
+ vid_path[i] = save_path
218
+ if isinstance(vid_writer[i], cv2.VideoWriter):
219
+ vid_writer[
220
+ i
221
+ ].release() # release previous video writer
222
+ if vid_cap: # video
223
+ fps = vid_cap.get(cv2.CAP_PROP_FPS)
224
+ w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
225
+ h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
226
+ else: # stream
227
+ fps, w, h = 30, im0.shape[1], im0.shape[0]
228
+ save_path = str(
229
+ Path(save_path).with_suffix(".mp4")
230
+ ) # force *.mp4 suffix on results videos
231
+ vid_writer[i] = cv2.VideoWriter(
232
+ save_path,
233
+ cv2.VideoWriter_fourcc(*"mp4v"),
234
+ fps,
235
+ (w, h),
236
+ )
237
+ vid_writer[i].write(im0)
238
+
239
+ # Print time (inference-only)
240
+ LOGGER.info(f"{s}{dt[1].dt * 1E3:.1f}ms")
241
+
242
+ # Print results
243
+ t = tuple(x.t / seen * 1e3 for x in dt) # speeds per image
244
+ LOGGER.info(
245
+ f"Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {(1, 3, *imgsz)}"
246
+ % t
247
+ )
248
+ if save_txt or save_img:
249
+ s = (
250
+ f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}"
251
+ if save_txt
252
+ else ""
253
+ )
254
+ LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}{s}")
255
+ if update:
256
+ strip_optimizer(
257
+ weights[0]
258
+ ) # update model (to fix SourceChangeWarning)
259
+
260
+
261
+ def parse_opt():
262
+ parser = argparse.ArgumentParser()
263
+ parser.add_argument(
264
+ "--weights",
265
+ nargs="+",
266
+ type=str,
267
+ default=ROOT / "yolov5s-cls.pt",
268
+ help="model path(s)",
269
+ )
270
+ parser.add_argument(
271
+ "--source",
272
+ type=str,
273
+ default=ROOT / "data/images",
274
+ help="file/dir/URL/glob/screen/0(webcam)",
275
+ )
276
+ parser.add_argument(
277
+ "--data",
278
+ type=str,
279
+ default=ROOT / "data/coco128.yaml",
280
+ help="(optional) dataset.yaml path",
281
+ )
282
+ parser.add_argument(
283
+ "--imgsz",
284
+ "--img",
285
+ "--img-size",
286
+ nargs="+",
287
+ type=int,
288
+ default=[224],
289
+ help="inference size h,w",
290
+ )
291
+ parser.add_argument(
292
+ "--device", default="", help="cuda device, i.e. 0 or 0,1,2,3 or cpu"
293
+ )
294
+ parser.add_argument("--view-img", action="store_true", help="show results")
295
+ parser.add_argument(
296
+ "--save-txt", action="store_true", help="save results to *.txt"
297
+ )
298
+ parser.add_argument(
299
+ "--nosave", action="store_true", help="do not save images/videos"
300
+ )
301
+ parser.add_argument(
302
+ "--augment", action="store_true", help="augmented inference"
303
+ )
304
+ parser.add_argument(
305
+ "--visualize", action="store_true", help="visualize features"
306
+ )
307
+ parser.add_argument(
308
+ "--update", action="store_true", help="update all models"
309
+ )
310
+ parser.add_argument(
311
+ "--project",
312
+ default=ROOT / "runs/predict-cls",
313
+ help="save results to project/name",
314
+ )
315
+ parser.add_argument(
316
+ "--name", default="exp", help="save results to project/name"
317
+ )
318
+ parser.add_argument(
319
+ "--exist-ok",
320
+ action="store_true",
321
+ help="existing project/name ok, do not increment",
322
+ )
323
+ parser.add_argument(
324
+ "--half", action="store_true", help="use FP16 half-precision inference"
325
+ )
326
+ parser.add_argument(
327
+ "--dnn", action="store_true", help="use OpenCV DNN for ONNX inference"
328
+ )
329
+ parser.add_argument(
330
+ "--vid-stride", type=int, default=1, help="video frame-rate stride"
331
+ )
332
+ opt = parser.parse_args()
333
+ opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1 # expand
334
+ print_args(vars(opt))
335
+ return opt
336
+
337
+
338
+ def main(opt):
339
+ check_requirements(exclude=("tensorboard", "thop"))
340
+ run(**vars(opt))
341
+
342
+
343
+ if __name__ == "__main__":
344
+ opt = parse_opt()
345
+ main(opt)
classify/train.py ADDED
@@ -0,0 +1,537 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ """
3
+ Train a YOLOv5 classifier model on a classification dataset
4
+
5
+ Usage - Single-GPU training:
6
+ $ python classify/train.py --model yolov5s-cls.pt --data imagenette160 --epochs 5 --img 224
7
+
8
+ Usage - Multi-GPU DDP training:
9
+ $ python -m torch.distributed.run --nproc_per_node 4 --master_port 2022 classify/train.py --model yolov5s-cls.pt --data imagenet --epochs 5 --img 224 --device 0,1,2,3
10
+
11
+ Datasets: --data mnist, fashion-mnist, cifar10, cifar100, imagenette, imagewoof, imagenet, or 'path/to/data'
12
+ YOLOv5-cls models: --model yolov5n-cls.pt, yolov5s-cls.pt, yolov5m-cls.pt, yolov5l-cls.pt, yolov5x-cls.pt
13
+ Torchvision models: --model resnet50, efficientnet_b0, etc. See https://pytorch.org/vision/stable/models.html
14
+ """
15
+
16
+ import argparse
17
+ import os
18
+ import subprocess
19
+ import sys
20
+ import time
21
+ from copy import deepcopy
22
+ from datetime import datetime
23
+ from pathlib import Path
24
+
25
+ import torch
26
+ import torch.distributed as dist
27
+ import torch.hub as hub
28
+ import torch.optim.lr_scheduler as lr_scheduler
29
+ import torchvision
30
+ from torch.cuda import amp
31
+ from tqdm import tqdm
32
+
33
+ FILE = Path(__file__).resolve()
34
+ ROOT = FILE.parents[1] # YOLOv5 root directory
35
+ if str(ROOT) not in sys.path:
36
+ sys.path.append(str(ROOT)) # add ROOT to PATH
37
+ ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
38
+
39
+ from classify import val as validate
40
+ from models.experimental import attempt_load
41
+ from models.yolo import ClassificationModel, DetectionModel
42
+ from utils.dataloaders import create_classification_dataloader
43
+ from utils.general import (
44
+ DATASETS_DIR,
45
+ LOGGER,
46
+ TQDM_BAR_FORMAT,
47
+ WorkingDirectory,
48
+ check_git_info,
49
+ check_git_status,
50
+ check_requirements,
51
+ colorstr,
52
+ download,
53
+ increment_path,
54
+ init_seeds,
55
+ print_args,
56
+ yaml_save,
57
+ )
58
+ from utils.loggers import GenericLogger
59
+ from utils.plots import imshow_cls
60
+ from utils.torch_utils import (
61
+ ModelEMA,
62
+ model_info,
63
+ reshape_classifier_output,
64
+ select_device,
65
+ smart_DDP,
66
+ smart_optimizer,
67
+ smartCrossEntropyLoss,
68
+ torch_distributed_zero_first,
69
+ )
70
+
71
+ LOCAL_RANK = int(
72
+ os.getenv("LOCAL_RANK", -1)
73
+ ) # https://pytorch.org/docs/stable/elastic/run.html
74
+ RANK = int(os.getenv("RANK", -1))
75
+ WORLD_SIZE = int(os.getenv("WORLD_SIZE", 1))
76
+ GIT_INFO = check_git_info()
77
+
78
+
79
+ def train(opt, device):
80
+ init_seeds(opt.seed + 1 + RANK, deterministic=True)
81
+ save_dir, data, bs, epochs, nw, imgsz, pretrained = (
82
+ opt.save_dir,
83
+ Path(opt.data),
84
+ opt.batch_size,
85
+ opt.epochs,
86
+ min(os.cpu_count() - 1, opt.workers),
87
+ opt.imgsz,
88
+ str(opt.pretrained).lower() == "true",
89
+ )
90
+ cuda = device.type != "cpu"
91
+
92
+ # Directories
93
+ wdir = save_dir / "weights"
94
+ wdir.mkdir(parents=True, exist_ok=True) # make dir
95
+ last, best = wdir / "last.pt", wdir / "best.pt"
96
+
97
+ # Save run settings
98
+ yaml_save(save_dir / "opt.yaml", vars(opt))
99
+
100
+ # Logger
101
+ logger = (
102
+ GenericLogger(opt=opt, console_logger=LOGGER)
103
+ if RANK in {-1, 0}
104
+ else None
105
+ )
106
+
107
+ # Download Dataset
108
+ with torch_distributed_zero_first(LOCAL_RANK), WorkingDirectory(ROOT):
109
+ data_dir = data if data.is_dir() else (DATASETS_DIR / data)
110
+ if not data_dir.is_dir():
111
+ LOGGER.info(
112
+ f"\nDataset not found ⚠️, missing path {data_dir}, attempting download..."
113
+ )
114
+ t = time.time()
115
+ if str(data) == "imagenet":
116
+ subprocess.run(
117
+ f"bash {ROOT / 'data/scripts/get_imagenet.sh'}",
118
+ shell=True,
119
+ check=True,
120
+ )
121
+ else:
122
+ url = f"https://github.com/ultralytics/yolov5/releases/download/v1.0/{data}.zip"
123
+ download(url, dir=data_dir.parent)
124
+ s = f"Dataset download success ✅ ({time.time() - t:.1f}s), saved to {colorstr('bold', data_dir)}\n"
125
+ LOGGER.info(s)
126
+
127
+ # Dataloaders
128
+ nc = len(
129
+ [x for x in (data_dir / "train").glob("*") if x.is_dir()]
130
+ ) # number of classes
131
+ trainloader = create_classification_dataloader(
132
+ path=data_dir / "train",
133
+ imgsz=imgsz,
134
+ batch_size=bs // WORLD_SIZE,
135
+ augment=True,
136
+ cache=opt.cache,
137
+ rank=LOCAL_RANK,
138
+ workers=nw,
139
+ )
140
+
141
+ test_dir = (
142
+ data_dir / "test" if (data_dir / "test").exists() else data_dir / "val"
143
+ ) # data/test or data/val
144
+ if RANK in {-1, 0}:
145
+ testloader = create_classification_dataloader(
146
+ path=test_dir,
147
+ imgsz=imgsz,
148
+ batch_size=bs // WORLD_SIZE * 2,
149
+ augment=False,
150
+ cache=opt.cache,
151
+ rank=-1,
152
+ workers=nw,
153
+ )
154
+
155
+ # Model
156
+ with torch_distributed_zero_first(LOCAL_RANK), WorkingDirectory(ROOT):
157
+ if Path(opt.model).is_file() or opt.model.endswith(".pt"):
158
+ model = attempt_load(opt.model, device="cpu", fuse=False)
159
+ elif (
160
+ opt.model in torchvision.models.__dict__
161
+ ): # TorchVision models i.e. resnet50, efficientnet_b0
162
+ model = torchvision.models.__dict__[opt.model](
163
+ weights="IMAGENET1K_V1" if pretrained else None
164
+ )
165
+ else:
166
+ m = hub.list(
167
+ "ultralytics/yolov5"
168
+ ) # + hub.list('pytorch/vision') # models
169
+ raise ModuleNotFoundError(
170
+ f"--model {opt.model} not found. Available models are: \n"
171
+ + "\n".join(m)
172
+ )
173
+ if isinstance(model, DetectionModel):
174
+ LOGGER.warning(
175
+ "WARNING ⚠️ pass YOLOv5 classifier model with '-cls' suffix, i.e. '--model yolov5s-cls.pt'"
176
+ )
177
+ model = ClassificationModel(
178
+ model=model, nc=nc, cutoff=opt.cutoff or 10
179
+ ) # convert to classification model
180
+ reshape_classifier_output(model, nc) # update class count
181
+ for m in model.modules():
182
+ if not pretrained and hasattr(m, "reset_parameters"):
183
+ m.reset_parameters()
184
+ if isinstance(m, torch.nn.Dropout) and opt.dropout is not None:
185
+ m.p = opt.dropout # set dropout
186
+ for p in model.parameters():
187
+ p.requires_grad = True # for training
188
+ model = model.to(device)
189
+
190
+ # Info
191
+ if RANK in {-1, 0}:
192
+ model.names = trainloader.dataset.classes # attach class names
193
+ model.transforms = (
194
+ testloader.dataset.torch_transforms
195
+ ) # attach inference transforms
196
+ model_info(model)
197
+ if opt.verbose:
198
+ LOGGER.info(model)
199
+ images, labels = next(iter(trainloader))
200
+ file = imshow_cls(
201
+ images[:25],
202
+ labels[:25],
203
+ names=model.names,
204
+ f=save_dir / "train_images.jpg",
205
+ )
206
+ logger.log_images(file, name="Train Examples")
207
+ logger.log_graph(model, imgsz) # log model
208
+
209
+ # Optimizer
210
+ optimizer = smart_optimizer(
211
+ model, opt.optimizer, opt.lr0, momentum=0.9, decay=opt.decay
212
+ )
213
+
214
+ # Scheduler
215
+ lrf = 0.01 # final lr (fraction of lr0)
216
+ # lf = lambda x: ((1 + math.cos(x * math.pi / epochs)) / 2) * (1 - lrf) + lrf # cosine
217
+ lf = lambda x: (1 - x / epochs) * (1 - lrf) + lrf # linear
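+ # e.g. with epochs=10 and lrf=0.01 the factor goes 1.0 at epoch 0, ~0.505 at
+ # epoch 5 and 0.01 at epoch 10, i.e. lr decays linearly from lr0 to lr0 * lrf.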
218
+ scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lf)
219
+ # scheduler = lr_scheduler.OneCycleLR(optimizer, max_lr=lr0, total_steps=epochs, pct_start=0.1,
220
+ # final_div_factor=1 / 25 / lrf)
221
+
222
+ # EMA
223
+ ema = ModelEMA(model) if RANK in {-1, 0} else None
224
+
225
+ # DDP mode
226
+ if cuda and RANK != -1:
227
+ model = smart_DDP(model)
228
+
229
+ # Train
230
+ t0 = time.time()
231
+ criterion = smartCrossEntropyLoss(
232
+ label_smoothing=opt.label_smoothing
233
+ ) # loss function
234
+ best_fitness = 0.0
235
+ scaler = amp.GradScaler(enabled=cuda)
236
+ val = test_dir.stem # 'val' or 'test'
237
+ LOGGER.info(
238
+ f"Image sizes {imgsz} train, {imgsz} test\n"
239
+ f"Using {nw * WORLD_SIZE} dataloader workers\n"
240
+ f"Logging results to {colorstr('bold', save_dir)}\n"
241
+ f"Starting {opt.model} training on {data} dataset with {nc} classes for {epochs} epochs...\n\n"
242
+ f"{'Epoch':>10}{'GPU_mem':>10}{'train_loss':>12}{f'{val}_loss':>12}{'top1_acc':>12}{'top5_acc':>12}"
243
+ )
244
+ for epoch in range(epochs): # loop over the dataset multiple times
245
+ tloss, vloss, fitness = 0.0, 0.0, 0.0 # train loss, val loss, fitness
246
+ model.train()
247
+ if RANK != -1:
248
+ trainloader.sampler.set_epoch(epoch)
249
+ pbar = enumerate(trainloader)
250
+ if RANK in {-1, 0}:
251
+ pbar = tqdm(
252
+ enumerate(trainloader),
253
+ total=len(trainloader),
254
+ bar_format=TQDM_BAR_FORMAT,
255
+ )
256
+ for i, (images, labels) in pbar: # progress bar
257
+ images, labels = images.to(device, non_blocking=True), labels.to(
258
+ device
259
+ )
260
+
261
+ # Forward
262
+ with amp.autocast(enabled=cuda): # stability issues when enabled
263
+ loss = criterion(model(images), labels)
264
+
265
+ # Backward
266
+ scaler.scale(loss).backward()
267
+
268
+ # Optimize
269
+ scaler.unscale_(optimizer) # unscale gradients
270
+ torch.nn.utils.clip_grad_norm_(
271
+ model.parameters(), max_norm=10.0
272
+ ) # clip gradients
273
+ scaler.step(optimizer)
274
+ scaler.update()
275
+ optimizer.zero_grad()
276
+ if ema:
277
+ ema.update(model)
278
+
279
+ if RANK in {-1, 0}:
280
+ # Print
281
+ tloss = (tloss * i + loss.item()) / (
282
+ i + 1
283
+ ) # update mean losses
284
+ mem = "%.3gG" % (
285
+ torch.cuda.memory_reserved() / 1e9
286
+ if torch.cuda.is_available()
287
+ else 0
288
+ ) # (GB)
289
+ pbar.desc = (
290
+ f"{f'{epoch + 1}/{epochs}':>10}{mem:>10}{tloss:>12.3g}"
291
+ + " " * 36
292
+ )
293
+
294
+ # Test
295
+ if i == len(pbar) - 1: # last batch
296
+ top1, top5, vloss = validate.run(
297
+ model=ema.ema,
298
+ dataloader=testloader,
299
+ criterion=criterion,
300
+ pbar=pbar,
301
+ ) # test accuracy, loss
302
+ fitness = top1 # define fitness as top1 accuracy
303
+
304
+ # Scheduler
305
+ scheduler.step()
306
+
307
+ # Log metrics
308
+ if RANK in {-1, 0}:
309
+ # Best fitness
310
+ if fitness > best_fitness:
311
+ best_fitness = fitness
312
+
313
+ # Log
314
+ metrics = {
315
+ "train/loss": tloss,
316
+ f"{val}/loss": vloss,
317
+ "metrics/accuracy_top1": top1,
318
+ "metrics/accuracy_top5": top5,
319
+ "lr/0": optimizer.param_groups[0]["lr"],
320
+ } # learning rate
321
+ logger.log_metrics(metrics, epoch)
322
+
323
+ # Save model
324
+ final_epoch = epoch + 1 == epochs
325
+ if (not opt.nosave) or final_epoch:
326
+ ckpt = {
327
+ "epoch": epoch,
328
+ "best_fitness": best_fitness,
329
+ "model": deepcopy(
330
+ ema.ema
331
+ ).half(), # deepcopy(de_parallel(model)).half(),
332
+ "ema": None, # deepcopy(ema.ema).half(),
333
+ "updates": ema.updates,
334
+ "optimizer": None, # optimizer.state_dict(),
335
+ "opt": vars(opt),
336
+ "git": GIT_INFO, # {remote, branch, commit} if a git repo
337
+ "date": datetime.now().isoformat(),
338
+ }
339
+
340
+ # Save last, best and delete
341
+ torch.save(ckpt, last)
342
+ if best_fitness == fitness:
343
+ torch.save(ckpt, best)
344
+ del ckpt
345
+
346
+ # Train complete
347
+ if RANK in {-1, 0} and final_epoch:
348
+ LOGGER.info(
349
+ f"\nTraining complete ({(time.time() - t0) / 3600:.3f} hours)"
350
+ f"\nResults saved to {colorstr('bold', save_dir)}"
351
+ f"\nPredict: python classify/predict.py --weights {best} --source im.jpg"
352
+ f"\nValidate: python classify/val.py --weights {best} --data {data_dir}"
353
+ f"\nExport: python export.py --weights {best} --include onnx"
354
+ f"\nPyTorch Hub: model = torch.hub.load('ultralytics/yolov5', 'custom', '{best}')"
355
+ f"\nVisualize: https://netron.app\n"
356
+ )
357
+
358
+ # Plot examples
359
+ images, labels = (
360
+ x[:25] for x in next(iter(testloader))
361
+ ) # first 25 images and labels
362
+ pred = torch.max(ema.ema(images.to(device)), 1)[1]
363
+ file = imshow_cls(
364
+ images,
365
+ labels,
366
+ pred,
367
+ model.names,
368
+ verbose=False,
369
+ f=save_dir / "test_images.jpg",
370
+ )
371
+
372
+ # Log results
373
+ meta = {
374
+ "epochs": epochs,
375
+ "top1_acc": best_fitness,
376
+ "date": datetime.now().isoformat(),
377
+ }
378
+ logger.log_images(
379
+ file, name="Test Examples (true-predicted)", epoch=epoch
380
+ )
381
+ logger.log_model(best, epochs, metadata=meta)
382
+
383
+
384
+ def parse_opt(known=False):
385
+ parser = argparse.ArgumentParser()
386
+ parser.add_argument(
387
+ "--model",
388
+ type=str,
389
+ default="yolov5s-cls.pt",
390
+ help="initial weights path",
391
+ )
392
+ parser.add_argument(
393
+ "--data",
394
+ type=str,
395
+ default="imagenette160",
396
+ help="cifar10, cifar100, mnist, imagenet, ...",
397
+ )
398
+ parser.add_argument(
399
+ "--epochs", type=int, default=10, help="total training epochs"
400
+ )
401
+ parser.add_argument(
402
+ "--batch-size",
403
+ type=int,
404
+ default=64,
405
+ help="total batch size for all GPUs",
406
+ )
407
+ parser.add_argument(
408
+ "--imgsz",
409
+ "--img",
410
+ "--img-size",
411
+ type=int,
412
+ default=224,
413
+ help="train, val image size (pixels)",
414
+ )
415
+ parser.add_argument(
416
+ "--nosave", action="store_true", help="only save final checkpoint"
417
+ )
418
+ parser.add_argument(
419
+ "--cache",
420
+ type=str,
421
+ nargs="?",
422
+ const="ram",
423
+ help='--cache images in "ram" (default) or "disk"',
424
+ )
425
+ parser.add_argument(
426
+ "--device", default="", help="cuda device, i.e. 0 or 0,1,2,3 or cpu"
427
+ )
428
+ parser.add_argument(
429
+ "--workers",
430
+ type=int,
431
+ default=8,
432
+ help="max dataloader workers (per RANK in DDP mode)",
433
+ )
434
+ parser.add_argument(
435
+ "--project",
436
+ default=ROOT / "runs/train-cls",
437
+ help="save to project/name",
438
+ )
439
+ parser.add_argument("--name", default="exp", help="save to project/name")
440
+ parser.add_argument(
441
+ "--exist-ok",
442
+ action="store_true",
443
+ help="existing project/name ok, do not increment",
444
+ )
445
+ parser.add_argument(
446
+ "--pretrained",
447
+ nargs="?",
448
+ const=True,
449
+ default=True,
450
+ help="start from i.e. --pretrained False",
451
+ )
452
+ parser.add_argument(
453
+ "--optimizer",
454
+ choices=["SGD", "Adam", "AdamW", "RMSProp"],
455
+ default="Adam",
456
+ help="optimizer",
457
+ )
458
+ parser.add_argument(
459
+ "--lr0", type=float, default=0.001, help="initial learning rate"
460
+ )
461
+ parser.add_argument(
462
+ "--decay", type=float, default=5e-5, help="weight decay"
463
+ )
464
+ parser.add_argument(
465
+ "--label-smoothing",
466
+ type=float,
467
+ default=0.1,
468
+ help="Label smoothing epsilon",
469
+ )
470
+ parser.add_argument(
471
+ "--cutoff",
472
+ type=int,
473
+ default=None,
474
+ help="Model layer cutoff index for Classify() head",
475
+ )
476
+ parser.add_argument(
477
+ "--dropout", type=float, default=None, help="Dropout (fraction)"
478
+ )
479
+ parser.add_argument("--verbose", action="store_true", help="Verbose mode")
480
+ parser.add_argument(
481
+ "--seed", type=int, default=0, help="Global training seed"
482
+ )
483
+ parser.add_argument(
484
+ "--local_rank",
485
+ type=int,
486
+ default=-1,
487
+ help="Automatic DDP Multi-GPU argument, do not modify",
488
+ )
489
+ return parser.parse_known_args()[0] if known else parser.parse_args()
490
+
491
+
492
+ def main(opt):
493
+ # Checks
494
+ if RANK in {-1, 0}:
495
+ print_args(vars(opt))
496
+ check_git_status()
497
+ check_requirements()
498
+
499
+ # DDP mode
500
+ device = select_device(opt.device, batch_size=opt.batch_size)
501
+ if LOCAL_RANK != -1:
502
+ assert (
503
+ opt.batch_size != -1
504
+ ), "AutoBatch is coming soon for classification, please pass a valid --batch-size"
505
+ assert (
506
+ opt.batch_size % WORLD_SIZE == 0
507
+ ), f"--batch-size {opt.batch_size} must be multiple of WORLD_SIZE"
508
+ assert (
509
+ torch.cuda.device_count() > LOCAL_RANK
510
+ ), "insufficient CUDA devices for DDP command"
511
+ torch.cuda.set_device(LOCAL_RANK)
512
+ device = torch.device("cuda", LOCAL_RANK)
513
+ dist.init_process_group(
514
+ backend="nccl" if dist.is_nccl_available() else "gloo"
515
+ )
516
+
517
+ # Parameters
518
+ opt.save_dir = increment_path(
519
+ Path(opt.project) / opt.name, exist_ok=opt.exist_ok
520
+ ) # increment run
521
+
522
+ # Train
523
+ train(opt, device)
524
+
525
+
526
+ def run(**kwargs):
527
+ # Usage: from yolov5 import classify; classify.train.run(data=mnist, imgsz=320, model='yolov5m')
528
+ opt = parse_opt(True)
529
+ for k, v in kwargs.items():
530
+ setattr(opt, k, v)
531
+ main(opt)
532
+ return opt
533
+
534
+
535
+ if __name__ == "__main__":
536
+ opt = parse_opt()
537
+ main(opt)
classify/tutorial.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
classify/val.py ADDED
@@ -0,0 +1,259 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ """
3
+ Validate a trained YOLOv5 classification model on a classification dataset
4
+
5
+ Usage:
6
+ $ bash data/scripts/get_imagenet.sh --val # download ImageNet val split (6.3G, 50000 images)
7
+ $ python classify/val.py --weights yolov5m-cls.pt --data ../datasets/imagenet --img 224 # validate ImageNet
8
+
9
+ Usage - formats:
10
+ $ python classify/val.py --weights yolov5s-cls.pt # PyTorch
11
+ yolov5s-cls.torchscript # TorchScript
12
+ yolov5s-cls.onnx # ONNX Runtime or OpenCV DNN with --dnn
13
+ yolov5s-cls_openvino_model # OpenVINO
14
+ yolov5s-cls.engine # TensorRT
15
+ yolov5s-cls.mlmodel # CoreML (macOS-only)
16
+ yolov5s-cls_saved_model # TensorFlow SavedModel
17
+ yolov5s-cls.pb # TensorFlow GraphDef
18
+ yolov5s-cls.tflite # TensorFlow Lite
19
+ yolov5s-cls_edgetpu.tflite # TensorFlow Edge TPU
20
+ yolov5s-cls_paddle_model # PaddlePaddle
21
+ """
22
+
23
+ import argparse
24
+ import os
25
+ import sys
26
+ from pathlib import Path
27
+
28
+ import torch
29
+ from tqdm import tqdm
30
+
31
+ FILE = Path(__file__).resolve()
32
+ ROOT = FILE.parents[1] # YOLOv5 root directory
33
+ if str(ROOT) not in sys.path:
34
+ sys.path.append(str(ROOT)) # add ROOT to PATH
35
+ ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
36
+
37
+ from models.common import DetectMultiBackend
38
+ from utils.dataloaders import create_classification_dataloader
39
+ from utils.general import (
40
+ LOGGER,
41
+ TQDM_BAR_FORMAT,
42
+ Profile,
43
+ check_img_size,
44
+ check_requirements,
45
+ colorstr,
46
+ increment_path,
47
+ print_args,
48
+ )
49
+ from utils.torch_utils import select_device, smart_inference_mode
50
+
51
+
52
+ @smart_inference_mode()
53
+ def run(
54
+ data=ROOT / "../datasets/mnist", # dataset dir
55
+ weights=ROOT / "yolov5s-cls.pt", # model.pt path(s)
56
+ batch_size=128, # batch size
57
+ imgsz=224, # inference size (pixels)
58
+ device="", # cuda device, i.e. 0 or 0,1,2,3 or cpu
59
+ workers=8, # max dataloader workers (per RANK in DDP mode)
60
+ verbose=False, # verbose output
61
+ project=ROOT / "runs/val-cls", # save to project/name
62
+ name="exp", # save to project/name
63
+ exist_ok=False, # existing project/name ok, do not increment
64
+ half=False, # use FP16 half-precision inference
65
+ dnn=False, # use OpenCV DNN for ONNX inference
66
+ model=None,
67
+ dataloader=None,
68
+ criterion=None,
69
+ pbar=None,
70
+ ):
71
+ # Initialize/load model and set device
72
+ training = model is not None
73
+ if training: # called by train.py
74
+ device, pt, jit, engine = (
75
+ next(model.parameters()).device,
76
+ True,
77
+ False,
78
+ False,
79
+ ) # get model device, PyTorch model
80
+ half &= device.type != "cpu" # half precision only supported on CUDA
81
+ model.half() if half else model.float()
82
+ else: # called directly
83
+ device = select_device(device, batch_size=batch_size)
84
+
85
+ # Directories
86
+ save_dir = increment_path(
87
+ Path(project) / name, exist_ok=exist_ok
88
+ ) # increment run
89
+ save_dir.mkdir(parents=True, exist_ok=True) # make dir
90
+
91
+ # Load model
92
+ model = DetectMultiBackend(weights, device=device, dnn=dnn, fp16=half)
93
+ stride, pt, jit, engine = (
94
+ model.stride,
95
+ model.pt,
96
+ model.jit,
97
+ model.engine,
98
+ )
99
+ imgsz = check_img_size(imgsz, s=stride) # check image size
100
+ half = model.fp16 # FP16 supported on limited backends with CUDA
101
+ if engine:
102
+ batch_size = model.batch_size
103
+ else:
104
+ device = model.device
105
+ if not (pt or jit):
106
+ batch_size = 1 # export.py models default to batch-size 1
107
+ LOGGER.info(
108
+ f"Forcing --batch-size 1 square inference (1,3,{imgsz},{imgsz}) for non-PyTorch models"
109
+ )
110
+
111
+ # Dataloader
112
+ data = Path(data)
113
+ test_dir = (
114
+ data / "test" if (data / "test").exists() else data / "val"
115
+ ) # data/test or data/val
116
+ dataloader = create_classification_dataloader(
117
+ path=test_dir,
118
+ imgsz=imgsz,
119
+ batch_size=batch_size,
120
+ augment=False,
121
+ rank=-1,
122
+ workers=workers,
123
+ )
124
+
125
+ model.eval()
126
+ pred, targets, loss, dt = [], [], 0, (Profile(), Profile(), Profile())
127
+ n = len(dataloader) # number of batches
128
+ action = (
129
+ "validating" if dataloader.dataset.root.stem == "val" else "testing"
130
+ )
131
+ desc = f"{pbar.desc[:-36]}{action:>36}" if pbar else f"{action}"
132
+ bar = tqdm(
133
+ dataloader,
134
+ desc,
135
+ n,
136
+ not training,
137
+ bar_format=TQDM_BAR_FORMAT,
138
+ position=0,
139
+ )
140
+ with torch.cuda.amp.autocast(enabled=device.type != "cpu"):
141
+ for images, labels in bar:
142
+ with dt[0]:
143
+ images, labels = images.to(
144
+ device, non_blocking=True
145
+ ), labels.to(device)
146
+
147
+ with dt[1]:
148
+ y = model(images)
149
+
150
+ with dt[2]:
151
+ pred.append(y.argsort(1, descending=True)[:, :5])
152
+ targets.append(labels)
153
+ if criterion:
154
+ loss += criterion(y, labels)
155
+
156
+ loss /= n
157
+ pred, targets = torch.cat(pred), torch.cat(targets)
158
+ correct = (targets[:, None] == pred).float()
159
+ acc = torch.stack(
160
+ (correct[:, 0], correct.max(1).values), dim=1
161
+ ) # (top1, top5) accuracy
162
+ top1, top5 = acc.mean(0).tolist()
163
+
164
+ if pbar:
165
+ pbar.desc = f"{pbar.desc[:-36]}{loss:>12.3g}{top1:>12.3g}{top5:>12.3g}"
166
+ if verbose: # all classes
167
+ LOGGER.info(
168
+ f"{'Class':>24}{'Images':>12}{'top1_acc':>12}{'top5_acc':>12}"
169
+ )
170
+ LOGGER.info(
171
+ f"{'all':>24}{targets.shape[0]:>12}{top1:>12.3g}{top5:>12.3g}"
172
+ )
173
+ for i, c in model.names.items():
174
+ aci = acc[targets == i]
175
+ top1i, top5i = aci.mean(0).tolist()
176
+ LOGGER.info(
177
+ f"{c:>24}{aci.shape[0]:>12}{top1i:>12.3g}{top5i:>12.3g}"
178
+ )
179
+
180
+ # Print results
181
+ t = tuple(
182
+ x.t / len(dataloader.dataset.samples) * 1e3 for x in dt
183
+ ) # speeds per image
184
+ shape = (1, 3, imgsz, imgsz)
185
+ LOGGER.info(
186
+ f"Speed: %.1fms pre-process, %.1fms inference, %.1fms post-process per image at shape {shape}"
187
+ % t
188
+ )
189
+ LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}")
190
+
191
+ return top1, top5, loss
192
+
193
+
194
+ def parse_opt():
195
+ parser = argparse.ArgumentParser()
196
+ parser.add_argument(
197
+ "--data",
198
+ type=str,
199
+ default=ROOT / "../datasets/mnist",
200
+ help="dataset path",
201
+ )
202
+ parser.add_argument(
203
+ "--weights",
204
+ nargs="+",
205
+ type=str,
206
+ default=ROOT / "yolov5s-cls.pt",
207
+ help="model.pt path(s)",
208
+ )
209
+ parser.add_argument(
210
+ "--batch-size", type=int, default=128, help="batch size"
211
+ )
212
+ parser.add_argument(
213
+ "--imgsz",
214
+ "--img",
215
+ "--img-size",
216
+ type=int,
217
+ default=224,
218
+ help="inference size (pixels)",
219
+ )
220
+ parser.add_argument(
221
+ "--device", default="", help="cuda device, i.e. 0 or 0,1,2,3 or cpu"
222
+ )
223
+ parser.add_argument(
224
+ "--workers",
225
+ type=int,
226
+ default=8,
227
+ help="max dataloader workers (per RANK in DDP mode)",
228
+ )
229
+ parser.add_argument(
230
+ "--verbose", nargs="?", const=True, default=True, help="verbose output"
231
+ )
232
+ parser.add_argument(
233
+ "--project", default=ROOT / "runs/val-cls", help="save to project/name"
234
+ )
235
+ parser.add_argument("--name", default="exp", help="save to project/name")
236
+ parser.add_argument(
237
+ "--exist-ok",
238
+ action="store_true",
239
+ help="existing project/name ok, do not increment",
240
+ )
241
+ parser.add_argument(
242
+ "--half", action="store_true", help="use FP16 half-precision inference"
243
+ )
244
+ parser.add_argument(
245
+ "--dnn", action="store_true", help="use OpenCV DNN for ONNX inference"
246
+ )
247
+ opt = parser.parse_args()
248
+ print_args(vars(opt))
249
+ return opt
250
+
251
+
252
+ def main(opt):
253
+ check_requirements(exclude=("tensorboard", "thop"))
254
+ run(**vars(opt))
255
+
256
+
257
+ if __name__ == "__main__":
258
+ opt = parse_opt()
259
+ main(opt)
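For reference, the validator above can be driven programmatically as well as from the CLI shown in its docstring. A minimal sketch, assuming the YOLOv5 repository root is the working directory (so `classify/val.py` and its `models`/`utils` imports resolve), a `yolov5s-cls.pt` checkpoint is present, and `../datasets/mnist` contains a `val/` or `test/` split:

```python
# Minimal sketch under the assumptions above: call run() from classify/val.py directly.
from classify.val import run

top1, top5, loss = run(
    data="../datasets/mnist",  # dataset root containing val/ or test/
    weights="yolov5s-cls.pt",  # classification checkpoint
    batch_size=128,
    imgsz=224,
    device="",                 # '' -> CUDA if available, else CPU
    verbose=True,              # print the per-class top-1/top-5 table
)
print(f"top-1 {top1:.3f}  top-5 {top5:.3f}")
```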
data/Argoverse.yaml ADDED
@@ -0,0 +1,74 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # Argoverse-HD dataset (ring-front-center camera) http://www.cs.cmu.edu/~mengtial/proj/streaming/ by Argo AI
3
+ # Example usage: python train.py --data Argoverse.yaml
4
+ # parent
5
+ # ├── yolov5
6
+ # └── datasets
7
+ # └── Argoverse ← downloads here (31.3 GB)
8
+
9
+
10
+ # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
11
+ path: ../datasets/Argoverse # dataset root dir
12
+ train: Argoverse-1.1/images/train/ # train images (relative to 'path') 39384 images
13
+ val: Argoverse-1.1/images/val/ # val images (relative to 'path') 15062 images
14
+ test: Argoverse-1.1/images/test/ # test images (optional) https://eval.ai/web/challenges/challenge-page/800/overview
15
+
16
+ # Classes
17
+ names:
18
+ 0: person
19
+ 1: bicycle
20
+ 2: car
21
+ 3: motorcycle
22
+ 4: bus
23
+ 5: truck
24
+ 6: traffic_light
25
+ 7: stop_sign
26
+
27
+
28
+ # Download script/URL (optional) ---------------------------------------------------------------------------------------
29
+ download: |
30
+ import json
31
+
32
+ from tqdm import tqdm
33
+ from utils.general import download, Path
34
+
35
+
36
+ def argoverse2yolo(set):
37
+ labels = {}
38
+ a = json.load(open(set, "rb"))
39
+ for annot in tqdm(a['annotations'], desc=f"Converting {set} to YOLOv5 format..."):
40
+ img_id = annot['image_id']
41
+ img_name = a['images'][img_id]['name']
42
+ img_label_name = f'{img_name[:-3]}txt'
43
+
44
+ cls = annot['category_id'] # instance class id
45
+ x_center, y_center, width, height = annot['bbox']
46
+ x_center = (x_center + width / 2) / 1920.0 # offset and scale
47
+ y_center = (y_center + height / 2) / 1200.0 # offset and scale
48
+ width /= 1920.0 # scale
49
+ height /= 1200.0 # scale
50
+
51
+ img_dir = set.parents[2] / 'Argoverse-1.1' / 'labels' / a['seq_dirs'][a['images'][annot['image_id']]['sid']]
52
+ if not img_dir.exists():
53
+ img_dir.mkdir(parents=True, exist_ok=True)
54
+
55
+ k = str(img_dir / img_label_name)
56
+ if k not in labels:
57
+ labels[k] = []
58
+ labels[k].append(f"{cls} {x_center} {y_center} {width} {height}\n")
59
+
60
+ for k in labels:
61
+ with open(k, "w") as f:
62
+ f.writelines(labels[k])
63
+
64
+
65
+ # Download
66
+ dir = Path(yaml['path']) # dataset root dir
67
+ urls = ['https://argoverse-hd.s3.us-east-2.amazonaws.com/Argoverse-HD-Full.zip']
68
+ download(urls, dir=dir, delete=False)
69
+
70
+ # Convert
71
+ annotations_dir = 'Argoverse-HD/annotations/'
72
+ (dir / 'Argoverse-1.1' / 'tracking').rename(dir / 'Argoverse-1.1' / 'images') # rename 'tracking' to 'images'
73
+ for d in "train.json", "val.json":
74
+ argoverse2yolo(dir / annotations_dir / d) # convert Argoverse annotations to YOLO labels
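The download script above converts Argoverse-HD's COCO-style boxes, pixel `(x, y, width, height)` with `(x, y)` at the top-left corner, into YOLO's normalized centre format, hard-coding the 1920x1200 ring-front-center resolution. A standalone sketch of the same arithmetic (the function name is illustrative, not part of the script):

```python
# Sketch of the box conversion used in the Argoverse download script above:
# top-left pixel (x, y, w, h) -> normalized (x_center, y_center, w, h).
def argoverse_box_to_yolo(x, y, w, h, img_w=1920.0, img_h=1200.0):
    x_center = (x + w / 2) / img_w  # offset to centre, then scale
    y_center = (y + h / 2) / img_h
    return x_center, y_center, w / img_w, h / img_h

print(argoverse_box_to_yolo(960, 600, 192, 120))  # -> (0.55, 0.55, 0.1, 0.1)
```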
data/GlobalWheat2020.yaml ADDED
@@ -0,0 +1,54 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # Global Wheat 2020 dataset http://www.global-wheat.com/ by University of Saskatchewan
3
+ # Example usage: python train.py --data GlobalWheat2020.yaml
4
+ # parent
5
+ # ├── yolov5
6
+ # └── datasets
7
+ # └── GlobalWheat2020 ← downloads here (7.0 GB)
8
+
9
+
10
+ # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
11
+ path: ../datasets/GlobalWheat2020 # dataset root dir
12
+ train: # train images (relative to 'path') 3422 images
13
+ - images/arvalis_1
14
+ - images/arvalis_2
15
+ - images/arvalis_3
16
+ - images/ethz_1
17
+ - images/rres_1
18
+ - images/inrae_1
19
+ - images/usask_1
20
+ val: # val images (relative to 'path') 748 images (WARNING: train set contains ethz_1)
21
+ - images/ethz_1
22
+ test: # test images (optional) 1276 images
23
+ - images/utokyo_1
24
+ - images/utokyo_2
25
+ - images/nau_1
26
+ - images/uq_1
27
+
28
+ # Classes
29
+ names:
30
+ 0: wheat_head
31
+
32
+
33
+ # Download script/URL (optional) ---------------------------------------------------------------------------------------
34
+ download: |
35
+ from utils.general import download, Path
36
+
37
+
38
+ # Download
39
+ dir = Path(yaml['path']) # dataset root dir
40
+ urls = ['https://zenodo.org/record/4298502/files/global-wheat-codalab-official.zip',
41
+ 'https://github.com/ultralytics/yolov5/releases/download/v1.0/GlobalWheat2020_labels.zip']
42
+ download(urls, dir=dir)
43
+
44
+ # Make Directories
45
+ for p in 'annotations', 'images', 'labels':
46
+ (dir / p).mkdir(parents=True, exist_ok=True)
47
+
48
+ # Move
49
+ for p in 'arvalis_1', 'arvalis_2', 'arvalis_3', 'ethz_1', 'rres_1', 'inrae_1', 'usask_1', \
50
+ 'utokyo_1', 'utokyo_2', 'nau_1', 'uq_1':
51
+ (dir / p).rename(dir / 'images' / p) # move to /images
52
+ f = (dir / p).with_suffix('.json') # json file
53
+ if f.exists():
54
+ f.rename((dir / 'annotations' / p).with_suffix('.json')) # move to /annotations
data/ImageNet.yaml ADDED
@@ -0,0 +1,1022 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # ImageNet-1k dataset https://www.image-net.org/index.php by Stanford University
3
+ # Simplified class names from https://github.com/anishathalye/imagenet-simple-labels
4
+ # Example usage: python classify/train.py --data imagenet
5
+ # parent
6
+ # ├── yolov5
7
+ # └── datasets
8
+ # └── imagenet ← downloads here (144 GB)
9
+
10
+
11
+ # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
12
+ path: ../datasets/imagenet # dataset root dir
13
+ train: train # train images (relative to 'path') 1281167 images
14
+ val: val # val images (relative to 'path') 50000 images
15
+ test: # test images (optional)
16
+
17
+ # Classes
18
+ names:
19
+ 0: tench
20
+ 1: goldfish
21
+ 2: great white shark
22
+ 3: tiger shark
23
+ 4: hammerhead shark
24
+ 5: electric ray
25
+ 6: stingray
26
+ 7: cock
27
+ 8: hen
28
+ 9: ostrich
29
+ 10: brambling
30
+ 11: goldfinch
31
+ 12: house finch
32
+ 13: junco
33
+ 14: indigo bunting
34
+ 15: American robin
35
+ 16: bulbul
36
+ 17: jay
37
+ 18: magpie
38
+ 19: chickadee
39
+ 20: American dipper
40
+ 21: kite
41
+ 22: bald eagle
42
+ 23: vulture
43
+ 24: great grey owl
44
+ 25: fire salamander
45
+ 26: smooth newt
46
+ 27: newt
47
+ 28: spotted salamander
48
+ 29: axolotl
49
+ 30: American bullfrog
50
+ 31: tree frog
51
+ 32: tailed frog
52
+ 33: loggerhead sea turtle
53
+ 34: leatherback sea turtle
54
+ 35: mud turtle
55
+ 36: terrapin
56
+ 37: box turtle
57
+ 38: banded gecko
58
+ 39: green iguana
59
+ 40: Carolina anole
60
+ 41: desert grassland whiptail lizard
61
+ 42: agama
62
+ 43: frilled-necked lizard
63
+ 44: alligator lizard
64
+ 45: Gila monster
65
+ 46: European green lizard
66
+ 47: chameleon
67
+ 48: Komodo dragon
68
+ 49: Nile crocodile
69
+ 50: American alligator
70
+ 51: triceratops
71
+ 52: worm snake
72
+ 53: ring-necked snake
73
+ 54: eastern hog-nosed snake
74
+ 55: smooth green snake
75
+ 56: kingsnake
76
+ 57: garter snake
77
+ 58: water snake
78
+ 59: vine snake
79
+ 60: night snake
80
+ 61: boa constrictor
81
+ 62: African rock python
82
+ 63: Indian cobra
83
+ 64: green mamba
84
+ 65: sea snake
85
+ 66: Saharan horned viper
86
+ 67: eastern diamondback rattlesnake
87
+ 68: sidewinder
88
+ 69: trilobite
89
+ 70: harvestman
90
+ 71: scorpion
91
+ 72: yellow garden spider
92
+ 73: barn spider
93
+ 74: European garden spider
94
+ 75: southern black widow
95
+ 76: tarantula
96
+ 77: wolf spider
97
+ 78: tick
98
+ 79: centipede
99
+ 80: black grouse
100
+ 81: ptarmigan
101
+ 82: ruffed grouse
102
+ 83: prairie grouse
103
+ 84: peacock
104
+ 85: quail
105
+ 86: partridge
106
+ 87: grey parrot
107
+ 88: macaw
108
+ 89: sulphur-crested cockatoo
109
+ 90: lorikeet
110
+ 91: coucal
111
+ 92: bee eater
112
+ 93: hornbill
113
+ 94: hummingbird
114
+ 95: jacamar
115
+ 96: toucan
116
+ 97: duck
117
+ 98: red-breasted merganser
118
+ 99: goose
119
+ 100: black swan
120
+ 101: tusker
121
+ 102: echidna
122
+ 103: platypus
123
+ 104: wallaby
124
+ 105: koala
125
+ 106: wombat
126
+ 107: jellyfish
127
+ 108: sea anemone
128
+ 109: brain coral
129
+ 110: flatworm
130
+ 111: nematode
131
+ 112: conch
132
+ 113: snail
133
+ 114: slug
134
+ 115: sea slug
135
+ 116: chiton
136
+ 117: chambered nautilus
137
+ 118: Dungeness crab
138
+ 119: rock crab
139
+ 120: fiddler crab
140
+ 121: red king crab
141
+ 122: American lobster
142
+ 123: spiny lobster
143
+ 124: crayfish
144
+ 125: hermit crab
145
+ 126: isopod
146
+ 127: white stork
147
+ 128: black stork
148
+ 129: spoonbill
149
+ 130: flamingo
150
+ 131: little blue heron
151
+ 132: great egret
152
+ 133: bittern
153
+ 134: crane (bird)
154
+ 135: limpkin
155
+ 136: common gallinule
156
+ 137: American coot
157
+ 138: bustard
158
+ 139: ruddy turnstone
159
+ 140: dunlin
160
+ 141: common redshank
161
+ 142: dowitcher
162
+ 143: oystercatcher
163
+ 144: pelican
164
+ 145: king penguin
165
+ 146: albatross
166
+ 147: grey whale
167
+ 148: killer whale
168
+ 149: dugong
169
+ 150: sea lion
170
+ 151: Chihuahua
171
+ 152: Japanese Chin
172
+ 153: Maltese
173
+ 154: Pekingese
174
+ 155: Shih Tzu
175
+ 156: King Charles Spaniel
176
+ 157: Papillon
177
+ 158: toy terrier
178
+ 159: Rhodesian Ridgeback
179
+ 160: Afghan Hound
180
+ 161: Basset Hound
181
+ 162: Beagle
182
+ 163: Bloodhound
183
+ 164: Bluetick Coonhound
184
+ 165: Black and Tan Coonhound
185
+ 166: Treeing Walker Coonhound
186
+ 167: English foxhound
187
+ 168: Redbone Coonhound
188
+ 169: borzoi
189
+ 170: Irish Wolfhound
190
+ 171: Italian Greyhound
191
+ 172: Whippet
192
+ 173: Ibizan Hound
193
+ 174: Norwegian Elkhound
194
+ 175: Otterhound
195
+ 176: Saluki
196
+ 177: Scottish Deerhound
197
+ 178: Weimaraner
198
+ 179: Staffordshire Bull Terrier
199
+ 180: American Staffordshire Terrier
200
+ 181: Bedlington Terrier
201
+ 182: Border Terrier
202
+ 183: Kerry Blue Terrier
203
+ 184: Irish Terrier
204
+ 185: Norfolk Terrier
205
+ 186: Norwich Terrier
206
+ 187: Yorkshire Terrier
207
+ 188: Wire Fox Terrier
208
+ 189: Lakeland Terrier
209
+ 190: Sealyham Terrier
210
+ 191: Airedale Terrier
211
+ 192: Cairn Terrier
212
+ 193: Australian Terrier
213
+ 194: Dandie Dinmont Terrier
214
+ 195: Boston Terrier
215
+ 196: Miniature Schnauzer
216
+ 197: Giant Schnauzer
217
+ 198: Standard Schnauzer
218
+ 199: Scottish Terrier
219
+ 200: Tibetan Terrier
220
+ 201: Australian Silky Terrier
221
+ 202: Soft-coated Wheaten Terrier
222
+ 203: West Highland White Terrier
223
+ 204: Lhasa Apso
224
+ 205: Flat-Coated Retriever
225
+ 206: Curly-coated Retriever
226
+ 207: Golden Retriever
227
+ 208: Labrador Retriever
228
+ 209: Chesapeake Bay Retriever
229
+ 210: German Shorthaired Pointer
230
+ 211: Vizsla
231
+ 212: English Setter
232
+ 213: Irish Setter
233
+ 214: Gordon Setter
234
+ 215: Brittany
235
+ 216: Clumber Spaniel
236
+ 217: English Springer Spaniel
237
+ 218: Welsh Springer Spaniel
238
+ 219: Cocker Spaniels
239
+ 220: Sussex Spaniel
240
+ 221: Irish Water Spaniel
241
+ 222: Kuvasz
242
+ 223: Schipperke
243
+ 224: Groenendael
244
+ 225: Malinois
245
+ 226: Briard
246
+ 227: Australian Kelpie
247
+ 228: Komondor
248
+ 229: Old English Sheepdog
249
+ 230: Shetland Sheepdog
250
+ 231: collie
251
+ 232: Border Collie
252
+ 233: Bouvier des Flandres
253
+ 234: Rottweiler
254
+ 235: German Shepherd Dog
255
+ 236: Dobermann
256
+ 237: Miniature Pinscher
257
+ 238: Greater Swiss Mountain Dog
258
+ 239: Bernese Mountain Dog
259
+ 240: Appenzeller Sennenhund
260
+ 241: Entlebucher Sennenhund
261
+ 242: Boxer
262
+ 243: Bullmastiff
263
+ 244: Tibetan Mastiff
264
+ 245: French Bulldog
265
+ 246: Great Dane
266
+ 247: St. Bernard
267
+ 248: husky
268
+ 249: Alaskan Malamute
269
+ 250: Siberian Husky
270
+ 251: Dalmatian
271
+ 252: Affenpinscher
272
+ 253: Basenji
273
+ 254: pug
274
+ 255: Leonberger
275
+ 256: Newfoundland
276
+ 257: Pyrenean Mountain Dog
277
+ 258: Samoyed
278
+ 259: Pomeranian
279
+ 260: Chow Chow
280
+ 261: Keeshond
281
+ 262: Griffon Bruxellois
282
+ 263: Pembroke Welsh Corgi
283
+ 264: Cardigan Welsh Corgi
284
+ 265: Toy Poodle
285
+ 266: Miniature Poodle
286
+ 267: Standard Poodle
287
+ 268: Mexican hairless dog
288
+ 269: grey wolf
289
+ 270: Alaskan tundra wolf
290
+ 271: red wolf
291
+ 272: coyote
292
+ 273: dingo
293
+ 274: dhole
294
+ 275: African wild dog
295
+ 276: hyena
296
+ 277: red fox
297
+ 278: kit fox
298
+ 279: Arctic fox
299
+ 280: grey fox
300
+ 281: tabby cat
301
+ 282: tiger cat
302
+ 283: Persian cat
303
+ 284: Siamese cat
304
+ 285: Egyptian Mau
305
+ 286: cougar
306
+ 287: lynx
307
+ 288: leopard
308
+ 289: snow leopard
309
+ 290: jaguar
310
+ 291: lion
311
+ 292: tiger
312
+ 293: cheetah
313
+ 294: brown bear
314
+ 295: American black bear
315
+ 296: polar bear
316
+ 297: sloth bear
317
+ 298: mongoose
318
+ 299: meerkat
319
+ 300: tiger beetle
320
+ 301: ladybug
321
+ 302: ground beetle
322
+ 303: longhorn beetle
323
+ 304: leaf beetle
324
+ 305: dung beetle
325
+ 306: rhinoceros beetle
326
+ 307: weevil
327
+ 308: fly
328
+ 309: bee
329
+ 310: ant
330
+ 311: grasshopper
331
+ 312: cricket
332
+ 313: stick insect
333
+ 314: cockroach
334
+ 315: mantis
335
+ 316: cicada
336
+ 317: leafhopper
337
+ 318: lacewing
338
+ 319: dragonfly
339
+ 320: damselfly
340
+ 321: red admiral
341
+ 322: ringlet
342
+ 323: monarch butterfly
343
+ 324: small white
344
+ 325: sulphur butterfly
345
+ 326: gossamer-winged butterfly
346
+ 327: starfish
347
+ 328: sea urchin
348
+ 329: sea cucumber
349
+ 330: cottontail rabbit
350
+ 331: hare
351
+ 332: Angora rabbit
352
+ 333: hamster
353
+ 334: porcupine
354
+ 335: fox squirrel
355
+ 336: marmot
356
+ 337: beaver
357
+ 338: guinea pig
358
+ 339: common sorrel
359
+ 340: zebra
360
+ 341: pig
361
+ 342: wild boar
362
+ 343: warthog
363
+ 344: hippopotamus
364
+ 345: ox
365
+ 346: water buffalo
366
+ 347: bison
367
+ 348: ram
368
+ 349: bighorn sheep
369
+ 350: Alpine ibex
370
+ 351: hartebeest
371
+ 352: impala
372
+ 353: gazelle
373
+ 354: dromedary
374
+ 355: llama
375
+ 356: weasel
376
+ 357: mink
377
+ 358: European polecat
378
+ 359: black-footed ferret
379
+ 360: otter
380
+ 361: skunk
381
+ 362: badger
382
+ 363: armadillo
383
+ 364: three-toed sloth
384
+ 365: orangutan
385
+ 366: gorilla
386
+ 367: chimpanzee
387
+ 368: gibbon
388
+ 369: siamang
389
+ 370: guenon
390
+ 371: patas monkey
391
+ 372: baboon
392
+ 373: macaque
393
+ 374: langur
394
+ 375: black-and-white colobus
395
+ 376: proboscis monkey
396
+ 377: marmoset
397
+ 378: white-headed capuchin
398
+ 379: howler monkey
399
+ 380: titi
400
+ 381: Geoffroy's spider monkey
401
+ 382: common squirrel monkey
402
+ 383: ring-tailed lemur
403
+ 384: indri
404
+ 385: Asian elephant
405
+ 386: African bush elephant
406
+ 387: red panda
407
+ 388: giant panda
408
+ 389: snoek
409
+ 390: eel
410
+ 391: coho salmon
411
+ 392: rock beauty
412
+ 393: clownfish
413
+ 394: sturgeon
414
+ 395: garfish
415
+ 396: lionfish
416
+ 397: pufferfish
417
+ 398: abacus
418
+ 399: abaya
419
+ 400: academic gown
420
+ 401: accordion
421
+ 402: acoustic guitar
422
+ 403: aircraft carrier
423
+ 404: airliner
424
+ 405: airship
425
+ 406: altar
426
+ 407: ambulance
427
+ 408: amphibious vehicle
428
+ 409: analog clock
429
+ 410: apiary
430
+ 411: apron
431
+ 412: waste container
432
+ 413: assault rifle
433
+ 414: backpack
434
+ 415: bakery
435
+ 416: balance beam
436
+ 417: balloon
437
+ 418: ballpoint pen
438
+ 419: Band-Aid
439
+ 420: banjo
440
+ 421: baluster
441
+ 422: barbell
442
+ 423: barber chair
443
+ 424: barbershop
444
+ 425: barn
445
+ 426: barometer
446
+ 427: barrel
447
+ 428: wheelbarrow
448
+ 429: baseball
449
+ 430: basketball
450
+ 431: bassinet
451
+ 432: bassoon
452
+ 433: swimming cap
453
+ 434: bath towel
454
+ 435: bathtub
455
+ 436: station wagon
456
+ 437: lighthouse
457
+ 438: beaker
458
+ 439: military cap
459
+ 440: beer bottle
460
+ 441: beer glass
461
+ 442: bell-cot
462
+ 443: bib
463
+ 444: tandem bicycle
464
+ 445: bikini
465
+ 446: ring binder
466
+ 447: binoculars
467
+ 448: birdhouse
468
+ 449: boathouse
469
+ 450: bobsleigh
470
+ 451: bolo tie
471
+ 452: poke bonnet
472
+ 453: bookcase
473
+ 454: bookstore
474
+ 455: bottle cap
475
+ 456: bow
476
+ 457: bow tie
477
+ 458: brass
478
+ 459: bra
479
+ 460: breakwater
480
+ 461: breastplate
481
+ 462: broom
482
+ 463: bucket
483
+ 464: buckle
484
+ 465: bulletproof vest
485
+ 466: high-speed train
486
+ 467: butcher shop
487
+ 468: taxicab
488
+ 469: cauldron
489
+ 470: candle
490
+ 471: cannon
491
+ 472: canoe
492
+ 473: can opener
493
+ 474: cardigan
494
+ 475: car mirror
495
+ 476: carousel
496
+ 477: tool kit
497
+ 478: carton
498
+ 479: car wheel
499
+ 480: automated teller machine
500
+ 481: cassette
501
+ 482: cassette player
502
+ 483: castle
503
+ 484: catamaran
504
+ 485: CD player
505
+ 486: cello
506
+ 487: mobile phone
507
+ 488: chain
508
+ 489: chain-link fence
509
+ 490: chain mail
510
+ 491: chainsaw
511
+ 492: chest
512
+ 493: chiffonier
513
+ 494: chime
514
+ 495: china cabinet
515
+ 496: Christmas stocking
516
+ 497: church
517
+ 498: movie theater
518
+ 499: cleaver
519
+ 500: cliff dwelling
520
+ 501: cloak
521
+ 502: clogs
522
+ 503: cocktail shaker
523
+ 504: coffee mug
524
+ 505: coffeemaker
525
+ 506: coil
526
+ 507: combination lock
527
+ 508: computer keyboard
528
+ 509: confectionery store
529
+ 510: container ship
530
+ 511: convertible
531
+ 512: corkscrew
532
+ 513: cornet
533
+ 514: cowboy boot
534
+ 515: cowboy hat
535
+ 516: cradle
536
+ 517: crane (machine)
537
+ 518: crash helmet
538
+ 519: crate
539
+ 520: infant bed
540
+ 521: Crock Pot
541
+ 522: croquet ball
542
+ 523: crutch
543
+ 524: cuirass
544
+ 525: dam
545
+ 526: desk
546
+ 527: desktop computer
547
+ 528: rotary dial telephone
548
+ 529: diaper
549
+ 530: digital clock
550
+ 531: digital watch
551
+ 532: dining table
552
+ 533: dishcloth
553
+ 534: dishwasher
554
+ 535: disc brake
555
+ 536: dock
556
+ 537: dog sled
557
+ 538: dome
558
+ 539: doormat
559
+ 540: drilling rig
560
+ 541: drum
561
+ 542: drumstick
562
+ 543: dumbbell
563
+ 544: Dutch oven
564
+ 545: electric fan
565
+ 546: electric guitar
566
+ 547: electric locomotive
567
+ 548: entertainment center
568
+ 549: envelope
569
+ 550: espresso machine
570
+ 551: face powder
571
+ 552: feather boa
572
+ 553: filing cabinet
573
+ 554: fireboat
574
+ 555: fire engine
575
+ 556: fire screen sheet
576
+ 557: flagpole
577
+ 558: flute
578
+ 559: folding chair
579
+ 560: football helmet
580
+ 561: forklift
581
+ 562: fountain
582
+ 563: fountain pen
583
+ 564: four-poster bed
584
+ 565: freight car
585
+ 566: French horn
586
+ 567: frying pan
587
+ 568: fur coat
588
+ 569: garbage truck
589
+ 570: gas mask
590
+ 571: gas pump
591
+ 572: goblet
592
+ 573: go-kart
593
+ 574: golf ball
594
+ 575: golf cart
595
+ 576: gondola
596
+ 577: gong
597
+ 578: gown
598
+ 579: grand piano
599
+ 580: greenhouse
600
+ 581: grille
601
+ 582: grocery store
602
+ 583: guillotine
603
+ 584: barrette
604
+ 585: hair spray
605
+ 586: half-track
606
+ 587: hammer
607
+ 588: hamper
608
+ 589: hair dryer
609
+ 590: hand-held computer
610
+ 591: handkerchief
611
+ 592: hard disk drive
612
+ 593: harmonica
613
+ 594: harp
614
+ 595: harvester
615
+ 596: hatchet
616
+ 597: holster
617
+ 598: home theater
618
+ 599: honeycomb
619
+ 600: hook
620
+ 601: hoop skirt
621
+ 602: horizontal bar
622
+ 603: horse-drawn vehicle
623
+ 604: hourglass
624
+ 605: iPod
625
+ 606: clothes iron
626
+ 607: jack-o'-lantern
627
+ 608: jeans
628
+ 609: jeep
629
+ 610: T-shirt
630
+ 611: jigsaw puzzle
631
+ 612: pulled rickshaw
632
+ 613: joystick
633
+ 614: kimono
634
+ 615: knee pad
635
+ 616: knot
636
+ 617: lab coat
637
+ 618: ladle
638
+ 619: lampshade
639
+ 620: laptop computer
640
+ 621: lawn mower
641
+ 622: lens cap
642
+ 623: paper knife
643
+ 624: library
644
+ 625: lifeboat
645
+ 626: lighter
646
+ 627: limousine
647
+ 628: ocean liner
648
+ 629: lipstick
649
+ 630: slip-on shoe
650
+ 631: lotion
651
+ 632: speaker
652
+ 633: loupe
653
+ 634: sawmill
654
+ 635: magnetic compass
655
+ 636: mail bag
656
+ 637: mailbox
657
+ 638: tights
658
+ 639: tank suit
659
+ 640: manhole cover
660
+ 641: maraca
661
+ 642: marimba
662
+ 643: mask
663
+ 644: match
664
+ 645: maypole
665
+ 646: maze
666
+ 647: measuring cup
667
+ 648: medicine chest
668
+ 649: megalith
669
+ 650: microphone
670
+ 651: microwave oven
671
+ 652: military uniform
672
+ 653: milk can
673
+ 654: minibus
674
+ 655: miniskirt
675
+ 656: minivan
676
+ 657: missile
677
+ 658: mitten
678
+ 659: mixing bowl
679
+ 660: mobile home
680
+ 661: Model T
681
+ 662: modem
682
+ 663: monastery
683
+ 664: monitor
684
+ 665: moped
685
+ 666: mortar
686
+ 667: square academic cap
687
+ 668: mosque
688
+ 669: mosquito net
689
+ 670: scooter
690
+ 671: mountain bike
691
+ 672: tent
692
+ 673: computer mouse
693
+ 674: mousetrap
694
+ 675: moving van
695
+ 676: muzzle
696
+ 677: nail
697
+ 678: neck brace
698
+ 679: necklace
699
+ 680: nipple
700
+ 681: notebook computer
701
+ 682: obelisk
702
+ 683: oboe
703
+ 684: ocarina
704
+ 685: odometer
705
+ 686: oil filter
706
+ 687: organ
707
+ 688: oscilloscope
708
+ 689: overskirt
709
+ 690: bullock cart
710
+ 691: oxygen mask
711
+ 692: packet
712
+ 693: paddle
713
+ 694: paddle wheel
714
+ 695: padlock
715
+ 696: paintbrush
716
+ 697: pajamas
717
+ 698: palace
718
+ 699: pan flute
719
+ 700: paper towel
720
+ 701: parachute
721
+ 702: parallel bars
722
+ 703: park bench
723
+ 704: parking meter
724
+ 705: passenger car
725
+ 706: patio
726
+ 707: payphone
727
+ 708: pedestal
728
+ 709: pencil case
729
+ 710: pencil sharpener
730
+ 711: perfume
731
+ 712: Petri dish
732
+ 713: photocopier
733
+ 714: plectrum
734
+ 715: Pickelhaube
735
+ 716: picket fence
736
+ 717: pickup truck
737
+ 718: pier
738
+ 719: piggy bank
739
+ 720: pill bottle
740
+ 721: pillow
741
+ 722: ping-pong ball
742
+ 723: pinwheel
743
+ 724: pirate ship
744
+ 725: pitcher
745
+ 726: hand plane
746
+ 727: planetarium
747
+ 728: plastic bag
748
+ 729: plate rack
749
+ 730: plow
750
+ 731: plunger
751
+ 732: Polaroid camera
752
+ 733: pole
753
+ 734: police van
754
+ 735: poncho
755
+ 736: billiard table
756
+ 737: soda bottle
757
+ 738: pot
758
+ 739: potter's wheel
759
+ 740: power drill
760
+ 741: prayer rug
761
+ 742: printer
762
+ 743: prison
763
+ 744: projectile
764
+ 745: projector
765
+ 746: hockey puck
766
+ 747: punching bag
767
+ 748: purse
768
+ 749: quill
769
+ 750: quilt
770
+ 751: race car
771
+ 752: racket
772
+ 753: radiator
773
+ 754: radio
774
+ 755: radio telescope
775
+ 756: rain barrel
776
+ 757: recreational vehicle
777
+ 758: reel
778
+ 759: reflex camera
779
+ 760: refrigerator
780
+ 761: remote control
781
+ 762: restaurant
782
+ 763: revolver
783
+ 764: rifle
784
+ 765: rocking chair
785
+ 766: rotisserie
786
+ 767: eraser
787
+ 768: rugby ball
788
+ 769: ruler
789
+ 770: running shoe
790
+ 771: safe
791
+ 772: safety pin
792
+ 773: salt shaker
793
+ 774: sandal
794
+ 775: sarong
795
+ 776: saxophone
796
+ 777: scabbard
797
+ 778: weighing scale
798
+ 779: school bus
799
+ 780: schooner
800
+ 781: scoreboard
801
+ 782: CRT screen
802
+ 783: screw
803
+ 784: screwdriver
804
+ 785: seat belt
805
+ 786: sewing machine
806
+ 787: shield
807
+ 788: shoe store
808
+ 789: shoji
809
+ 790: shopping basket
810
+ 791: shopping cart
811
+ 792: shovel
812
+ 793: shower cap
813
+ 794: shower curtain
814
+ 795: ski
815
+ 796: ski mask
816
+ 797: sleeping bag
817
+ 798: slide rule
818
+ 799: sliding door
819
+ 800: slot machine
820
+ 801: snorkel
821
+ 802: snowmobile
822
+ 803: snowplow
823
+ 804: soap dispenser
824
+ 805: soccer ball
825
+ 806: sock
826
+ 807: solar thermal collector
827
+ 808: sombrero
828
+ 809: soup bowl
829
+ 810: space bar
830
+ 811: space heater
831
+ 812: space shuttle
832
+ 813: spatula
833
+ 814: motorboat
834
+ 815: spider web
835
+ 816: spindle
836
+ 817: sports car
837
+ 818: spotlight
838
+ 819: stage
839
+ 820: steam locomotive
840
+ 821: through arch bridge
841
+ 822: steel drum
842
+ 823: stethoscope
843
+ 824: scarf
844
+ 825: stone wall
845
+ 826: stopwatch
846
+ 827: stove
847
+ 828: strainer
848
+ 829: tram
849
+ 830: stretcher
850
+ 831: couch
851
+ 832: stupa
852
+ 833: submarine
853
+ 834: suit
854
+ 835: sundial
855
+ 836: sunglass
856
+ 837: sunglasses
857
+ 838: sunscreen
858
+ 839: suspension bridge
859
+ 840: mop
860
+ 841: sweatshirt
861
+ 842: swimsuit
862
+ 843: swing
863
+ 844: switch
864
+ 845: syringe
865
+ 846: table lamp
866
+ 847: tank
867
+ 848: tape player
868
+ 849: teapot
869
+ 850: teddy bear
870
+ 851: television
871
+ 852: tennis ball
872
+ 853: thatched roof
873
+ 854: front curtain
874
+ 855: thimble
875
+ 856: threshing machine
876
+ 857: throne
877
+ 858: tile roof
878
+ 859: toaster
879
+ 860: tobacco shop
880
+ 861: toilet seat
881
+ 862: torch
882
+ 863: totem pole
883
+ 864: tow truck
884
+ 865: toy store
885
+ 866: tractor
886
+ 867: semi-trailer truck
887
+ 868: tray
888
+ 869: trench coat
889
+ 870: tricycle
890
+ 871: trimaran
891
+ 872: tripod
892
+ 873: triumphal arch
893
+ 874: trolleybus
894
+ 875: trombone
895
+ 876: tub
896
+ 877: turnstile
897
+ 878: typewriter keyboard
898
+ 879: umbrella
899
+ 880: unicycle
900
+ 881: upright piano
901
+ 882: vacuum cleaner
902
+ 883: vase
903
+ 884: vault
904
+ 885: velvet
905
+ 886: vending machine
906
+ 887: vestment
907
+ 888: viaduct
908
+ 889: violin
909
+ 890: volleyball
910
+ 891: waffle iron
911
+ 892: wall clock
912
+ 893: wallet
913
+ 894: wardrobe
914
+ 895: military aircraft
915
+ 896: sink
916
+ 897: washing machine
917
+ 898: water bottle
918
+ 899: water jug
919
+ 900: water tower
920
+ 901: whiskey jug
921
+ 902: whistle
922
+ 903: wig
923
+ 904: window screen
924
+ 905: window shade
925
+ 906: Windsor tie
926
+ 907: wine bottle
927
+ 908: wing
928
+ 909: wok
929
+ 910: wooden spoon
930
+ 911: wool
931
+ 912: split-rail fence
932
+ 913: shipwreck
933
+ 914: yawl
934
+ 915: yurt
935
+ 916: website
936
+ 917: comic book
937
+ 918: crossword
938
+ 919: traffic sign
939
+ 920: traffic light
940
+ 921: dust jacket
941
+ 922: menu
942
+ 923: plate
943
+ 924: guacamole
944
+ 925: consomme
945
+ 926: hot pot
946
+ 927: trifle
947
+ 928: ice cream
948
+ 929: ice pop
949
+ 930: baguette
950
+ 931: bagel
951
+ 932: pretzel
952
+ 933: cheeseburger
953
+ 934: hot dog
954
+ 935: mashed potato
955
+ 936: cabbage
956
+ 937: broccoli
957
+ 938: cauliflower
958
+ 939: zucchini
959
+ 940: spaghetti squash
960
+ 941: acorn squash
961
+ 942: butternut squash
962
+ 943: cucumber
963
+ 944: artichoke
964
+ 945: bell pepper
965
+ 946: cardoon
966
+ 947: mushroom
967
+ 948: Granny Smith
968
+ 949: strawberry
969
+ 950: orange
970
+ 951: lemon
971
+ 952: fig
972
+ 953: pineapple
973
+ 954: banana
974
+ 955: jackfruit
975
+ 956: custard apple
976
+ 957: pomegranate
977
+ 958: hay
978
+ 959: carbonara
979
+ 960: chocolate syrup
980
+ 961: dough
981
+ 962: meatloaf
982
+ 963: pizza
983
+ 964: pot pie
984
+ 965: burrito
985
+ 966: red wine
986
+ 967: espresso
987
+ 968: cup
988
+ 969: eggnog
989
+ 970: alp
990
+ 971: bubble
991
+ 972: cliff
992
+ 973: coral reef
993
+ 974: geyser
994
+ 975: lakeshore
995
+ 976: promontory
996
+ 977: shoal
997
+ 978: seashore
998
+ 979: valley
999
+ 980: volcano
1000
+ 981: baseball player
1001
+ 982: bridegroom
1002
+ 983: scuba diver
1003
+ 984: rapeseed
1004
+ 985: daisy
1005
+ 986: yellow lady's slipper
1006
+ 987: corn
1007
+ 988: acorn
1008
+ 989: rose hip
1009
+ 990: horse chestnut seed
1010
+ 991: coral fungus
1011
+ 992: agaric
1012
+ 993: gyromitra
1013
+ 994: stinkhorn mushroom
1014
+ 995: earth star
1015
+ 996: hen-of-the-woods
1016
+ 997: bolete
1017
+ 998: ear
1018
+ 999: toilet paper
1019
+
1020
+
1021
+ # Download script/URL (optional)
1022
+ download: data/scripts/get_imagenet.sh
data/Objects365.yaml ADDED
@@ -0,0 +1,438 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # Objects365 dataset https://www.objects365.org/ by Megvii
3
+ # Example usage: python train.py --data Objects365.yaml
4
+ # parent
5
+ # ├── yolov5
6
+ # └── datasets
7
+ # └── Objects365 ← downloads here (712 GB = 367G data + 345G zips)
8
+
9
+
10
+ # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
11
+ path: ../datasets/Objects365 # dataset root dir
12
+ train: images/train # train images (relative to 'path') 1742289 images
13
+ val: images/val # val images (relative to 'path') 80000 images
14
+ test: # test images (optional)
15
+
16
+ # Classes
17
+ names:
18
+ 0: Person
19
+ 1: Sneakers
20
+ 2: Chair
21
+ 3: Other Shoes
22
+ 4: Hat
23
+ 5: Car
24
+ 6: Lamp
25
+ 7: Glasses
26
+ 8: Bottle
27
+ 9: Desk
28
+ 10: Cup
29
+ 11: Street Lights
30
+ 12: Cabinet/shelf
31
+ 13: Handbag/Satchel
32
+ 14: Bracelet
33
+ 15: Plate
34
+ 16: Picture/Frame
35
+ 17: Helmet
36
+ 18: Book
37
+ 19: Gloves
38
+ 20: Storage box
39
+ 21: Boat
40
+ 22: Leather Shoes
41
+ 23: Flower
42
+ 24: Bench
43
+ 25: Potted Plant
44
+ 26: Bowl/Basin
45
+ 27: Flag
46
+ 28: Pillow
47
+ 29: Boots
48
+ 30: Vase
49
+ 31: Microphone
50
+ 32: Necklace
51
+ 33: Ring
52
+ 34: SUV
53
+ 35: Wine Glass
54
+ 36: Belt
55
+ 37: Monitor/TV
56
+ 38: Backpack
57
+ 39: Umbrella
58
+ 40: Traffic Light
59
+ 41: Speaker
60
+ 42: Watch
61
+ 43: Tie
62
+ 44: Trash bin Can
63
+ 45: Slippers
64
+ 46: Bicycle
65
+ 47: Stool
66
+ 48: Barrel/bucket
67
+ 49: Van
68
+ 50: Couch
69
+ 51: Sandals
70
+ 52: Basket
71
+ 53: Drum
72
+ 54: Pen/Pencil
73
+ 55: Bus
74
+ 56: Wild Bird
75
+ 57: High Heels
76
+ 58: Motorcycle
77
+ 59: Guitar
78
+ 60: Carpet
79
+ 61: Cell Phone
80
+ 62: Bread
81
+ 63: Camera
82
+ 64: Canned
83
+ 65: Truck
84
+ 66: Traffic cone
85
+ 67: Cymbal
86
+ 68: Lifesaver
87
+ 69: Towel
88
+ 70: Stuffed Toy
89
+ 71: Candle
90
+ 72: Sailboat
91
+ 73: Laptop
92
+ 74: Awning
93
+ 75: Bed
94
+ 76: Faucet
95
+ 77: Tent
96
+ 78: Horse
97
+ 79: Mirror
98
+ 80: Power outlet
99
+ 81: Sink
100
+ 82: Apple
101
+ 83: Air Conditioner
102
+ 84: Knife
103
+ 85: Hockey Stick
104
+ 86: Paddle
105
+ 87: Pickup Truck
106
+ 88: Fork
107
+ 89: Traffic Sign
108
+ 90: Balloon
109
+ 91: Tripod
110
+ 92: Dog
111
+ 93: Spoon
112
+ 94: Clock
113
+ 95: Pot
114
+ 96: Cow
115
+ 97: Cake
116
+ 98: Dinning Table
117
+ 99: Sheep
118
+ 100: Hanger
119
+ 101: Blackboard/Whiteboard
120
+ 102: Napkin
121
+ 103: Other Fish
122
+ 104: Orange/Tangerine
123
+ 105: Toiletry
124
+ 106: Keyboard
125
+ 107: Tomato
126
+ 108: Lantern
127
+ 109: Machinery Vehicle
128
+ 110: Fan
129
+ 111: Green Vegetables
130
+ 112: Banana
131
+ 113: Baseball Glove
132
+ 114: Airplane
133
+ 115: Mouse
134
+ 116: Train
135
+ 117: Pumpkin
136
+ 118: Soccer
137
+ 119: Skiboard
138
+ 120: Luggage
139
+ 121: Nightstand
140
+ 122: Tea pot
141
+ 123: Telephone
142
+ 124: Trolley
143
+ 125: Head Phone
144
+ 126: Sports Car
145
+ 127: Stop Sign
146
+ 128: Dessert
147
+ 129: Scooter
148
+ 130: Stroller
149
+ 131: Crane
150
+ 132: Remote
151
+ 133: Refrigerator
152
+ 134: Oven
153
+ 135: Lemon
154
+ 136: Duck
155
+ 137: Baseball Bat
156
+ 138: Surveillance Camera
157
+ 139: Cat
158
+ 140: Jug
159
+ 141: Broccoli
160
+ 142: Piano
161
+ 143: Pizza
162
+ 144: Elephant
163
+ 145: Skateboard
164
+ 146: Surfboard
165
+ 147: Gun
166
+ 148: Skating and Skiing shoes
167
+ 149: Gas stove
168
+ 150: Donut
169
+ 151: Bow Tie
170
+ 152: Carrot
171
+ 153: Toilet
172
+ 154: Kite
173
+ 155: Strawberry
174
+ 156: Other Balls
175
+ 157: Shovel
176
+ 158: Pepper
177
+ 159: Computer Box
178
+ 160: Toilet Paper
179
+ 161: Cleaning Products
180
+ 162: Chopsticks
181
+ 163: Microwave
182
+ 164: Pigeon
183
+ 165: Baseball
184
+ 166: Cutting/chopping Board
185
+ 167: Coffee Table
186
+ 168: Side Table
187
+ 169: Scissors
188
+ 170: Marker
189
+ 171: Pie
190
+ 172: Ladder
191
+ 173: Snowboard
192
+ 174: Cookies
193
+ 175: Radiator
194
+ 176: Fire Hydrant
195
+ 177: Basketball
196
+ 178: Zebra
197
+ 179: Grape
198
+ 180: Giraffe
199
+ 181: Potato
200
+ 182: Sausage
201
+ 183: Tricycle
202
+ 184: Violin
203
+ 185: Egg
204
+ 186: Fire Extinguisher
205
+ 187: Candy
206
+ 188: Fire Truck
207
+ 189: Billiards
208
+ 190: Converter
209
+ 191: Bathtub
210
+ 192: Wheelchair
211
+ 193: Golf Club
212
+ 194: Briefcase
213
+ 195: Cucumber
214
+ 196: Cigar/Cigarette
215
+ 197: Paint Brush
216
+ 198: Pear
217
+ 199: Heavy Truck
218
+ 200: Hamburger
219
+ 201: Extractor
220
+ 202: Extension Cord
221
+ 203: Tong
222
+ 204: Tennis Racket
223
+ 205: Folder
224
+ 206: American Football
225
+ 207: earphone
226
+ 208: Mask
227
+ 209: Kettle
228
+ 210: Tennis
229
+ 211: Ship
230
+ 212: Swing
231
+ 213: Coffee Machine
232
+ 214: Slide
233
+ 215: Carriage
234
+ 216: Onion
235
+ 217: Green beans
236
+ 218: Projector
237
+ 219: Frisbee
238
+ 220: Washing Machine/Drying Machine
239
+ 221: Chicken
240
+ 222: Printer
241
+ 223: Watermelon
242
+ 224: Saxophone
243
+ 225: Tissue
244
+ 226: Toothbrush
245
+ 227: Ice cream
246
+ 228: Hot-air balloon
247
+ 229: Cello
248
+ 230: French Fries
249
+ 231: Scale
250
+ 232: Trophy
251
+ 233: Cabbage
252
+ 234: Hot dog
253
+ 235: Blender
254
+ 236: Peach
255
+ 237: Rice
256
+ 238: Wallet/Purse
257
+ 239: Volleyball
258
+ 240: Deer
259
+ 241: Goose
260
+ 242: Tape
261
+ 243: Tablet
262
+ 244: Cosmetics
263
+ 245: Trumpet
264
+ 246: Pineapple
265
+ 247: Golf Ball
266
+ 248: Ambulance
267
+ 249: Parking meter
268
+ 250: Mango
269
+ 251: Key
270
+ 252: Hurdle
271
+ 253: Fishing Rod
272
+ 254: Medal
273
+ 255: Flute
274
+ 256: Brush
275
+ 257: Penguin
276
+ 258: Megaphone
277
+ 259: Corn
278
+ 260: Lettuce
279
+ 261: Garlic
280
+ 262: Swan
281
+ 263: Helicopter
282
+ 264: Green Onion
283
+ 265: Sandwich
284
+ 266: Nuts
285
+ 267: Speed Limit Sign
286
+ 268: Induction Cooker
287
+ 269: Broom
288
+ 270: Trombone
289
+ 271: Plum
290
+ 272: Rickshaw
291
+ 273: Goldfish
292
+ 274: Kiwi fruit
293
+ 275: Router/modem
294
+ 276: Poker Card
295
+ 277: Toaster
296
+ 278: Shrimp
297
+ 279: Sushi
298
+ 280: Cheese
299
+ 281: Notepaper
300
+ 282: Cherry
301
+ 283: Pliers
302
+ 284: CD
303
+ 285: Pasta
304
+ 286: Hammer
305
+ 287: Cue
306
+ 288: Avocado
307
+ 289: Hamimelon
308
+ 290: Flask
309
+ 291: Mushroom
310
+ 292: Screwdriver
311
+ 293: Soap
312
+ 294: Recorder
313
+ 295: Bear
314
+ 296: Eggplant
315
+ 297: Board Eraser
316
+ 298: Coconut
317
+ 299: Tape Measure/Ruler
318
+ 300: Pig
319
+ 301: Showerhead
320
+ 302: Globe
321
+ 303: Chips
322
+ 304: Steak
323
+ 305: Crosswalk Sign
324
+ 306: Stapler
325
+ 307: Camel
326
+ 308: Formula 1
327
+ 309: Pomegranate
328
+ 310: Dishwasher
329
+ 311: Crab
330
+ 312: Hoverboard
331
+ 313: Meat ball
332
+ 314: Rice Cooker
333
+ 315: Tuba
334
+ 316: Calculator
335
+ 317: Papaya
336
+ 318: Antelope
337
+ 319: Parrot
338
+ 320: Seal
339
+ 321: Butterfly
340
+ 322: Dumbbell
341
+ 323: Donkey
342
+ 324: Lion
343
+ 325: Urinal
344
+ 326: Dolphin
345
+ 327: Electric Drill
346
+ 328: Hair Dryer
347
+ 329: Egg tart
348
+ 330: Jellyfish
349
+ 331: Treadmill
350
+ 332: Lighter
351
+ 333: Grapefruit
352
+ 334: Game board
353
+ 335: Mop
354
+ 336: Radish
355
+ 337: Baozi
356
+ 338: Target
357
+ 339: French
358
+ 340: Spring Rolls
359
+ 341: Monkey
360
+ 342: Rabbit
361
+ 343: Pencil Case
362
+ 344: Yak
363
+ 345: Red Cabbage
364
+ 346: Binoculars
365
+ 347: Asparagus
366
+ 348: Barbell
367
+ 349: Scallop
368
+ 350: Noddles
369
+ 351: Comb
370
+ 352: Dumpling
371
+ 353: Oyster
372
+ 354: Table Tennis paddle
373
+ 355: Cosmetics Brush/Eyeliner Pencil
374
+ 356: Chainsaw
375
+ 357: Eraser
376
+ 358: Lobster
377
+ 359: Durian
378
+ 360: Okra
379
+ 361: Lipstick
380
+ 362: Cosmetics Mirror
381
+ 363: Curling
382
+ 364: Table Tennis
383
+
384
+
385
+ # Download script/URL (optional) ---------------------------------------------------------------------------------------
386
+ download: |
387
+ from tqdm import tqdm
388
+
389
+ from utils.general import Path, check_requirements, download, np, xyxy2xywhn
390
+
391
+ check_requirements(('pycocotools>=2.0',))
392
+ from pycocotools.coco import COCO
393
+
394
+ # Make Directories
395
+ dir = Path(yaml['path']) # dataset root dir
396
+ for p in 'images', 'labels':
397
+ (dir / p).mkdir(parents=True, exist_ok=True)
398
+ for q in 'train', 'val':
399
+ (dir / p / q).mkdir(parents=True, exist_ok=True)
400
+
401
+ # Train, Val Splits
402
+ for split, patches in [('train', 50 + 1), ('val', 43 + 1)]:
403
+ print(f"Processing {split} in {patches} patches ...")
404
+ images, labels = dir / 'images' / split, dir / 'labels' / split
405
+
406
+ # Download
407
+ url = f"https://dorc.ks3-cn-beijing.ksyun.com/data-set/2020Objects365%E6%95%B0%E6%8D%AE%E9%9B%86/{split}/"
408
+ if split == 'train':
409
+ download([f'{url}zhiyuan_objv2_{split}.tar.gz'], dir=dir, delete=False) # annotations json
410
+ download([f'{url}patch{i}.tar.gz' for i in range(patches)], dir=images, curl=True, delete=False, threads=8)
411
+ elif split == 'val':
412
+ download([f'{url}zhiyuan_objv2_{split}.json'], dir=dir, delete=False) # annotations json
413
+ download([f'{url}images/v1/patch{i}.tar.gz' for i in range(15 + 1)], dir=images, curl=True, delete=False, threads=8)
414
+ download([f'{url}images/v2/patch{i}.tar.gz' for i in range(16, patches)], dir=images, curl=True, delete=False, threads=8)
415
+
416
+ # Move
417
+ for f in tqdm(images.rglob('*.jpg'), desc=f'Moving {split} images'):
418
+ f.rename(images / f.name) # move to /images/{split}
419
+
420
+ # Labels
421
+ coco = COCO(dir / f'zhiyuan_objv2_{split}.json')
422
+ names = [x["name"] for x in coco.loadCats(coco.getCatIds())]
423
+ for cid, cat in enumerate(names):
424
+ catIds = coco.getCatIds(catNms=[cat])
425
+ imgIds = coco.getImgIds(catIds=catIds)
426
+ for im in tqdm(coco.loadImgs(imgIds), desc=f'Class {cid + 1}/{len(names)} {cat}'):
427
+ width, height = im["width"], im["height"]
428
+ path = Path(im["file_name"]) # image filename
429
+ try:
430
+ with open(labels / path.with_suffix('.txt').name, 'a') as file:
431
+ annIds = coco.getAnnIds(imgIds=im["id"], catIds=catIds, iscrowd=None)
432
+ for a in coco.loadAnns(annIds):
433
+ x, y, w, h = a['bbox'] # bounding box in xywh (xy top-left corner)
434
+ xyxy = np.array([x, y, x + w, y + h])[None] # pixels(1,4)
435
+ x, y, w, h = xyxy2xywhn(xyxy, w=width, h=height, clip=True)[0] # normalized and clipped
436
+ file.write(f"{cid} {x:.5f} {y:.5f} {w:.5f} {h:.5f}\n")
437
+ except Exception as e:
438
+ print(e)
data/SKU-110K.yaml ADDED
@@ -0,0 +1,53 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # SKU-110K retail items dataset https://github.com/eg4000/SKU110K_CVPR19 by Trax Retail
3
+ # Example usage: python train.py --data SKU-110K.yaml
4
+ # parent
5
+ # ├── yolov5
6
+ # └── datasets
7
+ # └── SKU-110K ← downloads here (13.6 GB)
8
+
9
+
10
+ # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
11
+ path: ../datasets/SKU-110K # dataset root dir
12
+ train: train.txt # train images (relative to 'path') 8219 images
13
+ val: val.txt # val images (relative to 'path') 588 images
14
+ test: test.txt # test images (optional) 2936 images
15
+
16
+ # Classes
17
+ names:
18
+ 0: object
19
+
20
+
21
+ # Download script/URL (optional) ---------------------------------------------------------------------------------------
22
+ download: |
23
+ import shutil
24
+ from tqdm import tqdm
25
+ from utils.general import np, pd, Path, download, xyxy2xywh
26
+
27
+
28
+ # Download
29
+ dir = Path(yaml['path']) # dataset root dir
30
+ parent = Path(dir.parent) # download dir
31
+ urls = ['http://trax-geometry.s3.amazonaws.com/cvpr_challenge/SKU110K_fixed.tar.gz']
32
+ download(urls, dir=parent, delete=False)
33
+
34
+ # Rename directories
35
+ if dir.exists():
36
+ shutil.rmtree(dir)
37
+ (parent / 'SKU110K_fixed').rename(dir) # rename dir
38
+ (dir / 'labels').mkdir(parents=True, exist_ok=True) # create labels dir
39
+
40
+ # Convert labels
41
+ names = 'image', 'x1', 'y1', 'x2', 'y2', 'class', 'image_width', 'image_height' # column names
42
+ for d in 'annotations_train.csv', 'annotations_val.csv', 'annotations_test.csv':
43
+ x = pd.read_csv(dir / 'annotations' / d, names=names).values # annotations
44
+ images, unique_images = x[:, 0], np.unique(x[:, 0])
45
+ with open((dir / d).with_suffix('.txt').__str__().replace('annotations_', ''), 'w') as f:
46
+ f.writelines(f'./images/{s}\n' for s in unique_images)
47
+ for im in tqdm(unique_images, desc=f'Converting {dir / d}'):
48
+ cls = 0 # single-class dataset
49
+ with open((dir / 'labels' / im).with_suffix('.txt'), 'a') as f:
50
+ for r in x[images == im]:
51
+ w, h = r[6], r[7] # image width, height
52
+ xywh = xyxy2xywh(np.array([[r[1] / w, r[2] / h, r[3] / w, r[4] / h]]))[0] # instance
53
+ f.write(f"{cls} {xywh[0]:.5f} {xywh[1]:.5f} {xywh[2]:.5f} {xywh[3]:.5f}\n") # write label
data/VOC.yaml ADDED
@@ -0,0 +1,100 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # PASCAL VOC dataset http://host.robots.ox.ac.uk/pascal/VOC by University of Oxford
3
+ # Example usage: python train.py --data VOC.yaml
4
+ # parent
5
+ # ├── yolov5
6
+ # └── datasets
7
+ # └── VOC ← downloads here (2.8 GB)
8
+
9
+
10
+ # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
11
+ path: ../datasets/VOC
12
+ train: # train images (relative to 'path') 16551 images
13
+ - images/train2012
14
+ - images/train2007
15
+ - images/val2012
16
+ - images/val2007
17
+ val: # val images (relative to 'path') 4952 images
18
+ - images/test2007
19
+ test: # test images (optional)
20
+ - images/test2007
21
+
22
+ # Classes
23
+ names:
24
+ 0: aeroplane
25
+ 1: bicycle
26
+ 2: bird
27
+ 3: boat
28
+ 4: bottle
29
+ 5: bus
30
+ 6: car
31
+ 7: cat
32
+ 8: chair
33
+ 9: cow
34
+ 10: diningtable
35
+ 11: dog
36
+ 12: horse
37
+ 13: motorbike
38
+ 14: person
39
+ 15: pottedplant
40
+ 16: sheep
41
+ 17: sofa
42
+ 18: train
43
+ 19: tvmonitor
44
+
45
+
46
+ # Download script/URL (optional) ---------------------------------------------------------------------------------------
47
+ download: |
48
+ import xml.etree.ElementTree as ET
49
+
50
+ from tqdm import tqdm
51
+ from utils.general import download, Path
52
+
53
+
54
+ def convert_label(path, lb_path, year, image_id):
55
+ def convert_box(size, box):
56
+ dw, dh = 1. / size[0], 1. / size[1]
57
+ x, y, w, h = (box[0] + box[1]) / 2.0 - 1, (box[2] + box[3]) / 2.0 - 1, box[1] - box[0], box[3] - box[2]
58
+ return x * dw, y * dh, w * dw, h * dh
59
+
60
+ in_file = open(path / f'VOC{year}/Annotations/{image_id}.xml')
61
+ out_file = open(lb_path, 'w')
62
+ tree = ET.parse(in_file)
63
+ root = tree.getroot()
64
+ size = root.find('size')
65
+ w = int(size.find('width').text)
66
+ h = int(size.find('height').text)
67
+
68
+ names = list(yaml['names'].values()) # names list
69
+ for obj in root.iter('object'):
70
+ cls = obj.find('name').text
71
+ if cls in names and int(obj.find('difficult').text) != 1:
72
+ xmlbox = obj.find('bndbox')
73
+ bb = convert_box((w, h), [float(xmlbox.find(x).text) for x in ('xmin', 'xmax', 'ymin', 'ymax')])
74
+ cls_id = names.index(cls) # class id
75
+ out_file.write(" ".join([str(a) for a in (cls_id, *bb)]) + '\n')
76
+
77
+
78
+ # Download
79
+ dir = Path(yaml['path']) # dataset root dir
80
+ url = 'https://github.com/ultralytics/yolov5/releases/download/v1.0/'
81
+ urls = [f'{url}VOCtrainval_06-Nov-2007.zip', # 446MB, 5012 images
82
+ f'{url}VOCtest_06-Nov-2007.zip', # 438MB, 4953 images
83
+ f'{url}VOCtrainval_11-May-2012.zip'] # 1.95GB, 17126 images
84
+ download(urls, dir=dir / 'images', delete=False, curl=True, threads=3)
85
+
86
+ # Convert
87
+ path = dir / 'images/VOCdevkit'
88
+ for year, image_set in ('2012', 'train'), ('2012', 'val'), ('2007', 'train'), ('2007', 'val'), ('2007', 'test'):
89
+ imgs_path = dir / 'images' / f'{image_set}{year}'
90
+ lbs_path = dir / 'labels' / f'{image_set}{year}'
91
+ imgs_path.mkdir(exist_ok=True, parents=True)
92
+ lbs_path.mkdir(exist_ok=True, parents=True)
93
+
94
+ with open(path / f'VOC{year}/ImageSets/Main/{image_set}.txt') as f:
95
+ image_ids = f.read().strip().split()
96
+ for id in tqdm(image_ids, desc=f'{image_set}{year}'):
97
+ f = path / f'VOC{year}/JPEGImages/{id}.jpg' # old img path
98
+ lb_path = (lbs_path / f.name).with_suffix('.txt') # new label path
99
+ f.rename(imgs_path / f.name) # move image
100
+ convert_label(path, lb_path, year, id) # convert labels to YOLO format
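Note that `convert_box` above receives the corners in `(xmin, xmax, ymin, ymax)` order, not the more common `(xmin, ymin, xmax, ymax)`, and subtracts 1 from the centre, presumably because VOC pixel coordinates are 1-based. A standalone restatement of the same arithmetic (the example values are illustrative):

```python
# Restatement of VOC.yaml's convert_box: box = (xmin, xmax, ymin, ymax),
# size = (width, height); returns normalized (x_center, y_center, w, h).
def voc_convert_box(size, box):
    dw, dh = 1.0 / size[0], 1.0 / size[1]
    x = (box[0] + box[1]) / 2.0 - 1  # -1: VOC coordinates are 1-based
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    return x * dw, y * dh, w * dw, h * dh

# 500x375 image, xmin=101, xmax=301, ymin=51, ymax=251
print(voc_convert_box((500, 375), (101, 301, 51, 251)))  # (0.4, 0.4, 0.4, 0.533...)
```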
data/VisDrone.yaml ADDED
@@ -0,0 +1,70 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # VisDrone2019-DET dataset https://github.com/VisDrone/VisDrone-Dataset by Tianjin University
3
+ # Example usage: python train.py --data VisDrone.yaml
4
+ # parent
5
+ # ├── yolov5
6
+ # └── datasets
7
+ # └── VisDrone ← downloads here (2.3 GB)
8
+
9
+
10
+ # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
11
+ path: ../datasets/VisDrone # dataset root dir
12
+ train: VisDrone2019-DET-train/images # train images (relative to 'path') 6471 images
13
+ val: VisDrone2019-DET-val/images # val images (relative to 'path') 548 images
14
+ test: VisDrone2019-DET-test-dev/images # test images (optional) 1610 images
15
+
16
+ # Classes
17
+ names:
18
+ 0: pedestrian
19
+ 1: people
20
+ 2: bicycle
21
+ 3: car
22
+ 4: van
23
+ 5: truck
24
+ 6: tricycle
25
+ 7: awning-tricycle
26
+ 8: bus
27
+ 9: motor
28
+
29
+
30
+ # Download script/URL (optional) ---------------------------------------------------------------------------------------
31
+ download: |
32
+ from utils.general import download, os, Path
33
+
34
+ def visdrone2yolo(dir):
35
+ from PIL import Image
36
+ from tqdm import tqdm
37
+
38
+ def convert_box(size, box):
39
+ # Convert VisDrone box to YOLO xywh box
40
+ dw = 1. / size[0]
41
+ dh = 1. / size[1]
42
+ return (box[0] + box[2] / 2) * dw, (box[1] + box[3] / 2) * dh, box[2] * dw, box[3] * dh
43
+
44
+ (dir / 'labels').mkdir(parents=True, exist_ok=True) # make labels directory
45
+ pbar = tqdm((dir / 'annotations').glob('*.txt'), desc=f'Converting {dir}')
46
+ for f in pbar:
47
+ img_size = Image.open((dir / 'images' / f.name).with_suffix('.jpg')).size
48
+ lines = []
49
+ with open(f, 'r') as file: # read annotation.txt
50
+ for row in [x.split(',') for x in file.read().strip().splitlines()]:
51
+ if row[4] == '0': # VisDrone 'ignored regions' class 0
52
+ continue
53
+ cls = int(row[5]) - 1
54
+ box = convert_box(img_size, tuple(map(int, row[:4])))
55
+ lines.append(f"{cls} {' '.join(f'{x:.6f}' for x in box)}\n")
56
+ with open(str(f).replace(os.sep + 'annotations' + os.sep, os.sep + 'labels' + os.sep), 'w') as fl:
57
+ fl.writelines(lines) # write label.txt
58
+
59
+
60
+ # Download
61
+ dir = Path(yaml['path']) # dataset root dir
62
+ urls = ['https://github.com/ultralytics/yolov5/releases/download/v1.0/VisDrone2019-DET-train.zip',
63
+ 'https://github.com/ultralytics/yolov5/releases/download/v1.0/VisDrone2019-DET-val.zip',
64
+ 'https://github.com/ultralytics/yolov5/releases/download/v1.0/VisDrone2019-DET-test-dev.zip',
65
+ 'https://github.com/ultralytics/yolov5/releases/download/v1.0/VisDrone2019-DET-test-challenge.zip']
66
+ download(urls, dir=dir, curl=True, threads=4)
67
+
68
+ # Convert
69
+ for d in 'VisDrone2019-DET-train', 'VisDrone2019-DET-val', 'VisDrone2019-DET-test-dev':
70
+ visdrone2yolo(dir / d) # convert VisDrone annotations to YOLO labels
data/coco.yaml ADDED
@@ -0,0 +1,116 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # COCO 2017 dataset http://cocodataset.org by Microsoft
3
+ # Example usage: python train.py --data coco.yaml
4
+ # parent
5
+ # ├── yolov5
6
+ # └── datasets
7
+ # └── coco ← downloads here (20.1 GB)
8
+
9
+
10
+ # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
11
+ path: ../datasets/coco # dataset root dir
12
+ train: train2017.txt # train images (relative to 'path') 118287 images
13
+ val: val2017.txt # val images (relative to 'path') 5000 images
14
+ test: test-dev2017.txt # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794
15
+
16
+ # Classes
17
+ names:
18
+ 0: person
19
+ 1: bicycle
20
+ 2: car
21
+ 3: motorcycle
22
+ 4: airplane
23
+ 5: bus
24
+ 6: train
25
+ 7: truck
26
+ 8: boat
27
+ 9: traffic light
28
+ 10: fire hydrant
29
+ 11: stop sign
30
+ 12: parking meter
31
+ 13: bench
32
+ 14: bird
33
+ 15: cat
34
+ 16: dog
35
+ 17: horse
36
+ 18: sheep
37
+ 19: cow
38
+ 20: elephant
39
+ 21: bear
40
+ 22: zebra
41
+ 23: giraffe
42
+ 24: backpack
43
+ 25: umbrella
44
+ 26: handbag
45
+ 27: tie
46
+ 28: suitcase
47
+ 29: frisbee
48
+ 30: skis
49
+ 31: snowboard
50
+ 32: sports ball
51
+ 33: kite
52
+ 34: baseball bat
53
+ 35: baseball glove
54
+ 36: skateboard
55
+ 37: surfboard
56
+ 38: tennis racket
57
+ 39: bottle
58
+ 40: wine glass
59
+ 41: cup
60
+ 42: fork
61
+ 43: knife
62
+ 44: spoon
63
+ 45: bowl
64
+ 46: banana
65
+ 47: apple
66
+ 48: sandwich
67
+ 49: orange
68
+ 50: broccoli
69
+ 51: carrot
70
+ 52: hot dog
71
+ 53: pizza
72
+ 54: donut
73
+ 55: cake
74
+ 56: chair
75
+ 57: couch
76
+ 58: potted plant
77
+ 59: bed
78
+ 60: dining table
79
+ 61: toilet
80
+ 62: tv
81
+ 63: laptop
82
+ 64: mouse
83
+ 65: remote
84
+ 66: keyboard
85
+ 67: cell phone
86
+ 68: microwave
87
+ 69: oven
88
+ 70: toaster
89
+ 71: sink
90
+ 72: refrigerator
91
+ 73: book
92
+ 74: clock
93
+ 75: vase
94
+ 76: scissors
95
+ 77: teddy bear
96
+ 78: hair drier
97
+ 79: toothbrush
98
+
99
+
100
+ # Download script/URL (optional)
101
+ download: |
102
+ from utils.general import download, Path
103
+
104
+
105
+ # Download labels
106
+ segments = False # segment or box labels
107
+ dir = Path(yaml['path']) # dataset root dir
108
+ url = 'https://github.com/ultralytics/yolov5/releases/download/v1.0/'
109
+ urls = [url + ('coco2017labels-segments.zip' if segments else 'coco2017labels.zip')] # labels
110
+ download(urls, dir=dir.parent)
111
+
112
+ # Download data
113
+ urls = ['http://images.cocodataset.org/zips/train2017.zip', # 19G, 118k images
114
+ 'http://images.cocodataset.org/zips/val2017.zip', # 1G, 5k images
115
+ 'http://images.cocodataset.org/zips/test2017.zip'] # 7G, 41k images (optional)
116
+ download(urls, dir=dir / 'images', threads=3)
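The `download:` field above is a Python snippet that YOLOv5 runs with the parsed file available as `yaml` when the dataset is missing. If you would rather fetch the images by hand, a rough standalone equivalent using only the standard library (paths mirror the `path:` entry above and are assumptions about your local layout) could be:

```python
# Rough manual equivalent of the coco.yaml download block (illustrative sketch only).
from pathlib import Path
import urllib.request
import zipfile

root = Path('../datasets/coco')                # mirrors 'path' in coco.yaml
(root / 'images').mkdir(parents=True, exist_ok=True)

for name in ('val2017.zip',):                  # add train2017.zip / test2017.zip as needed
    url = f'http://images.cocodataset.org/zips/{name}'
    dst = root / 'images' / name
    print(f'Downloading {url} ...')
    urllib.request.urlretrieve(url, str(dst))  # blocking download
    with zipfile.ZipFile(dst) as z:
        z.extractall(root / 'images')          # unzip next to the archive
    dst.unlink()                               # remove the archive afterwards
```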
data/coco128-seg.yaml ADDED
@@ -0,0 +1,101 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # COCO128-seg dataset https://www.kaggle.com/ultralytics/coco128 (first 128 images from COCO train2017) by Ultralytics
3
+ # Example usage: python train.py --data coco128-seg.yaml
4
+ # parent
5
+ # ├── yolov5
6
+ # └── datasets
7
+ # └── coco128-seg ← downloads here (7 MB)
8
+
9
+
10
+ # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
11
+ path: ../datasets/coco128-seg # dataset root dir
12
+ train: images/train2017 # train images (relative to 'path') 128 images
13
+ val: images/train2017 # val images (relative to 'path') 128 images
14
+ test: # test images (optional)
15
+
16
+ # Classes
17
+ names:
18
+ 0: person
19
+ 1: bicycle
20
+ 2: car
21
+ 3: motorcycle
22
+ 4: airplane
23
+ 5: bus
24
+ 6: train
25
+ 7: truck
26
+ 8: boat
27
+ 9: traffic light
28
+ 10: fire hydrant
29
+ 11: stop sign
30
+ 12: parking meter
31
+ 13: bench
32
+ 14: bird
33
+ 15: cat
34
+ 16: dog
35
+ 17: horse
36
+ 18: sheep
37
+ 19: cow
38
+ 20: elephant
39
+ 21: bear
40
+ 22: zebra
41
+ 23: giraffe
42
+ 24: backpack
43
+ 25: umbrella
44
+ 26: handbag
45
+ 27: tie
46
+ 28: suitcase
47
+ 29: frisbee
48
+ 30: skis
49
+ 31: snowboard
50
+ 32: sports ball
51
+ 33: kite
52
+ 34: baseball bat
53
+ 35: baseball glove
54
+ 36: skateboard
55
+ 37: surfboard
56
+ 38: tennis racket
57
+ 39: bottle
58
+ 40: wine glass
59
+ 41: cup
60
+ 42: fork
61
+ 43: knife
62
+ 44: spoon
63
+ 45: bowl
64
+ 46: banana
65
+ 47: apple
66
+ 48: sandwich
67
+ 49: orange
68
+ 50: broccoli
69
+ 51: carrot
70
+ 52: hot dog
71
+ 53: pizza
72
+ 54: donut
73
+ 55: cake
74
+ 56: chair
75
+ 57: couch
76
+ 58: potted plant
77
+ 59: bed
78
+ 60: dining table
79
+ 61: toilet
80
+ 62: tv
81
+ 63: laptop
82
+ 64: mouse
83
+ 65: remote
84
+ 66: keyboard
85
+ 67: cell phone
86
+ 68: microwave
87
+ 69: oven
88
+ 70: toaster
89
+ 71: sink
90
+ 72: refrigerator
91
+ 73: book
92
+ 74: clock
93
+ 75: vase
94
+ 76: scissors
95
+ 77: teddy bear
96
+ 78: hair drier
97
+ 79: toothbrush
98
+
99
+
100
+ # Download script/URL (optional)
101
+ download: https://ultralytics.com/assets/coco128-seg.zip
data/coco128.yaml ADDED
@@ -0,0 +1,101 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # COCO128 dataset https://www.kaggle.com/ultralytics/coco128 (first 128 images from COCO train2017) by Ultralytics
3
+ # Example usage: python train.py --data coco128.yaml
4
+ # parent
5
+ # ├── yolov5
6
+ # └── datasets
7
+ # └── coco128 ← downloads here (7 MB)
8
+
9
+
10
+ # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
11
+ path: ../datasets/coco128 # dataset root dir
12
+ train: images/train2017 # train images (relative to 'path') 128 images
13
+ val: images/train2017 # val images (relative to 'path') 128 images
14
+ test: # test images (optional)
15
+
16
+ # Classes
17
+ names:
18
+ 0: person
19
+ 1: bicycle
20
+ 2: car
21
+ 3: motorcycle
22
+ 4: airplane
23
+ 5: bus
24
+ 6: train
25
+ 7: truck
26
+ 8: boat
27
+ 9: traffic light
28
+ 10: fire hydrant
29
+ 11: stop sign
30
+ 12: parking meter
31
+ 13: bench
32
+ 14: bird
33
+ 15: cat
34
+ 16: dog
35
+ 17: horse
36
+ 18: sheep
37
+ 19: cow
38
+ 20: elephant
39
+ 21: bear
40
+ 22: zebra
41
+ 23: giraffe
42
+ 24: backpack
43
+ 25: umbrella
44
+ 26: handbag
45
+ 27: tie
46
+ 28: suitcase
47
+ 29: frisbee
48
+ 30: skis
49
+ 31: snowboard
50
+ 32: sports ball
51
+ 33: kite
52
+ 34: baseball bat
53
+ 35: baseball glove
54
+ 36: skateboard
55
+ 37: surfboard
56
+ 38: tennis racket
57
+ 39: bottle
58
+ 40: wine glass
59
+ 41: cup
60
+ 42: fork
61
+ 43: knife
62
+ 44: spoon
63
+ 45: bowl
64
+ 46: banana
65
+ 47: apple
66
+ 48: sandwich
67
+ 49: orange
68
+ 50: broccoli
69
+ 51: carrot
70
+ 52: hot dog
71
+ 53: pizza
72
+ 54: donut
73
+ 55: cake
74
+ 56: chair
75
+ 57: couch
76
+ 58: potted plant
77
+ 59: bed
78
+ 60: dining table
79
+ 61: toilet
80
+ 62: tv
81
+ 63: laptop
82
+ 64: mouse
83
+ 65: remote
84
+ 66: keyboard
85
+ 67: cell phone
86
+ 68: microwave
87
+ 69: oven
88
+ 70: toaster
89
+ 71: sink
90
+ 72: refrigerator
91
+ 73: book
92
+ 74: clock
93
+ 75: vase
94
+ 76: scissors
95
+ 77: teddy bear
96
+ 78: hair drier
97
+ 79: toothbrush
98
+
99
+
100
+ # Download script/URL (optional)
101
+ download: https://ultralytics.com/assets/coco128.zip
data/hyps/hyp.Objects365.yaml ADDED
@@ -0,0 +1,34 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # Hyperparameters for Objects365 training
3
+ # python train.py --weights yolov5m.pt --data Objects365.yaml --evolve
4
+ # See Hyperparameter Evolution tutorial for details https://github.com/ultralytics/yolov5#tutorials
5
+
6
+ lr0: 0.00258
7
+ lrf: 0.17
8
+ momentum: 0.779
9
+ weight_decay: 0.00058
10
+ warmup_epochs: 1.33
11
+ warmup_momentum: 0.86
12
+ warmup_bias_lr: 0.0711
13
+ box: 0.0539
14
+ cls: 0.299
15
+ cls_pw: 0.825
16
+ obj: 0.632
17
+ obj_pw: 1.0
18
+ iou_t: 0.2
19
+ anchor_t: 3.44
20
+ anchors: 3.2
21
+ fl_gamma: 0.0
22
+ hsv_h: 0.0188
23
+ hsv_s: 0.704
24
+ hsv_v: 0.36
25
+ degrees: 0.0
26
+ translate: 0.0902
27
+ scale: 0.491
28
+ shear: 0.0
29
+ perspective: 0.0
30
+ flipud: 0.0
31
+ fliplr: 0.5
32
+ mosaic: 1.0
33
+ mixup: 0.0
34
+ copy_paste: 0.0
data/hyps/hyp.VOC.yaml ADDED
@@ -0,0 +1,40 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # Hyperparameters for VOC training
3
+ # python train.py --batch 128 --weights yolov5m6.pt --data VOC.yaml --epochs 50 --img 512 --hyp hyp.scratch-med.yaml --evolve
4
+ # See Hyperparameter Evolution tutorial for details https://github.com/ultralytics/yolov5#tutorials
5
+
6
+ # YOLOv5 Hyperparameter Evolution Results
7
+ # Best generation: 467
8
+ # Last generation: 996
9
+ # metrics/precision, metrics/recall, metrics/mAP_0.5, metrics/mAP_0.5:0.95, val/box_loss, val/obj_loss, val/cls_loss
10
+ # 0.87729, 0.85125, 0.91286, 0.72664, 0.0076739, 0.0042529, 0.0013865
11
+
12
+ lr0: 0.00334
13
+ lrf: 0.15135
14
+ momentum: 0.74832
15
+ weight_decay: 0.00025
16
+ warmup_epochs: 3.3835
17
+ warmup_momentum: 0.59462
18
+ warmup_bias_lr: 0.18657
19
+ box: 0.02
20
+ cls: 0.21638
21
+ cls_pw: 0.5
22
+ obj: 0.51728
23
+ obj_pw: 0.67198
24
+ iou_t: 0.2
25
+ anchor_t: 3.3744
26
+ fl_gamma: 0.0
27
+ hsv_h: 0.01041
28
+ hsv_s: 0.54703
29
+ hsv_v: 0.27739
30
+ degrees: 0.0
31
+ translate: 0.04591
32
+ scale: 0.75544
33
+ shear: 0.0
34
+ perspective: 0.0
35
+ flipud: 0.0
36
+ fliplr: 0.5
37
+ mosaic: 0.85834
38
+ mixup: 0.04266
39
+ copy_paste: 0.0
40
+ anchors: 3.412
data/hyps/hyp.no-augmentation.yaml ADDED
@@ -0,0 +1,35 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # Hyperparameters when using the Albumentations framework
3
+ # python train.py --hyp hyp.no-augmentation.yaml
4
+ # See https://github.com/ultralytics/yolov5/pull/3882 for YOLOv5 + Albumentations Usage examples
5
+
6
+ lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
7
+ lrf: 0.1 # final OneCycleLR learning rate (lr0 * lrf)
8
+ momentum: 0.937 # SGD momentum/Adam beta1
9
+ weight_decay: 0.0005 # optimizer weight decay 5e-4
10
+ warmup_epochs: 3.0 # warmup epochs (fractions ok)
11
+ warmup_momentum: 0.8 # warmup initial momentum
12
+ warmup_bias_lr: 0.1 # warmup initial bias lr
13
+ box: 0.05 # box loss gain
14
+ cls: 0.3 # cls loss gain
15
+ cls_pw: 1.0 # cls BCELoss positive_weight
16
+ obj: 0.7 # obj loss gain (scale with pixels)
17
+ obj_pw: 1.0 # obj BCELoss positive_weight
18
+ iou_t: 0.20 # IoU training threshold
19
+ anchor_t: 4.0 # anchor-multiple threshold
20
+ # anchors: 3 # anchors per output layer (0 to ignore)
21
+ # these parameters are all zero because we want the Albumentations framework to handle augmentation
22
+ fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
23
+ hsv_h: 0 # image HSV-Hue augmentation (fraction)
24
+ hsv_s: 0 # image HSV-Saturation augmentation (fraction)
25
+ hsv_v: 0 # image HSV-Value augmentation (fraction)
26
+ degrees: 0.0 # image rotation (+/- deg)
27
+ translate: 0 # image translation (+/- fraction)
28
+ scale: 0 # image scale (+/- gain)
29
+ shear: 0 # image shear (+/- deg)
30
+ perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
31
+ flipud: 0.0 # image flip up-down (probability)
32
+ fliplr: 0.0 # image flip left-right (probability)
33
+ mosaic: 0.0 # image mosaic (probability)
34
+ mixup: 0.0 # image mixup (probability)
35
+ copy_paste: 0.0 # segment copy-paste (probability)
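These hyperparameter files are plain YAML and can be inspected directly. A minimal sketch (assuming PyYAML is installed and the working directory is the repo root) that loads this file and confirms the augmentation entries really are all zero:

```python
# Load a hyperparameter YAML and report any non-zero augmentation settings.
# Assumes PyYAML is installed and the script runs from the yolov5 repo root.
import yaml

with open('data/hyps/hyp.no-augmentation.yaml') as f:
    hyp = yaml.safe_load(f)

aug_keys = ['hsv_h', 'hsv_s', 'hsv_v', 'degrees', 'translate', 'scale', 'shear',
            'perspective', 'flipud', 'fliplr', 'mosaic', 'mixup', 'copy_paste']
print({k: hyp[k] for k in aug_keys if hyp.get(k)})  # expect {} for this file
```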
data/hyps/hyp.scratch-high.yaml ADDED
@@ -0,0 +1,34 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # Hyperparameters for high-augmentation COCO training from scratch
3
+ # python train.py --batch 32 --cfg yolov5m6.yaml --weights '' --data coco.yaml --img 1280 --epochs 300
4
+ # See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
5
+
6
+ lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
7
+ lrf: 0.1 # final OneCycleLR learning rate (lr0 * lrf)
8
+ momentum: 0.937 # SGD momentum/Adam beta1
9
+ weight_decay: 0.0005 # optimizer weight decay 5e-4
10
+ warmup_epochs: 3.0 # warmup epochs (fractions ok)
11
+ warmup_momentum: 0.8 # warmup initial momentum
12
+ warmup_bias_lr: 0.1 # warmup initial bias lr
13
+ box: 0.05 # box loss gain
14
+ cls: 0.3 # cls loss gain
15
+ cls_pw: 1.0 # cls BCELoss positive_weight
16
+ obj: 0.7 # obj loss gain (scale with pixels)
17
+ obj_pw: 1.0 # obj BCELoss positive_weight
18
+ iou_t: 0.20 # IoU training threshold
19
+ anchor_t: 4.0 # anchor-multiple threshold
20
+ # anchors: 3 # anchors per output layer (0 to ignore)
21
+ fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
22
+ hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
23
+ hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
24
+ hsv_v: 0.4 # image HSV-Value augmentation (fraction)
25
+ degrees: 0.0 # image rotation (+/- deg)
26
+ translate: 0.1 # image translation (+/- fraction)
27
+ scale: 0.9 # image scale (+/- gain)
28
+ shear: 0.0 # image shear (+/- deg)
29
+ perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
30
+ flipud: 0.0 # image flip up-down (probability)
31
+ fliplr: 0.5 # image flip left-right (probability)
32
+ mosaic: 1.0 # image mosaic (probability)
33
+ mixup: 0.1 # image mixup (probability)
34
+ copy_paste: 0.1 # segment copy-paste (probability)
data/hyps/hyp.scratch-low.yaml ADDED
@@ -0,0 +1,34 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # Hyperparameters for low-augmentation COCO training from scratch
3
+ # python train.py --batch 64 --cfg yolov5n6.yaml --weights '' --data coco.yaml --img 640 --epochs 300 --linear
4
+ # See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
5
+
6
+ lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
7
+ lrf: 0.01 # final OneCycleLR learning rate (lr0 * lrf)
8
+ momentum: 0.937 # SGD momentum/Adam beta1
9
+ weight_decay: 0.0005 # optimizer weight decay 5e-4
10
+ warmup_epochs: 3.0 # warmup epochs (fractions ok)
11
+ warmup_momentum: 0.8 # warmup initial momentum
12
+ warmup_bias_lr: 0.1 # warmup initial bias lr
13
+ box: 0.05 # box loss gain
14
+ cls: 0.5 # cls loss gain
15
+ cls_pw: 1.0 # cls BCELoss positive_weight
16
+ obj: 1.0 # obj loss gain (scale with pixels)
17
+ obj_pw: 1.0 # obj BCELoss positive_weight
18
+ iou_t: 0.20 # IoU training threshold
19
+ anchor_t: 4.0 # anchor-multiple threshold
20
+ # anchors: 3 # anchors per output layer (0 to ignore)
21
+ fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
22
+ hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
23
+ hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
24
+ hsv_v: 0.4 # image HSV-Value augmentation (fraction)
25
+ degrees: 0.0 # image rotation (+/- deg)
26
+ translate: 0.1 # image translation (+/- fraction)
27
+ scale: 0.5 # image scale (+/- gain)
28
+ shear: 0.0 # image shear (+/- deg)
29
+ perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
30
+ flipud: 0.0 # image flip up-down (probability)
31
+ fliplr: 0.5 # image flip left-right (probability)
32
+ mosaic: 1.0 # image mosaic (probability)
33
+ mixup: 0.0 # image mixup (probability)
34
+ copy_paste: 0.0 # segment copy-paste (probability)
data/hyps/hyp.scratch-med.yaml ADDED
@@ -0,0 +1,34 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # Hyperparameters for medium-augmentation COCO training from scratch
3
+ # python train.py --batch 32 --cfg yolov5m6.yaml --weights '' --data coco.yaml --img 1280 --epochs 300
4
+ # See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
5
+
6
+ lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
7
+ lrf: 0.1 # final OneCycleLR learning rate (lr0 * lrf)
8
+ momentum: 0.937 # SGD momentum/Adam beta1
9
+ weight_decay: 0.0005 # optimizer weight decay 5e-4
10
+ warmup_epochs: 3.0 # warmup epochs (fractions ok)
11
+ warmup_momentum: 0.8 # warmup initial momentum
12
+ warmup_bias_lr: 0.1 # warmup initial bias lr
13
+ box: 0.05 # box loss gain
14
+ cls: 0.3 # cls loss gain
15
+ cls_pw: 1.0 # cls BCELoss positive_weight
16
+ obj: 0.7 # obj loss gain (scale with pixels)
17
+ obj_pw: 1.0 # obj BCELoss positive_weight
18
+ iou_t: 0.20 # IoU training threshold
19
+ anchor_t: 4.0 # anchor-multiple threshold
20
+ # anchors: 3 # anchors per output layer (0 to ignore)
21
+ fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
22
+ hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
23
+ hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
24
+ hsv_v: 0.4 # image HSV-Value augmentation (fraction)
25
+ degrees: 0.0 # image rotation (+/- deg)
26
+ translate: 0.1 # image translation (+/- fraction)
27
+ scale: 0.9 # image scale (+/- gain)
28
+ shear: 0.0 # image shear (+/- deg)
29
+ perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
30
+ flipud: 0.0 # image flip up-down (probability)
31
+ fliplr: 0.5 # image flip left-right (probability)
32
+ mosaic: 1.0 # image mosaic (probability)
33
+ mixup: 0.1 # image mixup (probability)
34
+ copy_paste: 0.0 # segment copy-paste (probability)
data/images/bus.jpg ADDED
data/images/zidane.jpg ADDED
data/scripts/download_weights.sh ADDED
@@ -0,0 +1,22 @@
1
+ #!/bin/bash
2
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
3
+ # Download latest models from https://github.com/ultralytics/yolov5/releases
4
+ # Example usage: bash data/scripts/download_weights.sh
5
+ # parent
6
+ # └── yolov5
7
+ # ├── yolov5s.pt ← downloads here
8
+ # ├── yolov5m.pt
9
+ # └── ...
10
+
11
+ python - <<EOF
12
+ from utils.downloads import attempt_download
13
+
14
+ p5 = list('nsmlx') # P5 models
15
+ p6 = [f'{x}6' for x in p5] # P6 models
16
+ cls = [f'{x}-cls' for x in p5] # classification models
17
+ seg = [f'{x}-seg' for x in p5] # segmentation models
18
+
19
+ for x in p5 + p6 + cls + seg:
20
+ attempt_download(f'weights/yolov5{x}.pt')
21
+
22
+ EOF
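For reference, the list comprehensions inside the heredoc above expand to a fixed set of checkpoint names. The sketch below only prints that set, without touching the network:

```python
# Print the weight filenames that download_weights.sh would fetch (no downloads).
p5 = list('nsmlx')                 # P5 models: n, s, m, l, x
p6 = [f'{x}6' for x in p5]         # P6 models: n6, s6, m6, l6, x6
cls = [f'{x}-cls' for x in p5]     # classification models
seg = [f'{x}-seg' for x in p5]     # segmentation models

for x in p5 + p6 + cls + seg:
    print(f'weights/yolov5{x}.pt')
```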
data/scripts/get_coco.sh ADDED
@@ -0,0 +1,56 @@
1
+ #!/bin/bash
2
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
3
+ # Download COCO 2017 dataset http://cocodataset.org
4
+ # Example usage: bash data/scripts/get_coco.sh
5
+ # parent
6
+ # ├── yolov5
7
+ # └── datasets
8
+ # └── coco ← downloads here
9
+
10
+ # Arguments (optional) Usage: bash data/scripts/get_coco.sh --train --val --test --segments
11
+ if [ "$#" -gt 0 ]; then
12
+ for opt in "$@"; do
13
+ case "${opt}" in
14
+ --train) train=true ;;
15
+ --val) val=true ;;
16
+ --test) test=true ;;
17
+ --segments) segments=true ;;
18
+ esac
19
+ done
20
+ else
21
+ train=true
22
+ val=true
23
+ test=false
24
+ segments=false
25
+ fi
26
+
27
+ # Download/unzip labels
28
+ d='../datasets' # unzip directory
29
+ url=https://github.com/ultralytics/yolov5/releases/download/v1.0/
30
+ if [ "$segments" == "true" ]; then
31
+ f='coco2017labels-segments.zip' # 168 MB
32
+ else
33
+ f='coco2017labels.zip' # 46 MB
34
+ fi
35
+ echo 'Downloading' $url$f ' ...'
36
+ curl -L $url$f -o $f -# && unzip -q $f -d $d && rm $f &
37
+
38
+ # Download/unzip images
39
+ d='../datasets/coco/images' # unzip directory
40
+ url=http://images.cocodataset.org/zips/
41
+ if [ "$train" == "true" ]; then
42
+ f='train2017.zip' # 19G, 118k images
43
+ echo 'Downloading' $url$f '...'
44
+ curl -L $url$f -o $f -# && unzip -q $f -d $d && rm $f &
45
+ fi
46
+ if [ "$val" == "true" ]; then
47
+ f='val2017.zip' # 1G, 5k images
48
+ echo 'Downloading' $url$f '...'
49
+ curl -L $url$f -o $f -# && unzip -q $f -d $d && rm $f &
50
+ fi
51
+ if [ "$test" == "true" ]; then
52
+ f='test2017.zip' # 7G, 41k images (optional)
53
+ echo 'Downloading' $url$f '...'
54
+ curl -L $url$f -o $f -# && unzip -q $f -d $d && rm $f &
55
+ fi
56
+ wait # finish background tasks
data/scripts/get_coco128.sh ADDED
@@ -0,0 +1,17 @@
1
+ #!/bin/bash
2
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
3
+ # Download COCO128 dataset https://www.kaggle.com/ultralytics/coco128 (first 128 images from COCO train2017)
4
+ # Example usage: bash data/scripts/get_coco128.sh
5
+ # parent
6
+ # ├── yolov5
7
+ # └── datasets
8
+ # └── coco128 ← downloads here
9
+
10
+ # Download/unzip images and labels
11
+ d='../datasets' # unzip directory
12
+ url=https://github.com/ultralytics/yolov5/releases/download/v1.0/
13
+ f='coco128.zip' # or 'coco128-segments.zip', 68 MB
14
+ echo 'Downloading' $url$f ' ...'
15
+ curl -L $url$f -o $f -# && unzip -q $f -d $d && rm $f &
16
+
17
+ wait # finish background tasks
data/scripts/get_imagenet.sh ADDED
@@ -0,0 +1,51 @@
1
+ #!/bin/bash
2
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
3
+ # Download ILSVRC2012 ImageNet dataset https://image-net.org
4
+ # Example usage: bash data/scripts/get_imagenet.sh
5
+ # parent
6
+ # ├── yolov5
7
+ # └── datasets
8
+ # └── imagenet ← downloads here
9
+
10
+ # Arguments (optional) Usage: bash data/scripts/get_imagenet.sh --train --val
11
+ if [ "$#" -gt 0 ]; then
12
+ for opt in "$@"; do
13
+ case "${opt}" in
14
+ --train) train=true ;;
15
+ --val) val=true ;;
16
+ esac
17
+ done
18
+ else
19
+ train=true
20
+ val=true
21
+ fi
22
+
23
+ # Make dir
24
+ d='../datasets/imagenet' # unzip directory
25
+ mkdir -p $d && cd $d
26
+
27
+ # Download/unzip train
28
+ if [ "$train" == "true" ]; then
29
+ wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_train.tar # download 138G, 1281167 images
30
+ mkdir train && mv ILSVRC2012_img_train.tar train/ && cd train
31
+ tar -xf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar
32
+ find . -name "*.tar" | while read NAME; do
33
+ mkdir -p "${NAME%.tar}"
34
+ tar -xf "${NAME}" -C "${NAME%.tar}"
35
+ rm -f "${NAME}"
36
+ done
37
+ cd ..
38
+ fi
39
+
40
+ # Download/unzip val
41
+ if [ "$val" == "true" ]; then
42
+ wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar # download 6.3G, 50000 images
43
+ mkdir val && mv ILSVRC2012_img_val.tar val/ && cd val && tar -xf ILSVRC2012_img_val.tar
44
+ wget -qO- https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh | bash # move into subdirs
45
+ fi
46
+
47
+ # Delete corrupted image (optional: PNG under JPEG name that may cause dataloaders to fail)
48
+ # rm train/n04266014/n04266014_10835.JPEG
49
+
50
+ # TFRecords (optional)
51
+ # wget https://raw.githubusercontent.com/tensorflow/models/master/research/slim/datasets/imagenet_lsvrc_2015_synsets.txt
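After the script finishes, a quick sanity check of the extracted layout can be done from Python; the path below assumes the default `../datasets/imagenet` unzip directory used above:

```python
# Count class folders and images produced by get_imagenet.sh (sanity check only).
from pathlib import Path

root = Path('../datasets/imagenet')
for split in ('train', 'val'):
    classes = [d for d in (root / split).iterdir() if d.is_dir()]
    n_images = sum(1 for _ in (root / split).rglob('*.JPEG'))
    print(f'{split}: {len(classes)} class folders, {n_images} images')
```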
data/xView.yaml ADDED
@@ -0,0 +1,153 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ # DIUx xView 2018 Challenge https://challenge.xviewdataset.org by U.S. National Geospatial-Intelligence Agency (NGA)
3
+ # -------- DOWNLOAD DATA MANUALLY and jar xf val_images.zip to 'datasets/xView' before running train command! --------
4
+ # Example usage: python train.py --data xView.yaml
5
+ # parent
6
+ # ├── yolov5
7
+ # └── datasets
8
+ # └── xView ← downloads here (20.7 GB)
9
+
10
+
11
+ # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
12
+ path: ../datasets/xView # dataset root dir
13
+ train: images/autosplit_train.txt # train images (relative to 'path') 90% of 847 train images
14
+ val: images/autosplit_val.txt # val images (relative to 'path') 10% of 847 train images
15
+
16
+ # Classes
17
+ names:
18
+ 0: Fixed-wing Aircraft
19
+ 1: Small Aircraft
20
+ 2: Cargo Plane
21
+ 3: Helicopter
22
+ 4: Passenger Vehicle
23
+ 5: Small Car
24
+ 6: Bus
25
+ 7: Pickup Truck
26
+ 8: Utility Truck
27
+ 9: Truck
28
+ 10: Cargo Truck
29
+ 11: Truck w/Box
30
+ 12: Truck Tractor
31
+ 13: Trailer
32
+ 14: Truck w/Flatbed
33
+ 15: Truck w/Liquid
34
+ 16: Crane Truck
35
+ 17: Railway Vehicle
36
+ 18: Passenger Car
37
+ 19: Cargo Car
38
+ 20: Flat Car
39
+ 21: Tank car
40
+ 22: Locomotive
41
+ 23: Maritime Vessel
42
+ 24: Motorboat
43
+ 25: Sailboat
44
+ 26: Tugboat
45
+ 27: Barge
46
+ 28: Fishing Vessel
47
+ 29: Ferry
48
+ 30: Yacht
49
+ 31: Container Ship
50
+ 32: Oil Tanker
51
+ 33: Engineering Vehicle
52
+ 34: Tower crane
53
+ 35: Container Crane
54
+ 36: Reach Stacker
55
+ 37: Straddle Carrier
56
+ 38: Mobile Crane
57
+ 39: Dump Truck
58
+ 40: Haul Truck
59
+ 41: Scraper/Tractor
60
+ 42: Front loader/Bulldozer
61
+ 43: Excavator
62
+ 44: Cement Mixer
63
+ 45: Ground Grader
64
+ 46: Hut/Tent
65
+ 47: Shed
66
+ 48: Building
67
+ 49: Aircraft Hangar
68
+ 50: Damaged Building
69
+ 51: Facility
70
+ 52: Construction Site
71
+ 53: Vehicle Lot
72
+ 54: Helipad
73
+ 55: Storage Tank
74
+ 56: Shipping container lot
75
+ 57: Shipping Container
76
+ 58: Pylon
77
+ 59: Tower
78
+
79
+
80
+ # Download script/URL (optional) ---------------------------------------------------------------------------------------
81
+ download: |
82
+ import json
83
+ import os
84
+ from pathlib import Path
85
+
86
+ import numpy as np
87
+ from PIL import Image
88
+ from tqdm import tqdm
89
+
90
+ from utils.dataloaders import autosplit
91
+ from utils.general import download, xyxy2xywhn
92
+
93
+
94
+ def convert_labels(fname=Path('xView/xView_train.geojson')):
95
+ # Convert xView geoJSON labels to YOLO format
96
+ path = fname.parent
97
+ with open(fname) as f:
98
+ print(f'Loading {fname}...')
99
+ data = json.load(f)
100
+
101
+ # Make dirs
102
+ labels = Path(path / 'labels' / 'train')
103
+ os.system(f'rm -rf {labels}')
104
+ labels.mkdir(parents=True, exist_ok=True)
105
+
106
+ # xView classes 11-94 to 0-59
107
+ xview_class2index = [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 2, -1, 3, -1, 4, 5, 6, 7, 8, -1, 9, 10, 11,
108
+ 12, 13, 14, 15, -1, -1, 16, 17, 18, 19, 20, 21, 22, -1, 23, 24, 25, -1, 26, 27, -1, 28, -1,
109
+ 29, 30, 31, 32, 33, 34, 35, 36, 37, -1, 38, 39, 40, 41, 42, 43, 44, 45, -1, -1, -1, -1, 46,
110
+ 47, 48, 49, -1, 50, 51, -1, 52, -1, -1, -1, 53, 54, -1, 55, -1, -1, 56, -1, 57, -1, 58, 59]
111
+
112
+ shapes = {}
113
+ for feature in tqdm(data['features'], desc=f'Converting {fname}'):
114
+ p = feature['properties']
115
+ if p['bounds_imcoords']:
116
+ id = p['image_id']
117
+ file = path / 'train_images' / id
118
+ if file.exists(): # 1395.tif missing
119
+ try:
120
+ box = np.array([int(num) for num in p['bounds_imcoords'].split(",")])
121
+ assert box.shape[0] == 4, f'incorrect box shape {box.shape[0]}'
122
+ cls = p['type_id']
123
+ cls = xview_class2index[int(cls)] # remap xView class 11-94 to 0-59
124
+ assert 59 >= cls >= 0, f'incorrect class index {cls}'
125
+
126
+ # Write YOLO label
127
+ if id not in shapes:
128
+ shapes[id] = Image.open(file).size
129
+ box = xyxy2xywhn(box[None].astype(np.float), w=shapes[id][0], h=shapes[id][1], clip=True)
130
+ with open((labels / id).with_suffix('.txt'), 'a') as f:
131
+ f.write(f"{cls} {' '.join(f'{x:.6f}' for x in box[0])}\n") # write label.txt
132
+ except Exception as e:
133
+ print(f'WARNING: skipping one label for {file}: {e}')
134
+
135
+
136
+ # Download manually from https://challenge.xviewdataset.org
137
+ dir = Path(yaml['path']) # dataset root dir
138
+ # urls = ['https://d307kc0mrhucc3.cloudfront.net/train_labels.zip', # train labels
139
+ # 'https://d307kc0mrhucc3.cloudfront.net/train_images.zip', # 15G, 847 train images
140
+ # 'https://d307kc0mrhucc3.cloudfront.net/val_images.zip'] # 5G, 282 val images (no labels)
141
+ # download(urls, dir=dir, delete=False)
142
+
143
+ # Convert labels
144
+ convert_labels(dir / 'xView_train.geojson')
145
+
146
+ # Move images
147
+ images = Path(dir / 'images')
148
+ images.mkdir(parents=True, exist_ok=True)
149
+ Path(dir / 'train_images').rename(dir / 'images' / 'train')
150
+ Path(dir / 'val_images').rename(dir / 'images' / 'val')
151
+
152
+ # Split
153
+ autosplit(dir / 'images' / 'train')
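The conversion above leans on `xyxy2xywhn` to turn pixel corner boxes into normalized center/width/height values. A self-contained numpy sketch of that step (the sample box and image size are invented, not real xView values):

```python
# Standalone sketch of the xyxy -> normalized xywh step used in convert_labels above.
import numpy as np

def xyxy_to_xywhn(box, w, h):
    x1, y1, x2, y2 = box.astype(float)
    return np.array([(x1 + x2) / 2 / w,   # normalized x-center
                     (y1 + y2) / 2 / h,   # normalized y-center
                     (x2 - x1) / w,       # normalized width
                     (y2 - y1) / h])      # normalized height

print(xyxy_to_xywhn(np.array([2710, 2356, 2766, 2396]), w=3197, h=3197))
```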
detect.py ADDED
@@ -0,0 +1,460 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ """
3
+ Run YOLOv5 detection inference on images, videos, directories, globs, YouTube, webcam, streams, etc.
4
+
5
+ Usage - sources:
6
+ $ python detect.py --weights yolov5s.pt --source 0 # webcam
7
+ img.jpg # image
8
+ vid.mp4 # video
9
+ screen # screenshot
10
+ path/ # directory
11
+ list.txt # list of images
12
+ list.streams # list of streams
13
+ 'path/*.jpg' # glob
14
+ 'https://youtu.be/Zgi9g1ksQHc' # YouTube
15
+ 'rtsp://example.com/media.mp4' # RTSP, RTMP, HTTP stream
16
+
17
+ Usage - formats:
18
+ $ python detect.py --weights yolov5s.pt # PyTorch
19
+ yolov5s.torchscript # TorchScript
20
+ yolov5s.onnx # ONNX Runtime or OpenCV DNN with --dnn
21
+ yolov5s_openvino_model # OpenVINO
22
+ yolov5s.engine # TensorRT
23
+ yolov5s.mlmodel # CoreML (macOS-only)
24
+ yolov5s_saved_model # TensorFlow SavedModel
25
+ yolov5s.pb # TensorFlow GraphDef
26
+ yolov5s.tflite # TensorFlow Lite
27
+ yolov5s_edgetpu.tflite # TensorFlow Edge TPU
28
+ yolov5s_paddle_model # PaddlePaddle
29
+ """
30
+
31
+ import argparse
32
+ import os
33
+ import platform
34
+ import sys
35
+ from pathlib import Path
36
+
37
+ import torch
38
+
39
+ FILE = Path(__file__).resolve()
40
+ ROOT = FILE.parents[0] # YOLOv5 root directory
41
+ if str(ROOT) not in sys.path:
42
+ sys.path.append(str(ROOT)) # add ROOT to PATH
43
+ ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
44
+
45
+ from models.common import DetectMultiBackend
46
+ from utils.dataloaders import (
47
+ IMG_FORMATS,
48
+ VID_FORMATS,
49
+ LoadImages,
50
+ LoadScreenshots,
51
+ LoadStreams,
52
+ )
53
+ from utils.general import (
54
+ LOGGER,
55
+ Profile,
56
+ check_file,
57
+ check_img_size,
58
+ check_imshow,
59
+ check_requirements,
60
+ colorstr,
61
+ cv2,
62
+ increment_path,
63
+ non_max_suppression,
64
+ print_args,
65
+ scale_boxes,
66
+ strip_optimizer,
67
+ xyxy2xywh,
68
+ )
69
+ from utils.plots import Annotator, colors, save_one_box
70
+ from utils.torch_utils import select_device, smart_inference_mode
71
+
72
+
73
+ @smart_inference_mode()
74
+ def run(
75
+ weights=ROOT / "yolov5s.pt", # model path or triton URL
76
+ source=ROOT / "data/images", # file/dir/URL/glob/screen/0(webcam)
77
+ data=ROOT / "data/coco128.yaml", # dataset.yaml path
78
+ imgsz=(640, 640), # inference size (height, width)
79
+ conf_thres=0.25, # confidence threshold
80
+ iou_thres=0.45, # NMS IOU threshold
81
+ max_det=1000, # maximum detections per image
82
+ device="", # cuda device, i.e. 0 or 0,1,2,3 or cpu
83
+ view_img=False, # show results
84
+ save_txt=False, # save results to *.txt
85
+ save_conf=False, # save confidences in --save-txt labels
86
+ save_crop=False, # save cropped prediction boxes
87
+ nosave=False, # do not save images/videos
88
+ classes=None, # filter by class: --class 0, or --class 0 2 3
89
+ agnostic_nms=False, # class-agnostic NMS
90
+ augment=False, # augmented inference
91
+ visualize=False, # visualize features
92
+ update=False, # update all models
93
+ project=ROOT / "runs/detect", # save results to project/name
94
+ name="exp", # save results to project/name
95
+ exist_ok=False, # existing project/name ok, do not increment
96
+ line_thickness=3, # bounding box thickness (pixels)
97
+ hide_labels=False, # hide labels
98
+ hide_conf=False, # hide confidences
99
+ half=False, # use FP16 half-precision inference
100
+ dnn=False, # use OpenCV DNN for ONNX inference
101
+ vid_stride=1, # video frame-rate stride
102
+ ):
103
+ source = str(source)
104
+ save_img = not nosave and not source.endswith(
105
+ ".txt"
106
+ ) # save inference images
107
+ is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
108
+ is_url = source.lower().startswith(
109
+ ("rtsp://", "rtmp://", "http://", "https://")
110
+ )
111
+ webcam = (
112
+ source.isnumeric()
113
+ or source.endswith(".streams")
114
+ or (is_url and not is_file)
115
+ )
116
+ screenshot = source.lower().startswith("screen")
117
+ if is_url and is_file:
118
+ source = check_file(source) # download
119
+
120
+ # Directories
121
+ save_dir = increment_path(
122
+ Path(project) / name, exist_ok=exist_ok
123
+ ) # increment run
124
+ (save_dir / "labels" if save_txt else save_dir).mkdir(
125
+ parents=True, exist_ok=True
126
+ ) # make dir
127
+
128
+ # Load model
129
+ device = select_device(device)
130
+ model = DetectMultiBackend(
131
+ weights, device=device, dnn=dnn, data=data, fp16=half
132
+ )
133
+ stride, names, pt = model.stride, model.names, model.pt
134
+ imgsz = check_img_size(imgsz, s=stride) # check image size
135
+
136
+ # Dataloader
137
+ bs = 1 # batch_size
138
+ if webcam:
139
+ view_img = check_imshow(warn=True)
140
+ dataset = LoadStreams(
141
+ source,
142
+ img_size=imgsz,
143
+ stride=stride,
144
+ auto=pt,
145
+ vid_stride=vid_stride,
146
+ )
147
+ bs = len(dataset)
148
+ elif screenshot:
149
+ dataset = LoadScreenshots(
150
+ source, img_size=imgsz, stride=stride, auto=pt
151
+ )
152
+ else:
153
+ dataset = LoadImages(
154
+ source,
155
+ img_size=imgsz,
156
+ stride=stride,
157
+ auto=pt,
158
+ vid_stride=vid_stride,
159
+ )
160
+ vid_path, vid_writer = [None] * bs, [None] * bs
161
+
162
+ # Run inference
163
+ model.warmup(imgsz=(1 if pt or model.triton else bs, 3, *imgsz)) # warmup
164
+ seen, windows, dt = 0, [], (Profile(), Profile(), Profile())
165
+ for path, im, im0s, vid_cap, s in dataset:
166
+ with dt[0]:
167
+ im = torch.from_numpy(im).to(model.device)
168
+ im = im.half() if model.fp16 else im.float() # uint8 to fp16/32
169
+ im /= 255 # 0 - 255 to 0.0 - 1.0
170
+ if len(im.shape) == 3:
171
+ im = im[None] # expand for batch dim
172
+
173
+ # Inference
174
+ with dt[1]:
175
+ visualize = (
176
+ increment_path(save_dir / Path(path).stem, mkdir=True)
177
+ if visualize
178
+ else False
179
+ )
180
+ pred = model(im, augment=augment, visualize=visualize)
181
+
182
+ # NMS
183
+ with dt[2]:
184
+ pred = non_max_suppression(
185
+ pred,
186
+ conf_thres,
187
+ iou_thres,
188
+ classes,
189
+ agnostic_nms,
190
+ max_det=max_det,
191
+ )
192
+
193
+ # Second-stage classifier (optional)
194
+ # pred = utils.general.apply_classifier(pred, classifier_model, im, im0s)
195
+
196
+ # Process predictions
197
+ for i, det in enumerate(pred): # per image
198
+ seen += 1
199
+ if webcam: # batch_size >= 1
200
+ p, im0, frame = path[i], im0s[i].copy(), dataset.count
201
+ s += f"{i}: "
202
+ else:
203
+ p, im0, frame = path, im0s.copy(), getattr(dataset, "frame", 0)
204
+
205
+ p = Path(p) # to Path
206
+ save_path = str(save_dir / p.name) # im.jpg
207
+ txt_path = str(save_dir / "labels" / p.stem) + (
208
+ "" if dataset.mode == "image" else f"_{frame}"
209
+ ) # im.txt
210
+ s += "%gx%g " % im.shape[2:] # print string
211
+ gn = torch.tensor(im0.shape)[
212
+ [1, 0, 1, 0]
213
+ ] # normalization gain whwh
214
+ imc = im0.copy() if save_crop else im0 # for save_crop
215
+ annotator = Annotator(
216
+ im0, line_width=line_thickness, example=str(names)
217
+ )
218
+ if len(det):
219
+ # Rescale boxes from img_size to im0 size
220
+ det[:, :4] = scale_boxes(
221
+ im.shape[2:], det[:, :4], im0.shape
222
+ ).round()
223
+
224
+ # Print results
225
+ for c in det[:, 5].unique():
226
+ n = (det[:, 5] == c).sum() # detections per class
227
+ s += f"{n} {names[int(c)]}{'s' * (n > 1)}, " # add to string
228
+
229
+ # Write results
230
+ for *xyxy, conf, cls in reversed(det):
231
+ if save_txt: # Write to file
232
+ xywh = (
233
+ (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn)
234
+ .view(-1)
235
+ .tolist()
236
+ ) # normalized xywh
237
+ line = (
238
+ (cls, *xywh, conf) if save_conf else (cls, *xywh)
239
+ ) # label format
240
+ with open(f"{txt_path}.txt", "a") as f:
241
+ f.write(("%g " * len(line)).rstrip() % line + "\n")
242
+
243
+ if save_img or save_crop or view_img: # Add bbox to image
244
+ c = int(cls) # integer class
245
+ label = (
246
+ None
247
+ if hide_labels
248
+ else (
249
+ names[c]
250
+ if hide_conf
251
+ else f"{names[c]} {conf:.2f}"
252
+ )
253
+ )
254
+ annotator.box_label(xyxy, label, color=colors(c, True))
255
+ if save_crop:
256
+ save_one_box(
257
+ xyxy,
258
+ imc,
259
+ file=save_dir
260
+ / "crops"
261
+ / names[c]
262
+ / f"{p.stem}.jpg",
263
+ BGR=True,
264
+ )
265
+
266
+ # Stream results
267
+ im0 = annotator.result()
268
+ if view_img:
269
+ if platform.system() == "Linux" and p not in windows:
270
+ windows.append(p)
271
+ cv2.namedWindow(
272
+ str(p), cv2.WINDOW_NORMAL | cv2.WINDOW_KEEPRATIO
273
+ ) # allow window resize (Linux)
274
+ cv2.resizeWindow(str(p), im0.shape[1], im0.shape[0])
275
+ cv2.imshow(str(p), im0)
276
+ cv2.waitKey(1) # 1 millisecond
277
+
278
+ # Save results (image with detections)
279
+ if save_img:
280
+ if dataset.mode == "image":
281
+ cv2.imwrite(save_path, im0)
282
+ else: # 'video' or 'stream'
283
+ if vid_path[i] != save_path: # new video
284
+ vid_path[i] = save_path
285
+ if isinstance(vid_writer[i], cv2.VideoWriter):
286
+ vid_writer[
287
+ i
288
+ ].release() # release previous video writer
289
+ if vid_cap: # video
290
+ fps = vid_cap.get(cv2.CAP_PROP_FPS)
291
+ w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
292
+ h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
293
+ else: # stream
294
+ fps, w, h = 30, im0.shape[1], im0.shape[0]
295
+ save_path = str(
296
+ Path(save_path).with_suffix(".mp4")
297
+ ) # force *.mp4 suffix on results videos
298
+ vid_writer[i] = cv2.VideoWriter(
299
+ save_path,
300
+ cv2.VideoWriter_fourcc(*"mp4v"),
301
+ fps,
302
+ (w, h),
303
+ )
304
+ vid_writer[i].write(im0)
305
+
306
+ # Print time (inference-only)
307
+ LOGGER.info(
308
+ f"{s}{'' if len(det) else '(no detections), '}{dt[1].dt * 1E3:.1f}ms"
309
+ )
310
+
311
+ # Print results
312
+ t = tuple(x.t / seen * 1e3 for x in dt) # speeds per image
313
+ LOGGER.info(
314
+ f"Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {(1, 3, *imgsz)}"
315
+ % t
316
+ )
317
+ if save_txt or save_img:
318
+ s = (
319
+ f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}"
320
+ if save_txt
321
+ else ""
322
+ )
323
+ LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}{s}")
324
+ if update:
325
+ strip_optimizer(
326
+ weights[0]
327
+ ) # update model (to fix SourceChangeWarning)
328
+
329
+
330
+ def parse_opt():
331
+ parser = argparse.ArgumentParser()
332
+ parser.add_argument(
333
+ "--weights",
334
+ nargs="+",
335
+ type=str,
336
+ default=ROOT / "yolov5s.pt",
337
+ help="model path or triton URL",
338
+ )
339
+ parser.add_argument(
340
+ "--source",
341
+ type=str,
342
+ default=ROOT / "data/images",
343
+ help="file/dir/URL/glob/screen/0(webcam)",
344
+ )
345
+ parser.add_argument(
346
+ "--data",
347
+ type=str,
348
+ default=ROOT / "data/coco128.yaml",
349
+ help="(optional) dataset.yaml path",
350
+ )
351
+ parser.add_argument(
352
+ "--imgsz",
353
+ "--img",
354
+ "--img-size",
355
+ nargs="+",
356
+ type=int,
357
+ default=[640],
358
+ help="inference size h,w",
359
+ )
360
+ parser.add_argument(
361
+ "--conf-thres", type=float, default=0.25, help="confidence threshold"
362
+ )
363
+ parser.add_argument(
364
+ "--iou-thres", type=float, default=0.45, help="NMS IoU threshold"
365
+ )
366
+ parser.add_argument(
367
+ "--max-det",
368
+ type=int,
369
+ default=1000,
370
+ help="maximum detections per image",
371
+ )
372
+ parser.add_argument(
373
+ "--device", default="", help="cuda device, i.e. 0 or 0,1,2,3 or cpu"
374
+ )
375
+ parser.add_argument("--view-img", action="store_true", help="show results")
376
+ parser.add_argument(
377
+ "--save-txt", action="store_true", help="save results to *.txt"
378
+ )
379
+ parser.add_argument(
380
+ "--save-conf",
381
+ action="store_true",
382
+ help="save confidences in --save-txt labels",
383
+ )
384
+ parser.add_argument(
385
+ "--save-crop",
386
+ action="store_true",
387
+ help="save cropped prediction boxes",
388
+ )
389
+ parser.add_argument(
390
+ "--nosave", action="store_true", help="do not save images/videos"
391
+ )
392
+ parser.add_argument(
393
+ "--classes",
394
+ nargs="+",
395
+ type=int,
396
+ help="filter by class: --classes 0, or --classes 0 2 3",
397
+ )
398
+ parser.add_argument(
399
+ "--agnostic-nms", action="store_true", help="class-agnostic NMS"
400
+ )
401
+ parser.add_argument(
402
+ "--augment", action="store_true", help="augmented inference"
403
+ )
404
+ parser.add_argument(
405
+ "--visualize", action="store_true", help="visualize features"
406
+ )
407
+ parser.add_argument(
408
+ "--update", action="store_true", help="update all models"
409
+ )
410
+ parser.add_argument(
411
+ "--project",
412
+ default=ROOT / "runs/detect",
413
+ help="save results to project/name",
414
+ )
415
+ parser.add_argument(
416
+ "--name", default="exp", help="save results to project/name"
417
+ )
418
+ parser.add_argument(
419
+ "--exist-ok",
420
+ action="store_true",
421
+ help="existing project/name ok, do not increment",
422
+ )
423
+ parser.add_argument(
424
+ "--line-thickness",
425
+ default=3,
426
+ type=int,
427
+ help="bounding box thickness (pixels)",
428
+ )
429
+ parser.add_argument(
430
+ "--hide-labels", default=False, action="store_true", help="hide labels"
431
+ )
432
+ parser.add_argument(
433
+ "--hide-conf",
434
+ default=False,
435
+ action="store_true",
436
+ help="hide confidences",
437
+ )
438
+ parser.add_argument(
439
+ "--half", action="store_true", help="use FP16 half-precision inference"
440
+ )
441
+ parser.add_argument(
442
+ "--dnn", action="store_true", help="use OpenCV DNN for ONNX inference"
443
+ )
444
+ parser.add_argument(
445
+ "--vid-stride", type=int, default=1, help="video frame-rate stride"
446
+ )
447
+ opt = parser.parse_args()
448
+ opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1 # expand
449
+ print_args(vars(opt))
450
+ return opt
451
+
452
+
453
+ def main(opt):
454
+ check_requirements(exclude=("tensorboard", "thop"))
455
+ run(**vars(opt))
456
+
457
+
458
+ if __name__ == "__main__":
459
+ opt = parse_opt()
460
+ main(opt)
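Besides the CLI entry point, `run()` can also be called directly from Python. A minimal sketch, where the weights file and source directory are assumptions about your local checkout:

```python
# Programmatic use of detect.run(); assumes yolov5s.pt and data/images exist locally
# and that this is executed from the yolov5 repo root.
from detect import run

run(weights='yolov5s.pt',    # model checkpoint
    source='data/images',    # directory of images to run on
    conf_thres=0.4,          # raise the confidence threshold
    save_txt=True)           # also write YOLO-format label files
```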
export.py ADDED
@@ -0,0 +1,1013 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ """
3
+ Export a YOLOv5 PyTorch model to other formats. TensorFlow exports authored by https://github.com/zldrobit
4
+
5
+ Format | `export.py --include` | Model
6
+ --- | --- | ---
7
+ PyTorch | - | yolov5s.pt
8
+ TorchScript | `torchscript` | yolov5s.torchscript
9
+ ONNX | `onnx` | yolov5s.onnx
10
+ OpenVINO | `openvino` | yolov5s_openvino_model/
11
+ TensorRT | `engine` | yolov5s.engine
12
+ CoreML | `coreml` | yolov5s.mlmodel
13
+ TensorFlow SavedModel | `saved_model` | yolov5s_saved_model/
14
+ TensorFlow GraphDef | `pb` | yolov5s.pb
15
+ TensorFlow Lite | `tflite` | yolov5s.tflite
16
+ TensorFlow Edge TPU | `edgetpu` | yolov5s_edgetpu.tflite
17
+ TensorFlow.js | `tfjs` | yolov5s_web_model/
18
+ PaddlePaddle | `paddle` | yolov5s_paddle_model/
19
+
20
+ Requirements:
21
+ $ pip install -r requirements.txt coremltools onnx onnx-simplifier onnxruntime openvino-dev tensorflow-cpu # CPU
22
+ $ pip install -r requirements.txt coremltools onnx onnx-simplifier onnxruntime-gpu openvino-dev tensorflow # GPU
23
+
24
+ Usage:
25
+ $ python export.py --weights yolov5s.pt --include torchscript onnx openvino engine coreml tflite ...
26
+
27
+ Inference:
28
+ $ python detect.py --weights yolov5s.pt # PyTorch
29
+ yolov5s.torchscript # TorchScript
30
+ yolov5s.onnx # ONNX Runtime or OpenCV DNN with --dnn
31
+ yolov5s_openvino_model # OpenVINO
32
+ yolov5s.engine # TensorRT
33
+ yolov5s.mlmodel # CoreML (macOS-only)
34
+ yolov5s_saved_model # TensorFlow SavedModel
35
+ yolov5s.pb # TensorFlow GraphDef
36
+ yolov5s.tflite # TensorFlow Lite
37
+ yolov5s_edgetpu.tflite # TensorFlow Edge TPU
38
+ yolov5s_paddle_model # PaddlePaddle
39
+
40
+ TensorFlow.js:
41
+ $ cd .. && git clone https://github.com/zldrobit/tfjs-yolov5-example.git && cd tfjs-yolov5-example
42
+ $ npm install
43
+ $ ln -s ../../yolov5/yolov5s_web_model public/yolov5s_web_model
44
+ $ npm start
45
+ """
46
+
47
+ import argparse
48
+ import contextlib
49
+ import json
50
+ import os
51
+ import platform
52
+ import re
53
+ import subprocess
54
+ import sys
55
+ import time
56
+ import warnings
57
+ from pathlib import Path
58
+
59
+ import pandas as pd
60
+ import torch
61
+ from torch.utils.mobile_optimizer import optimize_for_mobile
62
+
63
+ FILE = Path(__file__).resolve()
64
+ ROOT = FILE.parents[0] # YOLOv5 root directory
65
+ if str(ROOT) not in sys.path:
66
+ sys.path.append(str(ROOT)) # add ROOT to PATH
67
+ if platform.system() != "Windows":
68
+ ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
69
+
70
+ from models.experimental import attempt_load
71
+ from models.yolo import ClassificationModel, Detect, DetectionModel, SegmentationModel
72
+ from utils.dataloaders import LoadImages
73
+ from utils.general import (
74
+ LOGGER,
75
+ Profile,
76
+ check_dataset,
77
+ check_img_size,
78
+ check_requirements,
79
+ check_version,
80
+ check_yaml,
81
+ colorstr,
82
+ file_size,
83
+ get_default_args,
84
+ print_args,
85
+ url2file,
86
+ yaml_save,
87
+ )
88
+ from utils.torch_utils import select_device, smart_inference_mode
89
+
90
+ MACOS = platform.system() == "Darwin" # macOS environment
91
+
92
+
93
+ def export_formats():
94
+ # YOLOv5 export formats
95
+ x = [
96
+ ["PyTorch", "-", ".pt", True, True],
97
+ ["TorchScript", "torchscript", ".torchscript", True, True],
98
+ ["ONNX", "onnx", ".onnx", True, True],
99
+ ["OpenVINO", "openvino", "_openvino_model", True, False],
100
+ ["TensorRT", "engine", ".engine", False, True],
101
+ ["CoreML", "coreml", ".mlmodel", True, False],
102
+ ["TensorFlow SavedModel", "saved_model", "_saved_model", True, True],
103
+ ["TensorFlow GraphDef", "pb", ".pb", True, True],
104
+ ["TensorFlow Lite", "tflite", ".tflite", True, False],
105
+ ["TensorFlow Edge TPU", "edgetpu", "_edgetpu.tflite", False, False],
106
+ ["TensorFlow.js", "tfjs", "_web_model", False, False],
107
+ ["PaddlePaddle", "paddle", "_paddle_model", True, True],
108
+ ]
109
+ return pd.DataFrame(
110
+ x, columns=["Format", "Argument", "Suffix", "CPU", "GPU"]
111
+ )
112
+
113
+
114
+ def try_export(inner_func):
115
+ # YOLOv5 export decorator, i.e. @try_export
116
+ inner_args = get_default_args(inner_func)
117
+
118
+ def outer_func(*args, **kwargs):
119
+ prefix = inner_args["prefix"]
120
+ try:
121
+ with Profile() as dt:
122
+ f, model = inner_func(*args, **kwargs)
123
+ LOGGER.info(
124
+ f"{prefix} export success ✅ {dt.t:.1f}s, saved as {f} ({file_size(f):.1f} MB)"
125
+ )
126
+ return f, model
127
+ except Exception as e:
128
+ LOGGER.info(f"{prefix} export failure ❌ {dt.t:.1f}s: {e}")
129
+ return None, None
130
+
131
+ return outer_func
132
+
133
+
134
+ @try_export
135
+ def export_torchscript(
136
+ model, im, file, optimize, prefix=colorstr("TorchScript:")
137
+ ):
138
+ # YOLOv5 TorchScript model export
139
+ LOGGER.info(
140
+ f"\n{prefix} starting export with torch {torch.__version__}..."
141
+ )
142
+ f = file.with_suffix(".torchscript")
143
+
144
+ ts = torch.jit.trace(model, im, strict=False)
145
+ d = {
146
+ "shape": im.shape,
147
+ "stride": int(max(model.stride)),
148
+ "names": model.names,
149
+ }
150
+ extra_files = {"config.txt": json.dumps(d)} # torch._C.ExtraFilesMap()
151
+ if (
152
+ optimize
153
+ ): # https://pytorch.org/tutorials/recipes/mobile_interpreter.html
154
+ optimize_for_mobile(ts)._save_for_lite_interpreter(
155
+ str(f), _extra_files=extra_files
156
+ )
157
+ else:
158
+ ts.save(str(f), _extra_files=extra_files)
159
+ return f, None
160
+
161
+
162
+ @try_export
163
+ def export_onnx(
164
+ model, im, file, opset, dynamic, simplify, prefix=colorstr("ONNX:")
165
+ ):
166
+ # YOLOv5 ONNX export
167
+ check_requirements("onnx>=1.12.0")
168
+ import onnx
169
+
170
+ LOGGER.info(f"\n{prefix} starting export with onnx {onnx.__version__}...")
171
+ f = file.with_suffix(".onnx")
172
+
173
+ output_names = (
174
+ ["output0", "output1"]
175
+ if isinstance(model, SegmentationModel)
176
+ else ["output0"]
177
+ )
178
+ if dynamic:
179
+ dynamic = {
180
+ "images": {0: "batch", 2: "height", 3: "width"}
181
+ } # shape(1,3,640,640)
182
+ if isinstance(model, SegmentationModel):
183
+ dynamic["output0"] = {
184
+ 0: "batch",
185
+ 1: "anchors",
186
+ } # shape(1,25200,85)
187
+ dynamic["output1"] = {
188
+ 0: "batch",
189
+ 2: "mask_height",
190
+ 3: "mask_width",
191
+ } # shape(1,32,160,160)
192
+ elif isinstance(model, DetectionModel):
193
+ dynamic["output0"] = {
194
+ 0: "batch",
195
+ 1: "anchors",
196
+ } # shape(1,25200,85)
197
+
198
+ torch.onnx.export(
199
+ model.cpu()
200
+ if dynamic
201
+ else model, # --dynamic only compatible with cpu
202
+ im.cpu() if dynamic else im,
203
+ f,
204
+ verbose=False,
205
+ opset_version=opset,
206
+ do_constant_folding=True, # WARNING: DNN inference with torch>=1.12 may require do_constant_folding=False
207
+ input_names=["images"],
208
+ output_names=output_names,
209
+ dynamic_axes=dynamic or None,
210
+ )
211
+
212
+ # Checks
213
+ model_onnx = onnx.load(f) # load onnx model
214
+ onnx.checker.check_model(model_onnx) # check onnx model
215
+
216
+ # Metadata
217
+ d = {"stride": int(max(model.stride)), "names": model.names}
218
+ for k, v in d.items():
219
+ meta = model_onnx.metadata_props.add()
220
+ meta.key, meta.value = k, str(v)
221
+ onnx.save(model_onnx, f)
222
+
223
+ # Simplify
224
+ if simplify:
225
+ try:
226
+ cuda = torch.cuda.is_available()
227
+ check_requirements(
228
+ (
229
+ "onnxruntime-gpu" if cuda else "onnxruntime",
230
+ "onnx-simplifier>=0.4.1",
231
+ )
232
+ )
233
+ import onnxsim
234
+
235
+ LOGGER.info(
236
+ f"{prefix} simplifying with onnx-simplifier {onnxsim.__version__}..."
237
+ )
238
+ model_onnx, check = onnxsim.simplify(model_onnx)
239
+ assert check, "assert check failed"
240
+ onnx.save(model_onnx, f)
241
+ except Exception as e:
242
+ LOGGER.info(f"{prefix} simplifier failure: {e}")
243
+ return f, model_onnx
244
+
245
+
246
+ @try_export
247
+ def export_openvino(file, metadata, half, prefix=colorstr("OpenVINO:")):
248
+ # YOLOv5 OpenVINO export
249
+ check_requirements(
250
+ "openvino-dev"
251
+ ) # requires openvino-dev: https://pypi.org/project/openvino-dev/
252
+ import openvino.inference_engine as ie
253
+
254
+ LOGGER.info(
255
+ f"\n{prefix} starting export with openvino {ie.__version__}..."
256
+ )
257
+ f = str(file).replace(".pt", f"_openvino_model{os.sep}")
258
+
259
+ cmd = f"mo --input_model {file.with_suffix('.onnx')} --output_dir {f} --data_type {'FP16' if half else 'FP32'}"
260
+ subprocess.run(cmd.split(), check=True, env=os.environ) # export
261
+ yaml_save(
262
+ Path(f) / file.with_suffix(".yaml").name, metadata
263
+ ) # add metadata.yaml
264
+ return f, None
265
+
266
+
267
+ @try_export
268
+ def export_paddle(model, im, file, metadata, prefix=colorstr("PaddlePaddle:")):
269
+ # YOLOv5 Paddle export
270
+ check_requirements(("paddlepaddle", "x2paddle"))
271
+ import x2paddle
272
+ from x2paddle.convert import pytorch2paddle
273
+
274
+ LOGGER.info(
275
+ f"\n{prefix} starting export with X2Paddle {x2paddle.__version__}..."
276
+ )
277
+ f = str(file).replace(".pt", f"_paddle_model{os.sep}")
278
+
279
+ pytorch2paddle(
280
+ module=model, save_dir=f, jit_type="trace", input_examples=[im]
281
+ ) # export
282
+ yaml_save(
283
+ Path(f) / file.with_suffix(".yaml").name, metadata
284
+ ) # add metadata.yaml
285
+ return f, None
286
+
287
+
288
+ @try_export
289
+ def export_coreml(model, im, file, int8, half, prefix=colorstr("CoreML:")):
290
+ # YOLOv5 CoreML export
291
+ check_requirements("coremltools")
292
+ import coremltools as ct
293
+
294
+ LOGGER.info(
295
+ f"\n{prefix} starting export with coremltools {ct.__version__}..."
296
+ )
297
+ f = file.with_suffix(".mlmodel")
298
+
299
+ ts = torch.jit.trace(model, im, strict=False) # TorchScript model
300
+ ct_model = ct.convert(
301
+ ts,
302
+ inputs=[
303
+ ct.ImageType(
304
+ "image", shape=im.shape, scale=1 / 255, bias=[0, 0, 0]
305
+ )
306
+ ],
307
+ )
308
+ bits, mode = (
309
+ (8, "kmeans_lut") if int8 else (16, "linear") if half else (32, None)
310
+ )
311
+ if bits < 32:
312
+ if MACOS: # quantization only supported on macOS
313
+ with warnings.catch_warnings():
314
+ warnings.filterwarnings(
315
+ "ignore", category=DeprecationWarning
316
+ ) # suppress numpy==1.20 float warning
317
+ ct_model = ct.models.neural_network.quantization_utils.quantize_weights(
318
+ ct_model, bits, mode
319
+ )
320
+ else:
321
+ print(
322
+ f"{prefix} quantization only supported on macOS, skipping..."
323
+ )
324
+ ct_model.save(f)
325
+ return f, ct_model
326
+
327
+
328
+ @try_export
329
+ def export_engine(
330
+ model,
331
+ im,
332
+ file,
333
+ half,
334
+ dynamic,
335
+ simplify,
336
+ workspace=4,
337
+ verbose=False,
338
+ prefix=colorstr("TensorRT:"),
339
+ ):
340
+ # YOLOv5 TensorRT export https://developer.nvidia.com/tensorrt
341
+ assert (
342
+ im.device.type != "cpu"
343
+ ), "export running on CPU but must be on GPU, i.e. `python export.py --device 0`"
344
+ try:
345
+ import tensorrt as trt
346
+ except Exception:
347
+ if platform.system() == "Linux":
348
+ check_requirements(
349
+ "nvidia-tensorrt",
350
+ cmds="-U --index-url https://pypi.ngc.nvidia.com",
351
+ )
352
+ import tensorrt as trt
353
+
354
+ if (
355
+ trt.__version__[0] == "7"
356
+ ): # TensorRT 7 handling https://github.com/ultralytics/yolov5/issues/6012
357
+ grid = model.model[-1].anchor_grid
358
+ model.model[-1].anchor_grid = [a[..., :1, :1, :] for a in grid]
359
+ export_onnx(model, im, file, 12, dynamic, simplify) # opset 12
360
+ model.model[-1].anchor_grid = grid
361
+ else: # TensorRT >= 8
362
+ check_version(
363
+ trt.__version__, "8.0.0", hard=True
364
+ ) # require tensorrt>=8.0.0
365
+ export_onnx(model, im, file, 12, dynamic, simplify) # opset 12
366
+ onnx = file.with_suffix(".onnx")
367
+
368
+ LOGGER.info(
369
+ f"\n{prefix} starting export with TensorRT {trt.__version__}..."
370
+ )
371
+ assert onnx.exists(), f"failed to export ONNX file: {onnx}"
372
+ f = file.with_suffix(".engine") # TensorRT engine file
373
+ logger = trt.Logger(trt.Logger.INFO)
374
+ if verbose:
375
+ logger.min_severity = trt.Logger.Severity.VERBOSE
376
+
377
+ builder = trt.Builder(logger)
378
+ config = builder.create_builder_config()
379
+ config.max_workspace_size = workspace * 1 << 30
380
+ # config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, workspace << 30) # fix TRT 8.4 deprecation notice
381
+
382
+ flag = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
383
+ network = builder.create_network(flag)
384
+ parser = trt.OnnxParser(network, logger)
385
+ if not parser.parse_from_file(str(onnx)):
386
+ raise RuntimeError(f"failed to load ONNX file: {onnx}")
387
+
388
+ inputs = [network.get_input(i) for i in range(network.num_inputs)]
389
+ outputs = [network.get_output(i) for i in range(network.num_outputs)]
390
+ for inp in inputs:
391
+ LOGGER.info(
392
+ f'{prefix} input "{inp.name}" with shape{inp.shape} {inp.dtype}'
393
+ )
394
+ for out in outputs:
395
+ LOGGER.info(
396
+ f'{prefix} output "{out.name}" with shape{out.shape} {out.dtype}'
397
+ )
398
+
399
+ if dynamic:
400
+ if im.shape[0] <= 1:
401
+ LOGGER.warning(
402
+ f"{prefix} WARNING ⚠️ --dynamic model requires maximum --batch-size argument"
403
+ )
404
+ profile = builder.create_optimization_profile()
405
+ for inp in inputs:
406
+ profile.set_shape(
407
+ inp.name,
408
+ (1, *im.shape[1:]),
409
+ (max(1, im.shape[0] // 2), *im.shape[1:]),
410
+ im.shape,
411
+ )
412
+ config.add_optimization_profile(profile)
413
+
414
+ LOGGER.info(
415
+ f"{prefix} building FP{16 if builder.platform_has_fast_fp16 and half else 32} engine as {f}"
416
+ )
417
+ if builder.platform_has_fast_fp16 and half:
418
+ config.set_flag(trt.BuilderFlag.FP16)
419
+ with builder.build_engine(network, config) as engine, open(f, "wb") as t:
420
+ t.write(engine.serialize())
421
+ return f, None
422
+
423
+
424
+ @try_export
425
+ def export_saved_model(
426
+ model,
427
+ im,
428
+ file,
429
+ dynamic,
430
+ tf_nms=False,
431
+ agnostic_nms=False,
432
+ topk_per_class=100,
433
+ topk_all=100,
434
+ iou_thres=0.45,
435
+ conf_thres=0.25,
436
+ keras=False,
437
+ prefix=colorstr("TensorFlow SavedModel:"),
438
+ ):
439
+ # YOLOv5 TensorFlow SavedModel export
440
+ try:
441
+ import tensorflow as tf
442
+ except Exception:
443
+ check_requirements(
444
+ f"tensorflow{'' if torch.cuda.is_available() else '-macos' if MACOS else '-cpu'}"
445
+ )
446
+ import tensorflow as tf
447
+ from tensorflow.python.framework.convert_to_constants import (
448
+ convert_variables_to_constants_v2,
449
+ )
450
+
451
+ from models.tf import TFModel
452
+
453
+ LOGGER.info(
454
+ f"\n{prefix} starting export with tensorflow {tf.__version__}..."
455
+ )
456
+ f = str(file).replace(".pt", "_saved_model")
457
+ batch_size, ch, *imgsz = list(im.shape) # BCHW
458
+
459
+ tf_model = TFModel(cfg=model.yaml, model=model, nc=model.nc, imgsz=imgsz)
460
+ im = tf.zeros((batch_size, *imgsz, ch)) # BHWC order for TensorFlow
461
+ _ = tf_model.predict(
462
+ im,
463
+ tf_nms,
464
+ agnostic_nms,
465
+ topk_per_class,
466
+ topk_all,
467
+ iou_thres,
468
+ conf_thres,
469
+ )
470
+ inputs = tf.keras.Input(
471
+ shape=(*imgsz, ch), batch_size=None if dynamic else batch_size
472
+ )
473
+ outputs = tf_model.predict(
474
+ inputs,
475
+ tf_nms,
476
+ agnostic_nms,
477
+ topk_per_class,
478
+ topk_all,
479
+ iou_thres,
480
+ conf_thres,
481
+ )
482
+ keras_model = tf.keras.Model(inputs=inputs, outputs=outputs)
483
+ keras_model.trainable = False
484
+ keras_model.summary()
485
+ if keras:
486
+ keras_model.save(f, save_format="tf")
487
+ else:
488
+ spec = tf.TensorSpec(
489
+ keras_model.inputs[0].shape, keras_model.inputs[0].dtype
490
+ )
491
+ m = tf.function(lambda x: keras_model(x)) # full model
492
+ m = m.get_concrete_function(spec)
493
+ frozen_func = convert_variables_to_constants_v2(m)
494
+ tfm = tf.Module()
495
+ tfm.__call__ = tf.function(
496
+ lambda x: frozen_func(x)[:4] if tf_nms else frozen_func(x), [spec]
497
+ )
498
+ tfm.__call__(im)
499
+ tf.saved_model.save(
500
+ tfm,
501
+ f,
502
+ options=tf.saved_model.SaveOptions(
503
+ experimental_custom_gradients=False
504
+ )
505
+ if check_version(tf.__version__, "2.6")
506
+ else tf.saved_model.SaveOptions(),
507
+ )
508
+ return f, keras_model
509
+
510
+
511
+ @try_export
512
+ def export_pb(keras_model, file, prefix=colorstr("TensorFlow GraphDef:")):
513
+ # YOLOv5 TensorFlow GraphDef *.pb export https://github.com/leimao/Frozen_Graph_TensorFlow
514
+ import tensorflow as tf
515
+ from tensorflow.python.framework.convert_to_constants import (
516
+ convert_variables_to_constants_v2,
517
+ )
518
+
519
+ LOGGER.info(
520
+ f"\n{prefix} starting export with tensorflow {tf.__version__}..."
521
+ )
522
+ f = file.with_suffix(".pb")
523
+
524
+ m = tf.function(lambda x: keras_model(x)) # full model
525
+ m = m.get_concrete_function(
526
+ tf.TensorSpec(keras_model.inputs[0].shape, keras_model.inputs[0].dtype)
527
+ )
528
+ frozen_func = convert_variables_to_constants_v2(m)
529
+ frozen_func.graph.as_graph_def()
530
+ tf.io.write_graph(
531
+ graph_or_graph_def=frozen_func.graph,
532
+ logdir=str(f.parent),
533
+ name=f.name,
534
+ as_text=False,
535
+ )
536
+ return f, None
537
+
538
+
539
+ @try_export
540
+ def export_tflite(
541
+ keras_model,
542
+ im,
543
+ file,
544
+ int8,
545
+ data,
546
+ nms,
547
+ agnostic_nms,
548
+ prefix=colorstr("TensorFlow Lite:"),
549
+ ):
550
+ # YOLOv5 TensorFlow Lite export
551
+ import tensorflow as tf
552
+
553
+ LOGGER.info(
554
+ f"\n{prefix} starting export with tensorflow {tf.__version__}..."
555
+ )
556
+ batch_size, ch, *imgsz = list(im.shape) # BCHW
557
+ f = str(file).replace(".pt", "-fp16.tflite")
558
+
559
+ converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
560
+ converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]
561
+ converter.target_spec.supported_types = [tf.float16]
562
+ converter.optimizations = [tf.lite.Optimize.DEFAULT]
563
+ if int8:
564
+ from models.tf import representative_dataset_gen
565
+
566
+ dataset = LoadImages(
567
+ check_dataset(check_yaml(data))["train"],
568
+ img_size=imgsz,
569
+ auto=False,
570
+ )
571
+ converter.representative_dataset = lambda: representative_dataset_gen(
572
+ dataset, ncalib=100
573
+ )
574
+ converter.target_spec.supported_ops = [
575
+ tf.lite.OpsSet.TFLITE_BUILTINS_INT8
576
+ ]
577
+ converter.target_spec.supported_types = []
578
+ converter.inference_input_type = tf.uint8 # or tf.int8
579
+ converter.inference_output_type = tf.uint8 # or tf.int8
580
+ converter.experimental_new_quantizer = True
581
+ f = str(file).replace(".pt", "-int8.tflite")
582
+ if nms or agnostic_nms:
583
+ converter.target_spec.supported_ops.append(
584
+ tf.lite.OpsSet.SELECT_TF_OPS
585
+ )
586
+
587
+ tflite_model = converter.convert()
588
+ open(f, "wb").write(tflite_model)
589
+ return f, None
590
+
591
+
592
+ @try_export
593
+ def export_edgetpu(file, prefix=colorstr("Edge TPU:")):
594
+ # YOLOv5 Edge TPU export https://coral.ai/docs/edgetpu/models-intro/
595
+ cmd = "edgetpu_compiler --version"
596
+ help_url = "https://coral.ai/docs/edgetpu/compiler/"
597
+ assert (
598
+ platform.system() == "Linux"
599
+ ), f"export only supported on Linux. See {help_url}"
600
+ if subprocess.run(f"{cmd} >/dev/null", shell=True).returncode != 0:
601
+ LOGGER.info(
602
+ f"\n{prefix} export requires Edge TPU compiler. Attempting install from {help_url}"
603
+ )
604
+ sudo = (
605
+ subprocess.run("sudo --version >/dev/null", shell=True).returncode
606
+ == 0
607
+ ) # sudo installed on system
608
+ for c in (
609
+ "curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -",
610
+ 'echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list',
611
+ "sudo apt-get update",
612
+ "sudo apt-get install edgetpu-compiler",
613
+ ):
614
+ subprocess.run(
615
+ c if sudo else c.replace("sudo ", ""), shell=True, check=True
616
+ )
617
+ ver = (
618
+ subprocess.run(cmd, shell=True, capture_output=True, check=True)
619
+ .stdout.decode()
620
+ .split()[-1]
621
+ )
622
+
623
+ LOGGER.info(f"\n{prefix} starting export with Edge TPU compiler {ver}...")
624
+ f = str(file).replace(".pt", "-int8_edgetpu.tflite") # Edge TPU model
625
+ f_tfl = str(file).replace(".pt", "-int8.tflite") # TFLite model
626
+
627
+ cmd = f"edgetpu_compiler -s -d -k 10 --out_dir {file.parent} {f_tfl}"
628
+ subprocess.run(cmd.split(), check=True)
629
+ return f, None
630
+
631
+
632
+ @try_export
633
+ def export_tfjs(file, prefix=colorstr("TensorFlow.js:")):
634
+ # YOLOv5 TensorFlow.js export
635
+ check_requirements("tensorflowjs")
636
+ import tensorflowjs as tfjs
637
+
638
+ LOGGER.info(
639
+ f"\n{prefix} starting export with tensorflowjs {tfjs.__version__}..."
640
+ )
641
+ f = str(file).replace(".pt", "_web_model") # js dir
642
+ f_pb = file.with_suffix(".pb") # *.pb path
643
+ f_json = f"{f}/model.json" # *.json path
644
+
645
+ cmd = (
646
+ f"tensorflowjs_converter --input_format=tf_frozen_model "
647
+ f"--output_node_names=Identity,Identity_1,Identity_2,Identity_3 {f_pb} {f}"
648
+ )
649
+ subprocess.run(cmd.split())
650
+
651
+ json = Path(f_json).read_text()
652
+ with open(f_json, "w") as j: # sort JSON Identity_* in ascending order
653
+ subst = re.sub(
654
+ r'{"outputs": {"Identity.?.?": {"name": "Identity.?.?"}, '
655
+ r'"Identity.?.?": {"name": "Identity.?.?"}, '
656
+ r'"Identity.?.?": {"name": "Identity.?.?"}, '
657
+ r'"Identity.?.?": {"name": "Identity.?.?"}}}',
658
+ r'{"outputs": {"Identity": {"name": "Identity"}, '
659
+ r'"Identity_1": {"name": "Identity_1"}, '
660
+ r'"Identity_2": {"name": "Identity_2"}, '
661
+ r'"Identity_3": {"name": "Identity_3"}}}',
662
+ json,
663
+ )
664
+ j.write(subst)
665
+ return f, None
666
+
667
+
668
+ def add_tflite_metadata(file, metadata, num_outputs):
669
+ # Add metadata to *.tflite models per https://www.tensorflow.org/lite/models/convert/metadata
670
+ with contextlib.suppress(ImportError):
671
+ # check_requirements('tflite_support')
672
+ from tflite_support import flatbuffers
673
+ from tflite_support import metadata as _metadata
674
+ from tflite_support import metadata_schema_py_generated as _metadata_fb
675
+
676
+ tmp_file = Path("/tmp/meta.txt")
677
+ with open(tmp_file, "w") as meta_f:
678
+ meta_f.write(str(metadata))
679
+
680
+ model_meta = _metadata_fb.ModelMetadataT()
681
+ label_file = _metadata_fb.AssociatedFileT()
682
+ label_file.name = tmp_file.name
683
+ model_meta.associatedFiles = [label_file]
684
+
685
+ subgraph = _metadata_fb.SubGraphMetadataT()
686
+ subgraph.inputTensorMetadata = [_metadata_fb.TensorMetadataT()]
687
+ subgraph.outputTensorMetadata = [
688
+ _metadata_fb.TensorMetadataT()
689
+ ] * num_outputs
690
+ model_meta.subgraphMetadata = [subgraph]
691
+
692
+ b = flatbuffers.Builder(0)
693
+ b.Finish(
694
+ model_meta.Pack(b),
695
+ _metadata.MetadataPopulator.METADATA_FILE_IDENTIFIER,
696
+ )
697
+ metadata_buf = b.Output()
698
+
699
+ populator = _metadata.MetadataPopulator.with_model_file(file)
700
+ populator.load_metadata_buffer(metadata_buf)
701
+ populator.load_associated_files([str(tmp_file)])
702
+ populator.populate()
703
+ tmp_file.unlink()
704
+
705
+
706
+ @smart_inference_mode()
707
+ def run(
708
+ data=ROOT / "data/coco128.yaml", # 'dataset.yaml path'
709
+ weights=ROOT / "yolov5s.pt", # weights path
710
+ imgsz=(640, 640), # image (height, width)
711
+ batch_size=1, # batch size
712
+ device="cpu", # cuda device, i.e. 0 or 0,1,2,3 or cpu
713
+ include=("torchscript", "onnx"), # include formats
714
+ half=False, # FP16 half-precision export
715
+ inplace=False, # set YOLOv5 Detect() inplace=True
716
+ keras=False, # use Keras
717
+ optimize=False, # TorchScript: optimize for mobile
718
+ int8=False, # CoreML/TF INT8 quantization
719
+ dynamic=False, # ONNX/TF/TensorRT: dynamic axes
720
+ simplify=False, # ONNX: simplify model
721
+ opset=12, # ONNX: opset version
722
+ verbose=False, # TensorRT: verbose log
723
+ workspace=4, # TensorRT: workspace size (GB)
724
+ nms=False, # TF: add NMS to model
725
+ agnostic_nms=False, # TF: add agnostic NMS to model
726
+ topk_per_class=100, # TF.js NMS: topk per class to keep
727
+ topk_all=100, # TF.js NMS: topk for all classes to keep
728
+ iou_thres=0.45, # TF.js NMS: IoU threshold
729
+ conf_thres=0.25, # TF.js NMS: confidence threshold
730
+ ):
731
+ t = time.time()
732
+ include = [x.lower() for x in include] # to lowercase
733
+ fmts = tuple(export_formats()["Argument"][1:]) # --include arguments
734
+ flags = [x in include for x in fmts]
735
+ assert sum(flags) == len(
736
+ include
737
+ ), f"ERROR: Invalid --include {include}, valid --include arguments are {fmts}"
738
+ (
739
+ jit,
740
+ onnx,
741
+ xml,
742
+ engine,
743
+ coreml,
744
+ saved_model,
745
+ pb,
746
+ tflite,
747
+ edgetpu,
748
+ tfjs,
749
+ paddle,
750
+ ) = flags # export booleans
751
+ file = Path(
752
+ url2file(weights)
753
+ if str(weights).startswith(("http:/", "https:/"))
754
+ else weights
755
+ ) # PyTorch weights
756
+
757
+ # Load PyTorch model
758
+ device = select_device(device)
759
+ if half:
760
+ assert (
761
+ device.type != "cpu" or coreml
762
+ ), "--half only compatible with GPU export, i.e. use --device 0"
763
+ assert (
764
+ not dynamic
765
+ ), "--half not compatible with --dynamic, i.e. use either --half or --dynamic but not both"
766
+ model = attempt_load(
767
+ weights, device=device, inplace=True, fuse=True
768
+ ) # load FP32 model
769
+
770
+ # Checks
771
+ imgsz *= 2 if len(imgsz) == 1 else 1 # expand
772
+ if optimize:
773
+ assert (
774
+ device.type == "cpu"
775
+ ), "--optimize not compatible with cuda devices, i.e. use --device cpu"
776
+
777
+ # Input
778
+ gs = int(max(model.stride)) # grid size (max stride)
779
+ imgsz = [
780
+ check_img_size(x, gs) for x in imgsz
781
+ ] # verify img_size are gs-multiples
782
+ im = torch.zeros(batch_size, 3, *imgsz).to(
783
+ device
784
+ ) # image size(1,3,320,192) BCHW iDetection
785
+
786
+ # Update model
787
+ model.eval()
788
+ for k, m in model.named_modules():
789
+ if isinstance(m, Detect):
790
+ m.inplace = inplace
791
+ m.dynamic = dynamic
792
+ m.export = True
793
+
794
+ for _ in range(2):
795
+ y = model(im) # dry runs
796
+ if half and not coreml:
797
+ im, model = im.half(), model.half() # to FP16
798
+ shape = tuple(
799
+ (y[0] if isinstance(y, tuple) else y).shape
800
+ ) # model output shape
801
+ metadata = {
802
+ "stride": int(max(model.stride)),
803
+ "names": model.names,
804
+ } # model metadata
805
+ LOGGER.info(
806
+ f"\n{colorstr('PyTorch:')} starting from {file} with output shape {shape} ({file_size(file):.1f} MB)"
807
+ )
808
+
809
+ # Exports
810
+ f = [""] * len(fmts) # exported filenames
811
+ warnings.filterwarnings(
812
+ action="ignore", category=torch.jit.TracerWarning
813
+ ) # suppress TracerWarning
814
+ if jit: # TorchScript
815
+ f[0], _ = export_torchscript(model, im, file, optimize)
816
+ if engine: # TensorRT required before ONNX
817
+ f[1], _ = export_engine(
818
+ model, im, file, half, dynamic, simplify, workspace, verbose
819
+ )
820
+ if onnx or xml: # OpenVINO requires ONNX
821
+ f[2], _ = export_onnx(model, im, file, opset, dynamic, simplify)
822
+ if xml: # OpenVINO
823
+ f[3], _ = export_openvino(file, metadata, half)
824
+ if coreml: # CoreML
825
+ f[4], _ = export_coreml(model, im, file, int8, half)
826
+ if any((saved_model, pb, tflite, edgetpu, tfjs)): # TensorFlow formats
827
+ assert (
828
+ not tflite or not tfjs
829
+ ), "TFLite and TF.js models must be exported separately, please pass only one type."
830
+ assert not isinstance(
831
+ model, ClassificationModel
832
+ ), "ClassificationModel export to TF formats not yet supported."
833
+ f[5], s_model = export_saved_model(
834
+ model.cpu(),
835
+ im,
836
+ file,
837
+ dynamic,
838
+ tf_nms=nms or agnostic_nms or tfjs,
839
+ agnostic_nms=agnostic_nms or tfjs,
840
+ topk_per_class=topk_per_class,
841
+ topk_all=topk_all,
842
+ iou_thres=iou_thres,
843
+ conf_thres=conf_thres,
844
+ keras=keras,
845
+ )
846
+ if pb or tfjs: # pb prerequisite to tfjs
847
+ f[6], _ = export_pb(s_model, file)
848
+ if tflite or edgetpu:
849
+ f[7], _ = export_tflite(
850
+ s_model,
851
+ im,
852
+ file,
853
+ int8 or edgetpu,
854
+ data=data,
855
+ nms=nms,
856
+ agnostic_nms=agnostic_nms,
857
+ )
858
+ if edgetpu:
859
+ f[8], _ = export_edgetpu(file)
860
+ add_tflite_metadata(
861
+ f[8] or f[7], metadata, num_outputs=len(s_model.outputs)
862
+ )
863
+ if tfjs:
864
+ f[9], _ = export_tfjs(file)
865
+ if paddle: # PaddlePaddle
866
+ f[10], _ = export_paddle(model, im, file, metadata)
867
+
868
+ # Finish
869
+ f = [str(x) for x in f if x] # filter out '' and None
870
+ if any(f):
871
+ cls, det, seg = (
872
+ isinstance(model, x)
873
+ for x in (ClassificationModel, DetectionModel, SegmentationModel)
874
+ ) # type
875
+ det &= (
876
+ not seg
877
+ ) # segmentation models inherit from SegmentationModel(DetectionModel)
878
+ dir = Path("segment" if seg else "classify" if cls else "")
879
+ h = "--half" if half else "" # --half FP16 inference arg
880
+ s = (
881
+ "# WARNING ⚠️ ClassificationModel not yet supported for PyTorch Hub AutoShape inference"
882
+ if cls
883
+ else "# WARNING ⚠️ SegmentationModel not yet supported for PyTorch Hub AutoShape inference"
884
+ if seg
885
+ else ""
886
+ )
887
+ LOGGER.info(
888
+ f"\nExport complete ({time.time() - t:.1f}s)"
889
+ f"\nResults saved to {colorstr('bold', file.parent.resolve())}"
890
+ f"\nDetect: python {dir / ('detect.py' if det else 'predict.py')} --weights {f[-1]} {h}"
891
+ f"\nValidate: python {dir / 'val.py'} --weights {f[-1]} {h}"
892
+ f"\nPyTorch Hub: model = torch.hub.load('ultralytics/yolov5', 'custom', '{f[-1]}') {s}"
893
+ f"\nVisualize: https://netron.app"
894
+ )
895
+ return f # return list of exported files/dirs
896
+
897
+
898
+ def parse_opt():
899
+ parser = argparse.ArgumentParser()
900
+ parser.add_argument(
901
+ "--data",
902
+ type=str,
903
+ default=ROOT / "data/coco128.yaml",
904
+ help="dataset.yaml path",
905
+ )
906
+ parser.add_argument(
907
+ "--weights",
908
+ nargs="+",
909
+ type=str,
910
+ default=ROOT / "yolov5s.pt",
911
+ help="model.pt path(s)",
912
+ )
913
+ parser.add_argument(
914
+ "--imgsz",
915
+ "--img",
916
+ "--img-size",
917
+ nargs="+",
918
+ type=int,
919
+ default=[640, 640],
920
+ help="image (h, w)",
921
+ )
922
+ parser.add_argument("--batch-size", type=int, default=1, help="batch size")
923
+ parser.add_argument(
924
+ "--device", default="cpu", help="cuda device, i.e. 0 or 0,1,2,3 or cpu"
925
+ )
926
+ parser.add_argument(
927
+ "--half", action="store_true", help="FP16 half-precision export"
928
+ )
929
+ parser.add_argument(
930
+ "--inplace",
931
+ action="store_true",
932
+ help="set YOLOv5 Detect() inplace=True",
933
+ )
934
+ parser.add_argument("--keras", action="store_true", help="TF: use Keras")
935
+ parser.add_argument(
936
+ "--optimize",
937
+ action="store_true",
938
+ help="TorchScript: optimize for mobile",
939
+ )
940
+ parser.add_argument(
941
+ "--int8", action="store_true", help="CoreML/TF INT8 quantization"
942
+ )
943
+ parser.add_argument(
944
+ "--dynamic", action="store_true", help="ONNX/TF/TensorRT: dynamic axes"
945
+ )
946
+ parser.add_argument(
947
+ "--simplify", action="store_true", help="ONNX: simplify model"
948
+ )
949
+ parser.add_argument(
950
+ "--opset", type=int, default=17, help="ONNX: opset version"
951
+ )
952
+ parser.add_argument(
953
+ "--verbose", action="store_true", help="TensorRT: verbose log"
954
+ )
955
+ parser.add_argument(
956
+ "--workspace",
957
+ type=int,
958
+ default=4,
959
+ help="TensorRT: workspace size (GB)",
960
+ )
961
+ parser.add_argument(
962
+ "--nms", action="store_true", help="TF: add NMS to model"
963
+ )
964
+ parser.add_argument(
965
+ "--agnostic-nms",
966
+ action="store_true",
967
+ help="TF: add agnostic NMS to model",
968
+ )
969
+ parser.add_argument(
970
+ "--topk-per-class",
971
+ type=int,
972
+ default=100,
973
+ help="TF.js NMS: topk per class to keep",
974
+ )
975
+ parser.add_argument(
976
+ "--topk-all",
977
+ type=int,
978
+ default=100,
979
+ help="TF.js NMS: topk for all classes to keep",
980
+ )
981
+ parser.add_argument(
982
+ "--iou-thres",
983
+ type=float,
984
+ default=0.45,
985
+ help="TF.js NMS: IoU threshold",
986
+ )
987
+ parser.add_argument(
988
+ "--conf-thres",
989
+ type=float,
990
+ default=0.25,
991
+ help="TF.js NMS: confidence threshold",
992
+ )
993
+ parser.add_argument(
994
+ "--include",
995
+ nargs="+",
996
+ default=["torchscript"],
997
+ help="torchscript, onnx, openvino, engine, coreml, saved_model, pb, tflite, edgetpu, tfjs, paddle",
998
+ )
999
+ opt = parser.parse_args()
1000
+ print_args(vars(opt))
1001
+ return opt
1002
+
1003
+
1004
+ def main(opt):
1005
+ for opt.weights in (
1006
+ opt.weights if isinstance(opt.weights, list) else [opt.weights]
1007
+ ):
1008
+ run(**vars(opt))
1009
+
1010
+
1011
+ if __name__ == "__main__":
1012
+ opt = parse_opt()
1013
+ main(opt)
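For orientation, the export pipeline defined above can also be driven programmatically instead of through parse_opt(). The snippet below is a minimal usage sketch, not part of the uploaded file: it assumes the module above is YOLOv5's export.py, importable as `export`, and that a yolov5s.pt checkpoint is available locally.

    # Hypothetical usage sketch; assumes this module is importable as `export`
    # and that yolov5s.pt exists in the working directory.
    import export

    exported = export.run(
        weights="yolov5s.pt",             # PyTorch weights to convert
        include=("torchscript", "onnx"),  # target export formats
        imgsz=(640, 640),                 # inference height, width
        device="cpu",
        half=False,
    )
    print(exported)  # list of exported file/dir paths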
hubconf.py ADDED
@@ -0,0 +1,309 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ """
3
+ PyTorch Hub models https://pytorch.org/hub/ultralytics_yolov5
4
+
5
+ Usage:
6
+ import torch
7
+ model = torch.hub.load('ultralytics/yolov5', 'yolov5s') # official model
8
+ model = torch.hub.load('ultralytics/yolov5:master', 'yolov5s') # from branch
9
+ model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s.pt') # custom/local model
10
+ model = torch.hub.load('.', 'custom', 'yolov5s.pt', source='local') # local repo
11
+ """
12
+
13
+ import torch
14
+
15
+
16
+ def _create(
17
+ name,
18
+ pretrained=True,
19
+ channels=3,
20
+ classes=80,
21
+ autoshape=True,
22
+ verbose=True,
23
+ device=None,
24
+ ):
25
+ """Creates or loads a YOLOv5 model
26
+
27
+ Arguments:
28
+ name (str): model name 'yolov5s' or path 'path/to/best.pt'
29
+ pretrained (bool): load pretrained weights into the model
30
+ channels (int): number of input channels
31
+ classes (int): number of model classes
32
+ autoshape (bool): apply YOLOv5 .autoshape() wrapper to model
33
+ verbose (bool): print all information to screen
34
+ device (str, torch.device, None): device to use for model parameters
35
+
36
+ Returns:
37
+ YOLOv5 model
38
+ """
39
+ from pathlib import Path
40
+
41
+ from models.common import AutoShape, DetectMultiBackend
42
+ from models.experimental import attempt_load
43
+ from models.yolo import ClassificationModel, DetectionModel, SegmentationModel
44
+ from utils.downloads import attempt_download
45
+ from utils.general import LOGGER, check_requirements, intersect_dicts, logging
46
+ from utils.torch_utils import select_device
47
+
48
+ if not verbose:
49
+ LOGGER.setLevel(logging.WARNING)
50
+ check_requirements(exclude=("opencv-python", "tensorboard", "thop"))
51
+ name = Path(name)
52
+ path = (
53
+ name.with_suffix(".pt")
54
+ if name.suffix == "" and not name.is_dir()
55
+ else name
56
+ ) # checkpoint path
57
+ try:
58
+ device = select_device(device)
59
+ if pretrained and channels == 3 and classes == 80:
60
+ try:
61
+ model = DetectMultiBackend(
62
+ path, device=device, fuse=autoshape
63
+ ) # detection model
64
+ if autoshape:
65
+ if model.pt and isinstance(
66
+ model.model, ClassificationModel
67
+ ):
68
+ LOGGER.warning(
69
+ "WARNING ⚠️ YOLOv5 ClassificationModel is not yet AutoShape compatible. "
70
+ "You must pass torch tensors in BCHW to this model, i.e. shape(1,3,224,224)."
71
+ )
72
+ elif model.pt and isinstance(
73
+ model.model, SegmentationModel
74
+ ):
75
+ LOGGER.warning(
76
+ "WARNING ⚠️ YOLOv5 SegmentationModel is not yet AutoShape compatible. "
77
+ "You will not be able to run inference with this model."
78
+ )
79
+ else:
80
+ model = AutoShape(
81
+ model
82
+ ) # for file/URI/PIL/cv2/np inputs and NMS
83
+ except Exception:
84
+ model = attempt_load(
85
+ path, device=device, fuse=False
86
+ ) # arbitrary model
87
+ else:
88
+ cfg = list(
89
+ (Path(__file__).parent / "models").rglob(f"{path.stem}.yaml")
90
+ )[
91
+ 0
92
+ ] # model.yaml path
93
+ model = DetectionModel(cfg, channels, classes) # create model
94
+ if pretrained:
95
+ ckpt = torch.load(
96
+ attempt_download(path), map_location=device
97
+ ) # load
98
+ csd = (
99
+ ckpt["model"].float().state_dict()
100
+ ) # checkpoint state_dict as FP32
101
+ csd = intersect_dicts(
102
+ csd, model.state_dict(), exclude=["anchors"]
103
+ ) # intersect
104
+ model.load_state_dict(csd, strict=False) # load
105
+ if len(ckpt["model"].names) == classes:
106
+ model.names = ckpt[
107
+ "model"
108
+ ].names # set class names attribute
109
+ if not verbose:
110
+ LOGGER.setLevel(logging.INFO) # reset to default
111
+ return model.to(device)
112
+
113
+ except Exception as e:
114
+ help_url = "https://github.com/ultralytics/yolov5/issues/36"
115
+ s = f"{e}. Cache may be out of date, try `force_reload=True` or see {help_url} for help."
116
+ raise Exception(s) from e
117
+
118
+
119
+ def custom(
120
+ path="path/to/model.pt", autoshape=True, _verbose=True, device=None
121
+ ):
122
+ # YOLOv5 custom or local model
123
+ return _create(path, autoshape=autoshape, verbose=_verbose, device=device)
124
+
125
+
126
+ def yolov5n(
127
+ pretrained=True,
128
+ channels=3,
129
+ classes=80,
130
+ autoshape=True,
131
+ _verbose=True,
132
+ device=None,
133
+ ):
134
+ # YOLOv5-nano model https://github.com/ultralytics/yolov5
135
+ return _create(
136
+ "yolov5n", pretrained, channels, classes, autoshape, _verbose, device
137
+ )
138
+
139
+
140
+ def yolov5s(
141
+ pretrained=True,
142
+ channels=3,
143
+ classes=80,
144
+ autoshape=True,
145
+ _verbose=True,
146
+ device=None,
147
+ ):
148
+ # YOLOv5-small model https://github.com/ultralytics/yolov5
149
+ return _create(
150
+ "yolov5s", pretrained, channels, classes, autoshape, _verbose, device
151
+ )
152
+
153
+
154
+ def yolov5m(
155
+ pretrained=True,
156
+ channels=3,
157
+ classes=80,
158
+ autoshape=True,
159
+ _verbose=True,
160
+ device=None,
161
+ ):
162
+ # YOLOv5-medium model https://github.com/ultralytics/yolov5
163
+ return _create(
164
+ "yolov5m", pretrained, channels, classes, autoshape, _verbose, device
165
+ )
166
+
167
+
168
+ def yolov5l(
169
+ pretrained=True,
170
+ channels=3,
171
+ classes=80,
172
+ autoshape=True,
173
+ _verbose=True,
174
+ device=None,
175
+ ):
176
+ # YOLOv5-large model https://github.com/ultralytics/yolov5
177
+ return _create(
178
+ "yolov5l", pretrained, channels, classes, autoshape, _verbose, device
179
+ )
180
+
181
+
182
+ def yolov5x(
183
+ pretrained=True,
184
+ channels=3,
185
+ classes=80,
186
+ autoshape=True,
187
+ _verbose=True,
188
+ device=None,
189
+ ):
190
+ # YOLOv5-xlarge model https://github.com/ultralytics/yolov5
191
+ return _create(
192
+ "yolov5x", pretrained, channels, classes, autoshape, _verbose, device
193
+ )
194
+
195
+
196
+ def yolov5n6(
197
+ pretrained=True,
198
+ channels=3,
199
+ classes=80,
200
+ autoshape=True,
201
+ _verbose=True,
202
+ device=None,
203
+ ):
204
+ # YOLOv5-nano-P6 model https://github.com/ultralytics/yolov5
205
+ return _create(
206
+ "yolov5n6", pretrained, channels, classes, autoshape, _verbose, device
207
+ )
208
+
209
+
210
+ def yolov5s6(
211
+ pretrained=True,
212
+ channels=3,
213
+ classes=80,
214
+ autoshape=True,
215
+ _verbose=True,
216
+ device=None,
217
+ ):
218
+ # YOLOv5-small-P6 model https://github.com/ultralytics/yolov5
219
+ return _create(
220
+ "yolov5s6", pretrained, channels, classes, autoshape, _verbose, device
221
+ )
222
+
223
+
224
+ def yolov5m6(
225
+ pretrained=True,
226
+ channels=3,
227
+ classes=80,
228
+ autoshape=True,
229
+ _verbose=True,
230
+ device=None,
231
+ ):
232
+ # YOLOv5-medium-P6 model https://github.com/ultralytics/yolov5
233
+ return _create(
234
+ "yolov5m6", pretrained, channels, classes, autoshape, _verbose, device
235
+ )
236
+
237
+
238
+ def yolov5l6(
239
+ pretrained=True,
240
+ channels=3,
241
+ classes=80,
242
+ autoshape=True,
243
+ _verbose=True,
244
+ device=None,
245
+ ):
246
+ # YOLOv5-large-P6 model https://github.com/ultralytics/yolov5
247
+ return _create(
248
+ "yolov5l6", pretrained, channels, classes, autoshape, _verbose, device
249
+ )
250
+
251
+
252
+ def yolov5x6(
253
+ pretrained=True,
254
+ channels=3,
255
+ classes=80,
256
+ autoshape=True,
257
+ _verbose=True,
258
+ device=None,
259
+ ):
260
+ # YOLOv5-xlarge-P6 model https://github.com/ultralytics/yolov5
261
+ return _create(
262
+ "yolov5x6", pretrained, channels, classes, autoshape, _verbose, device
263
+ )
264
+
265
+
266
+ if __name__ == "__main__":
267
+ import argparse
268
+ from pathlib import Path
269
+
270
+ import numpy as np
271
+ from PIL import Image
272
+
273
+ from utils.general import cv2, print_args
274
+
275
+ # Argparser
276
+ parser = argparse.ArgumentParser()
277
+ parser.add_argument(
278
+ "--model", type=str, default="yolov5s", help="model name"
279
+ )
280
+ opt = parser.parse_args()
281
+ print_args(vars(opt))
282
+
283
+ # Model
284
+ model = _create(
285
+ name=opt.model,
286
+ pretrained=True,
287
+ channels=3,
288
+ classes=80,
289
+ autoshape=True,
290
+ verbose=True,
291
+ )
292
+ # model = custom(path='path/to/model.pt') # custom
293
+
294
+ # Images
295
+ imgs = [
296
+ "data/images/zidane.jpg", # filename
297
+ Path("data/images/zidane.jpg"), # Path
298
+ "https://ultralytics.com/images/zidane.jpg", # URI
299
+ cv2.imread("data/images/bus.jpg")[:, :, ::-1], # OpenCV
300
+ Image.open("data/images/bus.jpg"), # PIL
301
+ np.zeros((320, 640, 3)),
302
+ ] # numpy
303
+
304
+ # Inference
305
+ results = model(imgs, size=320) # batched inference
306
+
307
+ # Results
308
+ results.print()
309
+ results.save()
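The __main__ block above exercises batched AutoShape inference and prints/saves the results; the same Detections object can also be consumed as tabular data. A minimal sketch, assuming the standard YOLOv5 Detections API with a .pandas() accessor (not shown in this file):

    # Hypothetical usage sketch of the hub interface defined above.
    import torch

    model = torch.hub.load("ultralytics/yolov5", "yolov5s")       # official model
    results = model("https://ultralytics.com/images/zidane.jpg")  # single image
    df = results.pandas().xyxy[0]                                 # detections as a DataFrame
    print(df[["name", "confidence", "xmin", "ymin", "xmax", "ymax"]])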
inference.py ADDED
@@ -0,0 +1,226 @@
1
+ # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
2
+ """
3
+ Run YOLOv5 detection inference on images, videos, directories, globs, YouTube, webcam, streams, etc.
4
+
5
+ Usage - sources:
6
+ $ python detect.py --weights yolov5s.pt --source 0 # webcam
7
+ img.jpg # image
8
+ vid.mp4 # video
9
+ screen # screenshot
10
+ path/ # directory
11
+ list.txt # list of images
12
+ list.streams # list of streams
13
+ 'path/*.jpg' # glob
14
+ 'https://youtu.be/Zgi9g1ksQHc' # YouTube
15
+ 'rtsp://example.com/media.mp4' # RTSP, RTMP, HTTP stream
16
+
17
+ Usage - formats:
18
+ $ python detect.py --weights yolov5s.pt # PyTorch
19
+ yolov5s.torchscript # TorchScript
20
+ yolov5s.onnx # ONNX Runtime or OpenCV DNN with --dnn
21
+ yolov5s_openvino_model # OpenVINO
22
+ yolov5s.engine # TensorRT
23
+ yolov5s.mlmodel # CoreML (macOS-only)
24
+ yolov5s_saved_model # TensorFlow SavedModel
25
+ yolov5s.pb # TensorFlow GraphDef
26
+ yolov5s.tflite # TensorFlow Lite
27
+ yolov5s_edgetpu.tflite # TensorFlow Edge TPU
28
+ yolov5s_paddle_model # PaddlePaddle
29
+ """
30
+
31
+ import argparse
32
+ import os
33
+ import platform
34
+ import sys
35
+ from pathlib import Path
36
+
37
+ import torch
38
+
39
+ FILE = Path(__file__).resolve()
40
+ ROOT = FILE.parents[0] # YOLOv5 root directory
41
+ if str(ROOT) not in sys.path:
42
+ sys.path.append(str(ROOT)) # add ROOT to PATH
43
+ ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
44
+
45
+ from models.common import DetectMultiBackend
46
+ from utils.dataloaders import (
47
+ IMG_FORMATS,
48
+ VID_FORMATS,
49
+ LoadImages,
50
+ LoadScreenshots,
51
+ LoadStreams,
52
+ )
53
+ from utils.general import (
54
+ LOGGER,
55
+ Profile,
56
+ check_file,
57
+ check_img_size,
58
+ check_imshow,
59
+ check_requirements,
60
+ colorstr,
61
+ cv2,
62
+ increment_path,
63
+ non_max_suppression,
64
+ print_args,
65
+ scale_boxes,
66
+ strip_optimizer,
67
+ xyxy2xywh,
68
+ )
69
+ from utils.plots import Annotator, colors, save_one_box
70
+ from utils.torch_utils import select_device, smart_inference_mode
71
+
72
+
73
+ @smart_inference_mode()
74
+ def run(
75
+ weights=ROOT / "yolov5s.pt", # model path or triton URL
76
+ source=ROOT / "data/images", # file/dir/URL/glob/screen/0(webcam)
77
+ data=ROOT / "data/coco128.yaml", # dataset.yaml path
78
+ imgsz=(640, 640), # inference size (height, width)
79
+ conf_thres=0.25, # confidence threshold
80
+ iou_thres=0.45, # NMS IOU threshold
81
+ max_det=1000, # maximum detections per image
82
+ device="", # cuda device, i.e. 0 or 0,1,2,3 or cpu
83
+ view_img=False, # show results
84
+ save_txt=False, # save results to *.txt
85
+ save_conf=False, # save confidences in --save-txt labels
86
+ save_crop=False, # save cropped prediction boxes
87
+ nosave=False, # do not save images/videos
88
+ classes=None, # filter by class: --class 0, or --class 0 2 3
89
+ agnostic_nms=False, # class-agnostic NMS
90
+ augment=False, # augmented inference
91
+ visualize=False, # visualize features
92
+ update=False, # update all models
93
+ project=ROOT / "runs/detect", # save results to project/name
94
+ name="exp", # save results to project/name
95
+ exist_ok=False, # existing project/name ok, do not increment
96
+ line_thickness=3, # bounding box thickness (pixels)
97
+ hide_labels=False, # hide labels
98
+ hide_conf=False, # hide confidences
99
+ half=False, # use FP16 half-precision inference
100
+ dnn=False, # use OpenCV DNN for ONNX inference
101
+ vid_stride=1, # video frame-rate stride
102
+ ):
103
+ source = str(source)
104
+ save_img = not nosave and not source.endswith(
105
+ ".txt"
106
+ ) # save inference images
107
+ is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
108
+ is_url = source.lower().startswith(
109
+ ("rtsp://", "rtmp://", "http://", "https://")
110
+ )
111
+ webcam = (
112
+ source.isnumeric()
113
+ or source.endswith(".streams")
114
+ or (is_url and not is_file)
115
+ )
116
+ screenshot = source.lower().startswith("screen")
117
+ if is_url and is_file:
118
+ source = check_file(source) # download
119
+
120
+ # Directories
121
+ save_dir = increment_path(
122
+ Path(project) / name, exist_ok=exist_ok
123
+ ) # increment run
124
+ (save_dir / "labels" if save_txt else save_dir).mkdir(
125
+ parents=True, exist_ok=True
126
+ ) # make dir
127
+
128
+ # Load model
129
+ device = select_device(device)
130
+ model = DetectMultiBackend(
131
+ weights, device=device, dnn=dnn, data=data, fp16=half
132
+ )
133
+ stride, names, pt = model.stride, model.names, model.pt
134
+ imgsz = check_img_size(imgsz, s=stride) # check image size
135
+
136
+ # Dataloader
137
+ bs = 1 # batch_size
138
+ if webcam:
139
+ view_img = check_imshow(warn=True)
140
+ dataset = LoadStreams(
141
+ source,
142
+ img_size=imgsz,
143
+ stride=stride,
144
+ auto=pt,
145
+ vid_stride=vid_stride,
146
+ )
147
+ bs = len(dataset)
148
+ elif screenshot:
149
+ dataset = LoadScreenshots(
150
+ source, img_size=imgsz, stride=stride, auto=pt
151
+ )
152
+ else:
153
+ dataset = LoadImages(
154
+ source,
155
+ img_size=imgsz,
156
+ stride=stride,
157
+ auto=pt,
158
+ vid_stride=vid_stride,
159
+ )
160
+ vid_path, vid_writer = [None] * bs, [None] * bs
161
+
162
+ # Run inference
163
+ model.warmup(imgsz=(1 if pt or model.triton else bs, 3, *imgsz)) # warmup
164
+ seen, windows, dt = 0, [], (Profile(), Profile(), Profile())
165
+ for path, im, im0s, vid_cap, s in dataset:
166
+ with dt[0]:
167
+ im = torch.from_numpy(im).to(model.device)
168
+ im = im.half() if model.fp16 else im.float() # uint8 to fp16/32
169
+ im /= 255 # 0 - 255 to 0.0 - 1.0
170
+ if len(im.shape) == 3:
171
+ im = im[None] # expand for batch dim
172
+
173
+ # Inference
174
+ with dt[1]:
175
+ visualize = (
176
+ increment_path(save_dir / Path(path).stem, mkdir=True)
177
+ if visualize
178
+ else False
179
+ )
180
+ pred = model(im, augment=augment, visualize=visualize)
181
+
182
+ # NMS
183
+ with dt[2]:
184
+ pred = non_max_suppression(
185
+ pred,
186
+ conf_thres,
187
+ iou_thres,
188
+ classes,
189
+ agnostic_nms,
190
+ max_det=max_det,
191
+ )
192
+
193
+ # Second-stage classifier (optional)
194
+ # pred = utils.general.apply_classifier(pred, classifier_model, im, im0s)
195
+
196
+ # Process predictions
197
+ for i, det in enumerate(pred): # per image
198
+ seen += 1
199
+ if webcam: # batch_size >= 1
200
+ p, im0, frame = path[i], im0s[i].copy(), dataset.count
201
+ s += f"{i}: "
202
+ else:
203
+ p, im0, frame = path, im0s.copy(), getattr(dataset, "frame", 0)
204
+
205
+ p = Path(p) # to Path
206
+ save_path = str(save_dir / p.name) # im.jpg
207
+ txt_path = str(save_dir / "labels" / p.stem) + (
208
+ "" if dataset.mode == "image" else f"_{frame}"
209
+ ) # im.txt
210
+ s += "%gx%g " % im.shape[2:] # print string
211
+ gn = torch.tensor(im0.shape)[
212
+ [1, 0, 1, 0]
213
+ ] # normalization gain whwh
214
+ imc = im0.copy() if save_crop else im0 # for save_crop
215
+ annotator = Annotator(
216
+ im0, line_width=line_thickness, example=str(names)
217
+ )
218
+ results = []
219
+ if len(det):
220
+ # Rescale boxes from img_size to im0 size
221
+ det[:, :4] = scale_boxes(
222
+ im.shape[2:], det[:, :4], im0.shape
223
+ ).round()
224
+ results.append((path, det))
225
+
226
+ return results
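Note that run() here stops after NMS and box rescaling and returns the raw detections rather than saving annotated media. A minimal calling sketch, assuming this file is importable as `inference` and that the weight and source paths exist:

    # Hypothetical usage sketch; paths are placeholders.
    from inference import run

    detections = run(
        weights="yolov5s.pt",
        source="data/images",   # file, directory, URL, glob, screen, or webcam index
        imgsz=(640, 640),
        conf_thres=0.25,
        iou_thres=0.45,
        device="cpu",
    )
    for path, det in detections:
        # each det row is [x1, y1, x2, y2, confidence, class] in original-image pixels
        print(path, det.shape)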
models/__init__.py ADDED
File without changes
models/__pycache__/__init__.cpython-310.pyc ADDED
Binary file (131 Bytes)

models/__pycache__/__init__.cpython-37.pyc ADDED
Binary file (129 Bytes)

models/__pycache__/__init__.cpython-38.pyc ADDED
Binary file (127 Bytes)

models/__pycache__/__init__.cpython-39.pyc ADDED
Binary file (133 Bytes)

models/__pycache__/common.cpython-310.pyc ADDED
Binary file (36.9 kB)