How to Modify YOLO
To facilitate easy customization of the YOLO model, we've structured the codebase to allow for changes through configuration files and minimal code adjustments. This guide will walk you through the steps to customize various components of the model including the architecture, blocks, data loaders, and loss functions.
Custom Model Architecture
You can change the model architecture simply by modifying the YAML configuration file. Here's how:
Modify Architecture in Config:
- Navigate to your model's configuration file (typically named like yolo/config/model/v9-c.yaml).
- Adjust the architecture settings under the architecture section. Ensure that every module you reference exists in module.py, or refer to the next section on how to add new modules.

    model:
      foo:
        - ADown:
            args: {out_channels: 256}
        - RepNCSPELAN:
            source: -2
            args: {out_channels: 512, part_channels: 256}
            tags: B4
      bar:
        - Concat:
            source: [-2, B4]

Each module entry accepts the following keys:
- tags: Use this to label any module you want; a tag can then be used as a module source.
- source: Set this to the index of the module output you wish to use as input; the default is -1, which refers to the last module's output. It accepts tags, relative positions, and absolute positions.
- args: A dictionary used to initialize parameters for convolutional or bottleneck layers.
- output: Whether the module serves as an output of the model.
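To make the three kinds of source values concrete, here is a minimal Python sketch of how they might be resolved to absolute layer indices. It is not the codebase's actual implementation: the helper name resolve_source, the 0-based layer numbering, and the tags dictionary are assumptions made for illustration only.

```python
from typing import Dict, List, Union

Source = Union[int, str, List[Union[int, str]]]


def resolve_source(source: Source, layer_idx: int, tags: Dict[str, int]):
    """Map a `source` entry to absolute layer indices (hypothetical helper)."""
    if isinstance(source, list):
        # A list means the layer consumes several inputs, e.g. Concat.
        return [resolve_source(item, layer_idx, tags) for item in source]
    if isinstance(source, str):
        # A string is looked up among the tags declared on earlier layers.
        return tags[source]
    if source < 0:
        # Negative values count backwards from the current layer.
        return layer_idx + source
    # Non-negative values are treated as absolute positions (an assumption).
    return source


# Layers from the YAML example, numbered 0, 1, 2 in this sketch:
# ADown -> 0, RepNCSPELAN (tagged B4) -> 1, Concat -> 2.
tags = {"B4": 1}
concat_inputs = resolve_source([-2, "B4"], layer_idx=2, tags=tags)
print(concat_inputs)
```

Under these assumptions the Concat layer reads from layer 0 (two positions back) and from layer 1 (the layer tagged B4), while the default of -1 would resolve to the layer immediately before it.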
Custom Block
To add or modify a block in the model:
Create a New Module:
Define a new class in module.py that inherits from nn.Module.
The constructor should accept in_channels as a parameter. Make sure to calculate out_channels based on your model's requirements or configure it through the YAML file using args.

    from torch import nn

    class CustomBlock(nn.Module):
        def __init__(self, in_channels, out_channels, **kwargs):
            super().__init__()
            # Placeholder layer: swap in any conv, pooling, ... module.
            self.module = nn.Conv2d(in_channels, out_channels, kernel_size=1)

        def forward(self, x):
            return self.module(x)

Reference in Config:

    ...
    - CustomBlock:
        args: {out_channels: int, etc: ...}
    ...
    ...
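The split between in_channels (supplied by the preceding layer) and args (supplied by the YAML entry) can be sketched as follows. This is an illustrative sketch, not the project's builder: LAYER_REGISTRY and build_layer are hypothetical names, and CustomBlock is a plain stand-in class so that the example runs without PyTorch.

```python
from typing import Any, Dict


class CustomBlock:
    """Stand-in for an nn.Module subclass; it only records its arguments."""

    def __init__(self, in_channels: int, out_channels: int, **kwargs: Any):
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.extra = kwargs


# Hypothetical registry: maps the name used in YAML to the class.
LAYER_REGISTRY = {"CustomBlock": CustomBlock}


def build_layer(spec: Dict[str, Dict[str, Any]], in_channels: int):
    """Instantiate one layer from a single-key config entry."""
    (name, options), = spec.items()
    if name not in LAYER_REGISTRY:
        raise KeyError(f"{name} is not defined; add it to the registry first")
    args = (options or {}).get("args", {})
    # in_channels comes from the previous layer, everything else from YAML.
    return LAYER_REGISTRY[name](in_channels, **args)


# Equivalent of the YAML entry `- CustomBlock: {args: {out_channels: 128}}`.
layer = build_layer({"CustomBlock": {"args": {"out_channels": 128}}}, in_channels=64)
print(layer.in_channels, layer.out_channels)
```

Keeping in_channels out of the YAML file means a block can be moved within the architecture without editing its arguments, which is presumably why the constructor is asked to accept it as a parameter.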
Custom Data Augmentation
Custom transformations should be designed to accept an image and its bounding boxes, and return them after applying the desired changes. Here's how you can define such a transformation:
Define Dataset:
Your class must have a __call__ method that takes a PIL image and its corresponding bounding boxes as input, and returns them after processing.

    class CustomTransform:
        def __init__(self, prob=0.5):
            self.prob = prob

        def __call__(self, image, boxes):
            return image, boxes

Update CustomTransform in Config:
Specify your custom transformation in the YAML config yolo/config/data/augment.yaml. For example:

    Mosaic: 1
    # ... (Other Transform)
    CustomTransform: 0.5
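As a slightly fuller illustration of the __call__ contract, the sketch below flips a sample left to right with probability prob. It rests on several assumptions that are not taken from the codebase: the class name RandomHorizontalFlip is hypothetical, the image is represented as a list of pixel rows to keep the example dependency-free, and boxes are assumed to be (x_min, y_min, x_max, y_max) tuples normalized to [0, 1].

```python
import random


class RandomHorizontalFlip:
    """Flip image and boxes left-right with probability `prob` (illustrative)."""

    def __init__(self, prob=0.5):
        self.prob = prob

    def __call__(self, image, boxes):
        if random.random() >= self.prob:
            return image, boxes  # leave the sample untouched
        # Assumption: image is a list of pixel rows. With a PIL image this
        # line would be image.transpose(Image.FLIP_LEFT_RIGHT).
        flipped_image = [row[::-1] for row in image]
        # Assumption: boxes are (x_min, y_min, x_max, y_max) in [0, 1].
        flipped_boxes = [(1 - x2, y1, 1 - x1, y2) for x1, y1, x2, y2 in boxes]
        return flipped_image, flipped_boxes


transform = RandomHorizontalFlip(prob=1.0)  # prob=1.0 always flips
image, boxes = transform([[1, 2, 3]], [(0.0, 0.25, 0.5, 0.75)])
print(image, boxes)
```

The important point is that the image and the boxes are changed together; a transform that moves pixels but leaves the boxes alone would silently corrupt the labels. If the value in augment.yaml is indeed passed as prob, this sketch would be enabled with an entry such as RandomHorizontalFlip: 0.5.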
- Utils
  - bbox_utils
    - class Anchor2Box: transforms predicted anchors into bounding boxes
    - class Matcher: given predictions and ground truth, finds the ground-truth box for each prediction
    - func calculate_iou: calculates the IoU between two lists of bounding boxes
    - func transform_bbox: converts bounding boxes between the xywh, xyxy, and xcycwh formats
    - func generate_anchors: given an image size, generates the anchor points for that size
  - dataset_utils
    - func locate_label_paths
    - func create_image_metadata
    - func organize_annotations_by_image
    - func scale_segmentation
  - logging_utils
    - func custom_log: a custom loguru logger that overrides the original logger
    - class ProgressTracker: handles the output for each batch and epoch
    - func log_model_structure: given a torch model, prints it as a table
    - func validate_log_directory: for a given experiment, checks whether the log folder already exists
  - model_utils
    - class ExponentialMovingAverage: keeps a mirror of the model and applies EMA to it
    - func create_optimizer: returns an optimizer, for example SGD or Adam
    - func create_scheduler: returns a scheduler, for example Step or Lambda
  - module_utils
    - func get_layer_map
    - func auto_pad: given a convolution block, returns how many pixels of padding the convolution should use
    - func create_activation_function: given a function name, returns an activation function
    - func round_up: given a number and a divisor, returns a number that is a multiple of the divisor
    - func divide_into_chunks: given a list and n, separates the list into n sublists
  - trainer
    - class Trainer: a class that trains the model automatically
- Tools
  - converter_json2txt
    - func discretize_categories: given the class dictionary, remaps the class ids
    - func process_annotations: handles the annotations of the whole dataset
    - func process_annotation: handles a single annotation (a list of bounding boxes)
    - func normalize_segmentation: normalizes segmentation positions to the range 0~1
    - func convert_annotations: converts JSON annotations to the txt file structure
  - data_augment
    - class AugmentationComposer: composes a list of data augmentation strategies
    - class VerticalFlip: a custom data augmentation, random vertical flip
    - class Mosaic: a data augmentation strategy, following YOLOv5
  - dataloader
    - class YoloDataset: a custom dataset for training YOLO models
    - class YoloDataLoader: a dataloader based on torch's DataLoader, with a custom collate function
    - func create_dataloader: given a config file, returns a YOLO dataloader
  - drawer
    - func draw_bboxes: given an image and a list of bounding boxes, draws the boxes on the image
    - func draw_model: visualizes the given model
  - get_dataset
    - func download_file: downloads the file at a given link
    - func unzip_file: unzips the downloaded zip into data/
    - func check_files: checks whether the number of dataset files is correct
    - func prepare_dataset: automatically downloads the dataset and checks that it is correct
  - loss
    - class BoxLoss: a custom loss for bounding boxes
    - class YOLOLoss: an implementation of the YOLOv9 loss
    - class DualLoss: an implementation of the YOLOv9 loss with an auxiliary detection head
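For readers unfamiliar with the box formats named under transform_bbox and calculate_iou, the following pure-Python sketch shows what the conversions and the IoU computation amount to for a single pair of boxes. The function names here are hypothetical and the sketch works on plain tuples; the project's own helpers likely operate on batched tensors and may differ in signature.

```python
def xywh_to_xyxy(box):
    """(x_min, y_min, width, height) -> (x_min, y_min, x_max, y_max)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def xcycwh_to_xyxy(box):
    """(x_center, y_center, width, height) -> corner format."""
    xc, yc, w, h = box
    return (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


def iou_xyxy(a, b):
    """Intersection over union of two corner-format boxes."""
    # Overlap along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


box_a = xywh_to_xyxy((0, 0, 2, 2))    # corners (0, 0) and (2, 2)
box_b = xcycwh_to_xyxy((2, 2, 2, 2))  # corners (1, 1) and (3, 3)
print(iou_xyxy(box_a, box_b))         # intersection 1, union 7
```

Converting everything to the corner format first keeps the IoU code to a single case, which is one common reason for having a dedicated conversion helper.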