Yolo V5 Tutorial - Training a Custom Object Detection Model

1. How to set up your environment to train a Yolo V5 object detection model?

In [1]:
!git clone https://github.com/ultralytics/yolov5  # clone repo
!pip install -r yolov5/requirements.txt  # install dependencies
%cd yolov5
In [2]:
import torch
print('Setup complete. Using torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))
Setup complete. Using torch 1.6.0 CPU
In [4]:
from IPython.display import Image
import pandas as pd
import random
import os
from shutil import copyfile

2. How to set up the data and directories for training a Yolo V5 object detection model?

In [8]:
train = pd.read_csv('train.csv')
train
Out[8]:
image_id width height bbox source
0 b6ab77fd7 1024 1024 [834.0, 222.0, 56.0, 36.0] usask_1
1 b6ab77fd7 1024 1024 [226.0, 548.0, 130.0, 58.0] usask_1
2 b6ab77fd7 1024 1024 [377.0, 504.0, 74.0, 160.0] usask_1
3 b6ab77fd7 1024 1024 [834.0, 95.0, 109.0, 107.0] usask_1
4 b6ab77fd7 1024 1024 [26.0, 144.0, 124.0, 117.0] usask_1
... ... ... ... ... ...
147788 5e0747034 1024 1024 [64.0, 619.0, 84.0, 95.0] arvalis_2
147789 5e0747034 1024 1024 [292.0, 549.0, 107.0, 82.0] arvalis_2
147790 5e0747034 1024 1024 [134.0, 228.0, 141.0, 71.0] arvalis_2
147791 5e0747034 1024 1024 [430.0, 13.0, 184.0, 79.0] arvalis_2
147792 5e0747034 1024 1024 [875.0, 740.0, 94.0, 61.0] arvalis_2

147793 rows × 5 columns
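Note that the `bbox` column above is stored as a string, not a list. A minimal sketch of parsing it safely with `ast.literal_eval`, which is equivalent to the strip/split approach used in the copy loop below (the `parse_bbox` helper name is mine, for illustration only):

```python
import ast

def parse_bbox(bbox_str):
    # the CSV stores each box as a string such as "[834.0, 222.0, 56.0, 36.0]";
    # ast.literal_eval safely evaluates it back into a list of floats
    x, y, w, h = ast.literal_eval(bbox_str)
    return float(x), float(y), float(w), float(h)

print(parse_bbox('[834.0, 222.0, 56.0, 36.0]'))  # (834.0, 222.0, 56.0, 36.0)
```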

Create an images and a labels directory, each containing train and valid subfolders

In [10]:
## Make directory structure

os.mkdir('/working/data')

os.mkdir('/working/data/images')
os.mkdir('/working/data/labels')

os.mkdir('/working/data/images/train')
os.mkdir('/working/data/images/valid')

os.mkdir('/working/data/labels/train')
os.mkdir('/working/data/labels/valid')
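As a side note, the `os.mkdir` calls above raise `FileExistsError` if the cell is re-run. A hedged alternative sketch using `os.makedirs` with `exist_ok=True` (the `make_dirs` helper name is mine, not part of YOLOv5):

```python
import os

def make_dirs(base='/working/data'):
    # os.makedirs creates the whole tree in one call, and exist_ok=True
    # makes the cell safe to re-run (os.mkdir would raise FileExistsError)
    for kind in ('images', 'labels'):
        for split in ('train', 'valid'):
            os.makedirs(os.path.join(base, kind, split), exist_ok=True)
```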
In [12]:
# copy images and labels into the training layout, holding out ~30% of images for validation
# (this split could also be done by hand)

for image_id in list(set(train.image_id)):
    image_data = train[train['image_id'] == image_id]
    image_bboxes = image_data['bbox']
    image_bboxes = image_bboxes.apply(lambda x: x.strip('[').strip(']').split(', '))
    
    if random.random() > 0.3:
        path = 'train'
    else:
        path = 'valid'
    
    with open('/working/data/labels/{}/'.format(path) + image_id + '.txt', 'w+') as file:
        for bbox in image_bboxes:
            xc, yc, w, h = bbox
            
            x_center_n = (float(xc) + float(w) / 2) / 1024.
            y_center_n = (float(yc) + float(h) / 2) / 1024.
            width_n = float(w) / 1024.
            height_n = float(h) / 1024.
            line = ' '.join(('0', str(x_center_n), str(y_center_n), str(width_n), str(height_n))) + '\n'
            file.write(line)
                        
    copyfile('/input/global-wheat-detection/train/' + image_id + '.jpg', '/working/data/images/{}/'.format(path) +  image_id + '.jpg')
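The conversion inside the loop turns a `[x_min, y_min, width, height]` pixel box into YOLO's normalised `[x_center, y_center, width, height]` label format. As a standalone sketch (the `to_yolo` helper name is mine), using the first bbox of image `b6ab77fd7` from train.csv:

```python
def to_yolo(x_min, y_min, w, h, img_size=1024.0):
    # YOLO labels use the box centre and size, each normalised to [0, 1]
    # by the image dimension (1024 px for this dataset)
    x_c = (x_min + w / 2) / img_size
    y_c = (y_min + h / 2) / img_size
    return x_c, y_c, w / img_size, h / img_size

print(to_yolo(834.0, 222.0, 56.0, 36.0))
# (0.841796875, 0.234375, 0.0546875, 0.03515625)
```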
    

3. How to configure the YAML files for training a Yolo V5 object detection model?

The first one is easy: it is simply a copy of the YOLOv5s (small) model config, but with nc = 1, because we have only one class.

In [6]:
with open('/working/new_train_yaml', 'w+') as file:
    file.write(
        """
        # parameters
        nc: 1  # number of classes
        depth_multiple: 0.33  # model depth multiple
        width_multiple: 0.50  # layer channel multiple

        # anchors
        anchors:
          - [10,13, 16,30, 33,23]  # P3/8
          - [30,61, 62,45, 59,119]  # P4/16
          - [116,90, 156,198, 373,326]  # P5/32

        # YOLOv5 backbone
        backbone:
          # [from, number, module, args]
          [[-1, 1, Focus, [64, 3]],  # 0-P1/2
           [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
           [-1, 3, BottleneckCSP, [128]],
           [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
           [-1, 9, BottleneckCSP, [256]],
           [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
           [-1, 9, BottleneckCSP, [512]],
           [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
           [-1, 1, SPP, [1024, [5, 9, 13]]],
           [-1, 3, BottleneckCSP, [1024, False]],  # 9
          ]

        # YOLOv5 head
        head:
          [[-1, 1, Conv, [512, 1, 1]],
           [-1, 1, nn.Upsample, [None, 2, 'nearest']],
           [[-1, 6], 1, Concat, [1]],  # cat backbone P4
           [-1, 3, BottleneckCSP, [512, False]],  # 13

           [-1, 1, Conv, [256, 1, 1]],
           [-1, 1, nn.Upsample, [None, 2, 'nearest']],
           [[-1, 4], 1, Concat, [1]],  # cat backbone P3
           [-1, 3, BottleneckCSP, [256, False]],  # 17 (P3/8-small)

           [-1, 1, Conv, [256, 3, 2]],
           [[-1, 14], 1, Concat, [1]],  # cat head P4
           [-1, 3, BottleneckCSP, [512, False]],  # 20 (P4/16-medium)

           [-1, 1, Conv, [512, 3, 2]],
           [[-1, 10], 1, Concat, [1]],  # cat head P5
           [-1, 3, BottleneckCSP, [1024, False]],  # 23 (P5/32-large)

           [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
          ]
        """
    )

The second one is also easy, but the train and val paths must match the directory structure created above.

In [7]:
with open('/working/new_data_yaml', 'w+') as file:
    file.write(
        """
        train: /working/data/images/train
        val: /working/data/images/valid

        nc: 1
        names: ['wheat']
        """
    )
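A quick sanity check, assuming PyYAML is installed (it is pulled in by the yolov5 requirements): load the data config back and verify that `nc` matches the number of class names.

```python
import yaml

# parse the same content we wrote to the data YAML file
data_cfg = yaml.safe_load("""
train: /working/data/images/train
val: /working/data/images/valid

nc: 1
names: ['wheat']
""")

# nc must equal the number of entries in names, or training will complain
assert data_cfg['nc'] == len(data_cfg['names'])
print(data_cfg)
```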

4. How to train your custom Yolo V5 model?

In [ ]:
os.chdir('/working/yolov5')
In [ ]:
%%time
!python train.py --img 400 --batch 16 --epochs 3 --data '/working/new_data_yaml' --cfg '/working/new_train_yaml' --weights '' --name joos --nosave --cache

5. How to use your custom Yolo V5 model for object detection on new data?

In [23]:
# find the correct weights (1)
!ls /working/yolov5/runs/
exp0_joos
In [24]:
# find the correct weights (2)
!ls /working/yolov5/runs/exp0_joos
events.out.tfevents.1597052916.c32dae74c190.53.0  test_batch0_gt.jpg
hyp.yaml					  test_batch0_pred.jpg
labels.png					  train_batch0.jpg
opt.yaml					  train_batch1.jpg
results.png					  train_batch2.jpg
results.txt					  weights
In [25]:
# find the correct weights (3)
!ls /working/yolov5/runs/exp0_joos/weights
best_joos.pt  last_joos.pt
In [ ]:
# use your weights in the detection
!python detect.py --source /input/global-wheat-detection/test --weights 'runs/exp0_joos/weights/last_joos.pt' --img 416 --conf 0.5 --save-txt
In [27]:
# find the output: annotated images and text files with the bounding box locations
!ls /working/yolov5/inference/output/
2fd875eaa.jpg  51b3e36ab.jpg  53f253011.jpg  aac893a91.txt  cc3532ff6.txt
2fd875eaa.txt  51b3e36ab.txt  53f253011.txt  cb8d261a3.jpg  f5a1f0358.jpg
348a992bb.jpg  51f1be19e.jpg  796707dd7.jpg  cb8d261a3.txt  f5a1f0358.txt
348a992bb.txt  51f1be19e.txt  aac893a91.jpg  cc3532ff6.jpg
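Each output `.txt` file contains one detection per line in the same normalised format as the training labels: `class x_center y_center width height`. A minimal sketch (the `yolo_to_pixels` helper is mine) converting such a line back to pixel coordinates, assuming 1024×1024 test images like the training set:

```python
def yolo_to_pixels(line, img_w=1024, img_h=1024):
    # detect.py --save-txt writes "<class> <x_center> <y_center> <width> <height>",
    # all normalised to [0, 1]; scale back up and recover the top-left corner
    cls, xc, yc, w, h = line.split()
    xc, yc = float(xc) * img_w, float(yc) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    x_min, y_min = xc - w / 2, yc - h / 2
    return int(cls), (x_min, y_min, w, h)

print(yolo_to_pixels('0 0.841796875 0.234375 0.0546875 0.03515625'))
# (0, (834.0, 222.0, 56.0, 36.0))
```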