Yolo V5 Tutorial - Training a Custom Object Detection Model

1. How to set up your environment to train a Yolo V5 object detection model?

In [1]:
!git clone https://github.com/ultralytics/yolov5  # clone repo
!pip install -r yolov5/requirements.txt  # install dependencies
%cd yolov5
In [2]:
import torch
print('Setup complete. Using torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))
Setup complete. Using torch 1.6.0 CPU
In [4]:
from IPython.display import Image
import pandas as pd
import random
import os
from shutil import copyfile

2. How to set up the data and directories for training a Yolo V5 object detection model?

In [8]:
train = pd.read_csv('train.csv')
train
Out[8]:
image_id width height bbox source
0 b6ab77fd7 1024 1024 [834.0, 222.0, 56.0, 36.0] usask_1
1 b6ab77fd7 1024 1024 [226.0, 548.0, 130.0, 58.0] usask_1
2 b6ab77fd7 1024 1024 [377.0, 504.0, 74.0, 160.0] usask_1
3 b6ab77fd7 1024 1024 [834.0, 95.0, 109.0, 107.0] usask_1
4 b6ab77fd7 1024 1024 [26.0, 144.0, 124.0, 117.0] usask_1
... ... ... ... ... ...
147788 5e0747034 1024 1024 [64.0, 619.0, 84.0, 95.0] arvalis_2
147789 5e0747034 1024 1024 [292.0, 549.0, 107.0, 82.0] arvalis_2
147790 5e0747034 1024 1024 [134.0, 228.0, 141.0, 71.0] arvalis_2
147791 5e0747034 1024 1024 [430.0, 13.0, 184.0, 79.0] arvalis_2
147792 5e0747034 1024 1024 [875.0, 740.0, 94.0, 61.0] arvalis_2

147793 rows × 5 columns
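Note that the `bbox` column above is stored as a string, not a list. A minimal sketch of parsing it safely with `ast.literal_eval`, which is equivalent to the strip/split approach used in the copy loop below (the `parse_bbox` helper name is mine, for illustration only):

```python
import ast

def parse_bbox(bbox_str):
    # the CSV stores each box as a string such as "[834.0, 222.0, 56.0, 36.0]";
    # ast.literal_eval safely evaluates it back into a list of floats
    x, y, w, h = ast.literal_eval(bbox_str)
    return float(x), float(y), float(w), float(h)

print(parse_bbox('[834.0, 222.0, 56.0, 36.0]'))  # (834.0, 222.0, 56.0, 36.0)
```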

Create an images and a labels directory, each containing train and valid subfolders

In [10]:
## Make directory structure

os.mkdir('/working/data')

os.mkdir('/working/data/images')
os.mkdir('/working/data/labels')

os.mkdir('/working/data/images/train')
os.mkdir('/working/data/images/valid')

os.mkdir('/working/data/labels/train')
os.mkdir('/working/data/labels/valid')
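As a side note, the `os.mkdir` calls above raise `FileExistsError` if the cell is re-run. A hedged alternative sketch using `os.makedirs` with `exist_ok=True` (the `make_dirs` helper name is mine, not part of YOLOv5):

```python
import os

def make_dirs(base='/working/data'):
    # os.makedirs creates the whole tree in one call, and exist_ok=True
    # makes the cell safe to re-run (os.mkdir would raise FileExistsError)
    for kind in ('images', 'labels'):
        for split in ('train', 'valid'):
            os.makedirs(os.path.join(base, kind, split), exist_ok=True)
```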
In [12]:
# copy images and labels into the training layout, holding out ~30% of images for validation
# (this split could also be done by hand)

for image_id in list(set(train.image_id)):
    image_data = train[train['image_id'] == image_id]
    image_bboxes = image_data['bbox']
    image_bboxes = image_bboxes.apply(lambda x: x.strip('[').strip(']').split(', '))
    
    if random.random() > 0.3:
        path = 'train'
    else:
        path = 'valid'
    
    with open('/working/data/labels/{}/'.format(path) + image_id + '.txt', 'w+') as file:
        for bbox in image_bboxes:
            xc, yc, w, h = bbox
            
            x_center_n = (float(xc) + float(w) / 2) / 1024.
            y_center_n = (float(yc) + float(h) / 2) / 1024.
            width_n = float(w) / 1024.
            height_n = float(h) / 1024.
            line = ' '.join(('0', str(x_center_n), str(y_center_n), str(width_n), str(height_n))) + '\n'
            file.write(line)
                        
    copyfile('/input/global-wheat-detection/train/' + image_id + '.jpg', '/working/data/images/{}/'.format(path) +  image_id + '.jpg')
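The conversion inside the loop turns a `[x_min, y_min, width, height]` pixel box into YOLO's normalised `[x_center, y_center, width, height]` label format. As a standalone sketch (the `to_yolo` helper name is mine), using the first bbox of image `b6ab77fd7` from train.csv:

```python
def to_yolo(x_min, y_min, w, h, img_size=1024.0):
    # YOLO labels use the box centre and size, each normalised to [0, 1]
    # by the image dimension (1024 px for this dataset)
    x_c = (x_min + w / 2) / img_size
    y_c = (y_min + h / 2) / img_size
    return x_c, y_c, w / img_size, h / img_size

print(to_yolo(834.0, 222.0, 56.0, 36.0))
# (0.841796875, 0.234375, 0.0546875, 0.03515625)
```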
    

3. How to configure the YAML files for training a Yolo V5 object detection model?

The first one is easy: it is simply a copy of the YOLOv5s (small) model config, but with nc = 1, because we have only one class.

In [6]:
with open('/working/new_train_yaml', 'w+') as file:
    file.write(
        """
        # parameters
        nc: 1  # number of classes
        depth_multiple: 0.33  # model depth multiple
        width_multiple: 0.50  # layer channel multiple

        # anchors
        anchors:
          - [10,13, 16,30, 33,23]  # P3/8
          - [30,61, 62,45, 59,119]  # P4/16
          - [116,90, 156,198, 373,326]  # P5/32

        # YOLOv5 backbone
        backbone:
          # [from, number, module, args]
          [[-1, 1, Focus, [64, 3]],  # 0-P1/2
           [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
           [-1, 3, BottleneckCSP, [128]],
           [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
           [-1, 9, BottleneckCSP, [256]],
           [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
           [-1, 9, BottleneckCSP, [512]],
           [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
           [-1, 1, SPP, [1024, [5, 9, 13]]],
           [-1, 3, BottleneckCSP, [1024, False]],  # 9
          ]

        # YOLOv5 head
        head:
          [[-1, 1, Conv, [512, 1, 1]],
           [-1, 1, nn.Upsample, [None, 2, 'nearest']],
           [[-1, 6], 1, Concat, [1]],  # cat backbone P4
           [-1, 3, BottleneckCSP, [512, False]],  # 13

           [-1, 1, Conv, [256, 1, 1]],
           [-1, 1, nn.Upsample, [None, 2, 'nearest']],
           [[-1, 4], 1, Concat, [1]],  # cat backbone P3
           [-1, 3, BottleneckCSP, [256, False]],  # 17 (P3/8-small)

           [-1, 1, Conv, [256, 3, 2]],
           [[-1, 14], 1, Concat, [1]],  # cat head P4
           [-1, 3, BottleneckCSP, [512, False]],  # 20 (P4/16-medium)

           [-1, 1, Conv, [512, 3, 2]],
           [[-1, 10], 1, Concat, [1]],  # cat head P5
           [-1, 3, BottleneckCSP, [1024, False]],  # 23 (P5/32-large)

           [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
          ]
        """
    )

The second one is also easy, but the train and val paths must match the directory structure created above.

In [7]:
with open('/working/new_data_yaml', 'w+') as file:
    file.write(
        """
        train: /working/data/images/train
        val: /working/data/images/valid

        nc: 1
        names: ['wheat']
        """
    )
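A quick sanity check, assuming PyYAML is installed (it is pulled in by the yolov5 requirements): load the data config back and verify that `nc` matches the number of class names.

```python
import yaml

# parse the same content we wrote to the data YAML file
data_cfg = yaml.safe_load("""
train: /working/data/images/train
val: /working/data/images/valid

nc: 1
names: ['wheat']
""")

# nc must equal the number of entries in names, or training will complain
assert data_cfg['nc'] == len(data_cfg['names'])
print(data_cfg)
```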

4. How to train your custom Yolo V5 model?

In [ ]:
os.chdir('/working/yolov5')
In [ ]:
%%time
!python train.py --img 400 --batch 16 --epochs 3 --data '/working/new_data_yaml' --cfg '/working/new_train_yaml' --weights '' --name joos --nosave --cache

5. How to use your custom Yolo V5 model for object detection on new data?

In [23]:
# find the correct weights (1)
!ls /working/yolov5/runs/
exp0_joos
In [24]:
# find the correct weights (2)
!ls /working/yolov5/runs/exp0_joos
events.out.tfevents.1597052916.c32dae74c190.53.0  test_batch0_gt.jpg
hyp.yaml					  test_batch0_pred.jpg
labels.png					  train_batch0.jpg
opt.yaml					  train_batch1.jpg
results.png					  train_batch2.jpg
results.txt					  weights
In [25]:
# find the correct weights (3)
!ls /working/yolov5/runs/exp0_joos/weights
best_joos.pt  last_joos.pt
In [ ]:
# use your weights in the detection
!python detect.py --source /input/global-wheat-detection/test --weights 'runs/exp0_joos/weights/last_joos.pt' --img 416 --conf 0.5 --save-txt
In [27]:
# find the output: annotated images and text files with the bounding box locations
!ls /working/yolov5/inference/output/
2fd875eaa.jpg  51b3e36ab.jpg  53f253011.jpg  aac893a91.txt  cc3532ff6.txt
2fd875eaa.txt  51b3e36ab.txt  53f253011.txt  cb8d261a3.jpg  f5a1f0358.jpg
348a992bb.jpg  51f1be19e.jpg  796707dd7.jpg  cb8d261a3.txt  f5a1f0358.txt
348a992bb.txt  51f1be19e.txt  aac893a91.jpg  cc3532ff6.jpg
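Each output `.txt` file contains one detection per line in the same normalised format as the training labels: `class x_center y_center width height`. A minimal sketch (the `yolo_to_pixels` helper is mine) converting such a line back to pixel coordinates, assuming 1024×1024 test images like the training set:

```python
def yolo_to_pixels(line, img_w=1024, img_h=1024):
    # detect.py --save-txt writes "<class> <x_center> <y_center> <width> <height>",
    # all normalised to [0, 1]; scale back up and recover the top-left corner
    cls, xc, yc, w, h = line.split()
    xc, yc = float(xc) * img_w, float(yc) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    x_min, y_min = xc - w / 2, yc - h / 2
    return int(cls), (x_min, y_min, w, h)

print(yolo_to_pixels('0 0.841796875 0.234375 0.0546875 0.03515625'))
# (0, (834.0, 222.0, 56.0, 36.0))
```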