Genetic Algorithm from Scratch - Applied to Weekly Rotating Staff Planning

In [1]:
import numpy as np
import pandas as pd

Staff Planning

  • The goal of this code is to show a genetic algorithm from scratch, while applying it to a business goal.
  • Since we are working on an applied business problem, we start from the shape of the staff planning that is required.
  • We work with a staff planning in which all employees work every weekday, 5 days a week and the weekend our shop is closed. This removes any difficulties on handling days off for the example.
  • A shift is given by a starting time and a shift duration. Breaks are ignored.
  • We have data on the number of staff needed per hour.
  • An employee can be planned to not work on a certain day.

Two Shapes of a same planning

There are two shapes of the planning: firstly a planning of start time and duration per employee per day, and secondly the operational planning with the total number of employees present for each hour of each day. This second planning is a regrouped version of the first one.

Staff Planning shape1 - Planning for employees

  • The staff planning is represented as a list per day (5 days).
  • Each day consits of many lists of length 3.
  • Each list of 3 is an employee with the following items: (staff id, starting time, shift duration)
  • The number of lists is the number of employees that are possibly available on that day.
  • In the below example staff planning you see that there are:
    • 11 employees in our case (id 0 to 10)
    • all start times are 0h (midnight)
    • all durations are 10 hours
In [2]:
staff_planning = [
    [[0, 0, 10],[1, 0, 10],[2, 0, 10],[3, 0, 10],[4, 0, 10],[5, 0, 10],[6, 0, 10],[7, 0, 10],[8, 0, 10],[9, 0, 10],[10, 0, 10]],
    [[0, 0, 10],[1, 0, 10],[2, 0, 10],[3, 0, 10],[4, 0, 10],[5, 0, 10],[6, 0, 10],[7, 0, 10],[8, 0, 10],[9, 0, 10],[10, 0, 10]],
    [[0, 0, 10],[1, 0, 10],[2, 0, 10],[3, 0, 10],[4, 0, 10],[5, 0, 10],[6, 0, 10],[7, 0, 10],[8, 0, 10],[9, 0, 10],[10, 0, 10]],
    [[0, 0, 10],[1, 0, 10],[2, 0, 10],[3, 0, 10],[4, 0, 10],[5, 0, 10],[6, 0, 10],[7, 0, 10],[8, 0, 10],[9, 0, 10],[10, 0, 10]],
    [[0, 0, 10],[1, 0, 10],[2, 0, 10],[3, 0, 10],[4, 0, 10],[5, 0, 10],[6, 0, 10],[7, 0, 10],[8, 0, 10],[9, 0, 10],[10, 0, 10]]
]

Staff Planning Shape 2 - Planning for shop

  • In order to optimize the above staff planning, we need to have info on what would be a perfect planning.
  • Assume that based on previous days, we know how many staf we need each hour.
  • The staff needed is in the following shape:
    • list of days
    • each day is a list of 24 hours, with the number of employees needed every hour
In [3]:
hourlystaff_needed = np.array([
    [0, 0, 0, 0, 0, 0, 4, 4, 4, 2, 2, 2, 6, 6, 2, 2, 2, 6, 6, 6, 2, 2, 2, 2],
    [0, 0, 0, 0, 0, 0, 4, 4, 4, 2, 2, 2, 6, 6, 2, 2, 2, 6, 6, 6, 2, 2, 2, 2],
    [0, 0, 0, 0, 0, 0, 4, 4, 4, 2, 2, 2, 6, 6, 2, 2, 2, 6, 6, 6, 2, 2, 2, 2],
    [0, 0, 0, 0, 0, 0, 4, 4, 4, 2, 2, 2, 6, 6, 2, 2, 2, 6, 6, 6, 2, 2, 2, 2],
    [0, 0, 0, 0, 0, 0, 4, 4, 4, 2, 2, 2, 6, 6, 2, 2, 2, 6, 6, 6, 2, 2, 2, 2]
])
    

Conversion from shape 1 (shifts) to shape 2 (staff-per-hour)

  • In the optimization, the genetic algorithm will generate will change iteratively the starting times and the durations.
  • Then, it is converted into number of employees per hour
  • Then it is measured how far away this is from the staff-needed planning.
  • In order to do this easily, we need a function to convert one type of planning into the other one easily. Below is the code for this.
In [4]:
"""
Employee Present: analyse whether the employee is present yes or no on a given time
Based on the employee list of 3 (id, start time, duration)
"""
def employee_present(employee, time):
    employee_start_time = employee[1]
    employee_duration = employee[2]
    employee_end_time = employee_start_time + employee_duration
    if (time >= employee_start_time) and (time < employee_end_time):
        return True
    return False
In [5]:
"""
convert a staff planning to a staff-needed plannig
The employee planning is organised per employee, the staff-needed planning is the number of employees working per hour
The staff-needed planning is based on the employee planning and will allow to calculate the difference with the staff-needed
It doesnt work overnight, but our shop isnt open at night anyway
"""
def staffplanning_to_hourlyplanning(staff_planning):
    
    hourlystaff_week = []
    for day in staff_planning:
        
        hourlystaff_day = [] 
        for employee in day:
            
            employee_present_hour = []
            for time in range(0, 24):
                
                employee_present_hour.append(employee_present(employee, time))
                
            hourlystaff_day.append(employee_present_hour)
            
        hourlystaff_week.append(hourlystaff_day)
        
    hourlystaff_week = np.array(hourlystaff_week).sum(axis = 1)  
    return hourlystaff_week
        

Code for evaluating a staff planning based on a list of staff needed

In [6]:
"""
the cost is calculated as hours understaffed + hours overstaffed
"""
def cost(hourlystaff, hourlystaff_needed):
    errors = hourlystaff - hourlystaff_needed
    overstaff = abs(errors[errors > 0].sum())
    understaff = abs(errors[errors < 0].sum())
    
    overstaff_cost = 1
    understaff_cost = 1
    
    cost = overstaff_cost * overstaff + understaff_cost * understaff
    return cost
    

Code for the genetic algorithm

The random initiamlization

  • Here is the code that randomly initializes a planning.
  • There are a fixed number of days and staff
  • The start time of each employee of each day is a random choice between 0 and 23
  • The duration of each employee of each day is a random choice between 0 and 10
In [7]:
"""
generate an entirely random staff planning for a certain number of days
start time is random between 0 and 23; duration is random between 0 and 10
"""
def generate_random_staff_planning(n_days, n_staff):
    period_planning = []
    for day in range(n_days):
        day_planning = []
        for employee_id in range(n_staff):
            start_time = np.random.randint(0, 23)
            duration = np.random.randint(0, 10)
            employee = [employee_id, start_time, duration]
            day_planning.append(employee)

        period_planning.append(day_planning)
            
    return period_planning
        
In [8]:
# An example of the code until here

# show the random initialization of the week planning
random_staff_planning = generate_random_staff_planning(n_days = 5, n_staff = 11)
random_staff_planning

# show the cost of this random week planning
cost(staffplanning_to_hourlyplanning(random_staff_planning), hourlystaff_needed)
Out[8]:
214

Define Genetics

Define Genetics 1 - Create Generation

In [9]:
"""
create a parent generation of n parent plannings
"""
def create_parent_generation(n_parents, n_days = 7, n_staff = 11):
    parents = []
    for i in range(n_parents):
        parent = generate_random_staff_planning(n_days = n_days, n_staff = n_staff)
        parents.append(parent)
    return parents

Define Genetics 2 - Cross Over / Combination

In [10]:
"""
for each iteration, select randomly two parents and make a random combination of those two parents
by applying a randomly generated yes/no mask to the two selected parents
"""
def random_combine(parents, n_offspring):
    n_parents = len(parents)
    n_periods = len(parents[0])
    n_employees = len(parents[0][0])
    
    offspring = []
    for i in range(n_offspring):
        random_dad = parents[np.random.randint(low = 0, high = n_parents - 1)]
        random_mom = parents[np.random.randint(low = 0, high = n_parents - 1)]
        
        dad_mask = np.random.randint(0, 2, size = np.array(random_dad).shape)
        mom_mask = np.logical_not(dad_mask)
        
        child = np.add(np.multiply(random_dad, dad_mask), np.multiply(random_mom, mom_mask))

        offspring.append(child)
    return offspring

Define Genetics 3 - Mutation

In [11]:
def mutate_parent(parent, n_mutations):
    size1 = parent.shape[0]
    size2 = parent.shape[1]
    
    for i in range(n_mutations):

        rand1 = np.random.randint(0, size1)
        rand2 = np.random.randint(0, size2)
        rand3 = np.random.randint(1, 2)

        parent[rand1,rand2,rand3] = np.random.randint(0, 10)

    return parent

def mutate_gen(parent_gen, n_mutations):
    mutated_parent_gen = []
    for parent in parent_gen:
        mutated_parent_gen.append(mutate_parent(parent, n_mutations))
    return mutated_parent_gen

Define Genetics 4 - Selection - Feasibility

In [12]:
def is_acceptable(parent):
    return np.logical_not((np.array(parent)[:,:,2:] > 10).any()) #work more than 10 hours is not ok

def select_acceptable(parent_gen):
    parent_gen = [parent for parent in parent_gen if is_acceptable(parent)]
    return parent_gen

Define Genetics 5 - Selection - Cost (inverse fitness)

In [13]:
def select_best(parent_gen, hourlystaff_needed, n_best):
    costs = []
    for idx, parent_staff_planning in enumerate(parent_gen):
        
        parent_hourly_planning = staffplanning_to_hourlyplanning(parent_staff_planning)
        parent_cost = cost(parent_hourly_planning, hourlystaff_needed)
        costs.append([idx, parent_cost])
    
    print('generations best is: {}, generations worst is: {}'.format(pd.DataFrame(costs)[1].min(), pd.DataFrame(costs)[1].max()))
    
    costs_tmp = pd.DataFrame(costs).sort_values(by = 1, ascending = True).reset_index(drop=True)
    selected_parents_idx = list(costs_tmp.iloc[:n_best,0])
    selected_parents = [parent for idx, parent in enumerate(parent_gen) if idx in selected_parents_idx]
    
    return selected_parents

The iteration - the complete algorithm

In [14]:
"""
the overall function
"""
def gen_algo(hourlystaff_needed, n_iterations):   
    
    generation_size = 500
    
    parent_gen = create_parent_generation(n_parents = generation_size, n_days = 5, n_staff = 11)
    for it in range(n_iterations):
        parent_gen = select_acceptable(parent_gen)
        parent_gen = select_best(parent_gen, hourlystaff_needed, n_best = 100)
        parent_gen = random_combine(parent_gen, n_offspring = generation_size)
        parent_gen = mutate_gen(parent_gen, n_mutations = 1)
    
    best_child = select_best(parent_gen, hourlystaff_needed, n_best = 1)
    return best_child

Execute all

In [15]:
best_planning = gen_algo(hourlystaff_needed, n_iterations = 100)
generations best is: 153, generations worst is: 264
generations best is: 148, generations worst is: 239
generations best is: 154, generations worst is: 229
generations best is: 144, generations worst is: 216
generations best is: 143, generations worst is: 213
generations best is: 138, generations worst is: 212
generations best is: 137, generations worst is: 197
generations best is: 124, generations worst is: 199
generations best is: 125, generations worst is: 199
generations best is: 121, generations worst is: 191
generations best is: 120, generations worst is: 189
generations best is: 112, generations worst is: 185
generations best is: 110, generations worst is: 179
generations best is: 111, generations worst is: 169
generations best is: 106, generations worst is: 174
generations best is: 104, generations worst is: 170
generations best is: 101, generations worst is: 179
generations best is: 96, generations worst is: 170
generations best is: 96, generations worst is: 164
generations best is: 92, generations worst is: 168
generations best is: 91, generations worst is: 165
generations best is: 86, generations worst is: 155
generations best is: 91, generations worst is: 159
generations best is: 85, generations worst is: 151
generations best is: 86, generations worst is: 143
generations best is: 81, generations worst is: 140
generations best is: 83, generations worst is: 140
generations best is: 79, generations worst is: 134
generations best is: 80, generations worst is: 131
generations best is: 77, generations worst is: 124
generations best is: 73, generations worst is: 130
generations best is: 67, generations worst is: 127
generations best is: 67, generations worst is: 125
generations best is: 67, generations worst is: 116
generations best is: 65, generations worst is: 118
generations best is: 64, generations worst is: 114
generations best is: 60, generations worst is: 116
generations best is: 62, generations worst is: 109
generations best is: 53, generations worst is: 115
generations best is: 58, generations worst is: 105
generations best is: 53, generations worst is: 111
generations best is: 53, generations worst is: 100
generations best is: 52, generations worst is: 99
generations best is: 50, generations worst is: 94
generations best is: 50, generations worst is: 107
generations best is: 49, generations worst is: 87
generations best is: 49, generations worst is: 95
generations best is: 47, generations worst is: 92
generations best is: 41, generations worst is: 98
generations best is: 44, generations worst is: 89
generations best is: 45, generations worst is: 87
generations best is: 44, generations worst is: 88
generations best is: 45, generations worst is: 81
generations best is: 42, generations worst is: 93
generations best is: 42, generations worst is: 80
generations best is: 42, generations worst is: 78
generations best is: 39, generations worst is: 77
generations best is: 37, generations worst is: 74
generations best is: 39, generations worst is: 75
generations best is: 39, generations worst is: 70
generations best is: 38, generations worst is: 73
generations best is: 40, generations worst is: 70
generations best is: 36, generations worst is: 70
generations best is: 38, generations worst is: 69
generations best is: 36, generations worst is: 65
generations best is: 35, generations worst is: 67
generations best is: 35, generations worst is: 64
generations best is: 35, generations worst is: 62
generations best is: 35, generations worst is: 64
generations best is: 35, generations worst is: 63
generations best is: 35, generations worst is: 62
generations best is: 35, generations worst is: 59
generations best is: 35, generations worst is: 61
generations best is: 35, generations worst is: 61
generations best is: 35, generations worst is: 63
generations best is: 35, generations worst is: 60
generations best is: 35, generations worst is: 65
generations best is: 35, generations worst is: 65
generations best is: 35, generations worst is: 59
generations best is: 35, generations worst is: 61
generations best is: 35, generations worst is: 58
generations best is: 35, generations worst is: 61
generations best is: 35, generations worst is: 61
generations best is: 35, generations worst is: 60
generations best is: 35, generations worst is: 61
generations best is: 35, generations worst is: 61
generations best is: 35, generations worst is: 59
generations best is: 35, generations worst is: 61
generations best is: 35, generations worst is: 60
generations best is: 35, generations worst is: 61
generations best is: 35, generations worst is: 61
generations best is: 35, generations worst is: 59
generations best is: 35, generations worst is: 61
generations best is: 35, generations worst is: 59
generations best is: 35, generations worst is: 61
generations best is: 35, generations worst is: 62
generations best is: 35, generations worst is: 65
generations best is: 35, generations worst is: 61
generations best is: 35, generations worst is: 59
generations best is: 35, generations worst is: 59
generations best is: 35, generations worst is: 61
In [19]:
print(best_planning)
[array([[[ 0, 11,  9],
        [ 1,  6,  8],
        [ 2, 12,  2],
        [ 3, 17,  2],
        [ 4,  7,  2],
        [ 5, 12,  8],
        [ 6,  6,  8],
        [ 7, 18,  7],
        [ 8, 17,  8],
        [ 9,  6,  3],
        [10, 17,  3]],

       [[ 0, 18,  2],
        [ 1,  7,  2],
        [ 2, 12,  2],
        [ 3,  6,  8],
        [ 4, 17,  8],
        [ 5, 17,  8],
        [ 6,  6,  3],
        [ 7, 16,  4],
        [ 8,  6,  8],
        [ 9, 12,  8],
        [10, 12,  9]],

       [[ 0,  6,  3],
        [ 1, 15,  9],
        [ 2, 12,  2],
        [ 3, 16,  4],
        [ 4,  6,  8],
        [ 5, 17,  7],
        [ 6,  6,  9],
        [ 7, 18,  1],
        [ 8, 17,  3],
        [ 9, 12,  8],
        [10,  6,  3]],

       [[ 0, 12,  9],
        [ 1,  6,  8],
        [ 2,  9,  0],
        [ 3, 17,  3],
        [ 4, 12,  9],
        [ 5, 17,  9],
        [ 6,  5,  9],
        [ 7, 17,  9],
        [ 8,  6,  3],
        [ 9,  6,  3],
        [10, 17,  3]],

       [[ 0, 16,  9],
        [ 1,  6,  8],
        [ 2,  6,  5],
        [ 3, 12,  8],
        [ 4,  6,  3],
        [ 5,  8,  6],
        [ 6, 17,  3],
        [ 7, 13,  7],
        [ 8, 18,  2],
        [ 9, 12,  2],
        [10, 17,  9]]])]
In [20]:
print(staffplanning_to_hourlyplanning(best_planning[0]))
[[0 0 0 0 0 0 3 4 4 2 2 3 5 5 2 2 2 5 6 5 2 2 2 2]
 [0 0 0 0 0 0 3 4 4 2 2 2 5 5 2 2 3 5 6 6 3 2 2 2]
 [0 0 0 0 0 0 4 4 4 2 2 2 4 4 2 2 3 5 6 5 2 2 2 2]
 [0 0 0 0 0 1 4 4 4 2 2 2 4 4 2 2 2 6 6 6 4 2 2 2]
 [0 0 0 0 0 0 3 3 4 3 3 2 4 5 2 2 3 5 6 6 2 2 2 2]]
In [21]:
print(hourlystaff_needed)
[[0 0 0 0 0 0 4 4 4 2 2 2 6 6 2 2 2 6 6 6 2 2 2 2]
 [0 0 0 0 0 0 4 4 4 2 2 2 6 6 2 2 2 6 6 6 2 2 2 2]
 [0 0 0 0 0 0 4 4 4 2 2 2 6 6 2 2 2 6 6 6 2 2 2 2]
 [0 0 0 0 0 0 4 4 4 2 2 2 6 6 2 2 2 6 6 6 2 2 2 2]
 [0 0 0 0 0 0 4 4 4 2 2 2 6 6 2 2 2 6 6 6 2 2 2 2]]