Simple sales forecast with Excel and Python

Posted on 2/4/2022

Simple end of month sales forecast

Introduction

Let's imagine that you are sitting at your desk, sipping a coffee that you paid CHF 0.70, when your boss come into your office and ask you to produce a report giving the expected end of month sales per BU, based on the sales Month to day.

Now forecasting is a very tricky thing to do, taking in account currency fluctuations, raw material price variations, weather changes, seasonality and whatnot.

Fortunately, your boss just want to have a broad view of sales so a simple linear regression will do. You sigh in relief because you won't have to dig into R time series forecasting package or figure out how to create a machine learning model.

So, you think you can extract a Month to day sales per company and BU from the ERP, open it in excel and extrapolate by applying the simple formula:

(Month to date value * Current day of month) / total days in month

Simple enough, but you quiclky notice that the forecasted values are way too small. Off course ! We don't sell anything on weekends and during holidays, so your extrapolation should take that into account.

And here comes the headaches.

Working days

OK, not an issue, Excel provides a function to calculate the number of working days between two dates (NETWORKDAYS). You can then add a formula that will calculate the number of working days since begin of month and day of the month, and another one that will calculate the number of working day for the month

So lets imagine your worksheet looks like this

  SimpleDataStructure

with columns MTDDate;Country;Sales

you will add two columns, one to calculate the number of working days elapsed since beginning of the month, the other to calculate to total number of working days in the month.

=NETWORKDAYS.INTL([@MTDDate],DATE(YEAR([@MTDDate]),MONTH([@MTDDate]),1),1)
=NETWORKDAYS.INTL(DATE(YEAR([@MTDDate]),MONTH([@MTDDate]),1),DATE(YEAR([@MTDDate]),MONTH([@MTDDate])+1,1)-1,1,getHolidayDatesByCountry(Holidays!$A$2:$C$24,[@Country]))

Note that in the example we consider all the weekend days are on Saturday and sunday (the last parameter value of the function).

Regional adaptions

What we haven't taken in account yet in our business day computation, are the holidays. If your companies are located in the same region, like for instance in Europe where the culture is the more or less the same between countries, that will not have much of an impact. The challenge will come when you will need to aggregate differents regions of the world like Asia and Middle East where you can have longer period of holidays than in Europe. So you need to find the way to extract a list of holidays per year and per country.

And here starts the trouble. You will find different sources of data giving you the holidays corresponding to each country, but you will need to aggregate them yourself. Unless off courrse if your company is not as stingy as one that bill CHF 0.70 a coffee for its fellow employees, you can buy these kind of information. In my case, I had to go to a site, selecting a country, refresh, extract, copy paste in a worksheet. This process can be very tedious and error prone. Fortunately there is a way around !

Python to the rescue

Each Excel User should have Python installed on his computer. The services it offers in data manipulation are invaluable. Even if it is not (yet) fully integrated in Excel we can already uses its numerous libraries to achieve our goals.

Python packages

We will use the holidays library to extract the holiday dates per year and per country and the pandas libraby that is a must have for anyone interested in data manipulation. I will probably have numerous opportunities to write about Pandas later, but for our today's exercice, we will just concentrate on extracting the holidays. So here we go : in a shell window we will first install the needed packages:

pip install pandas
pip install openpyxl
pip install holidays

Then we will use the following pyhton code :

from datetime import time
import pandas as pd
import holidays

allHolidays = []
YEARSLIST = [2022]
COUNTRYLIST = ['FR','BE','CH','DE','CN']

for country in COUNTRYLIST:
    for hol in holidays.CountryHoliday(country,years=YEARSLIST,observed=False).items():
        allHolidays.append({'Country':country, 'Date':hol[0],'Holiday':hol[1]})

allHolidays = pd.DataFrame(allHolidays)
allHolidays.to_excel("myHolidays.xlsx", index=False)

This code will iterate the countries in the COUTRYLIST array, and then append the allHoliday matrix with a dictionary containing the Country, Date and Holidays description extracted from the holidays library.

Then it wil be converted into a Pandas DataFrame, that we will use to create an excel packages looking like this:

HolidaysInExcelFile

We will then copy the worksheet into our main Excel workbook so that we can refer to it.

Updating the Networkdays formula

What we need to do now is to update the networkdays formula with the holidays parameter. In the formulas I showed earlier, I did not use the holidays parameter. This parameter allows you to use a range of dates that will be considered as holidays dates. If you have Office 365, you can use the FILTER formula that return a range with filter criteria applied. I do not have Office 365, so I wrote the following vba function instead that will return a vector of dates related to the country passed as a parameter.

Function getHolidayDatesByCountry(rng As Range, Country As String) As Variant
Dim returnValue() As Variant
    ix = 0
    ReDim Preserve returnValue(0)
    For Each r In rng.Rows
        If r.Cells(1) = Country Then
            ' we exclude weekends from list of holidays since we
            ' have already excluded them from our workday formula
            If Application.WorksheetFunction.Weekday(r.Cells(2), 2) < 6 Then
                returnValue(ix) = r.Cells(2).Value
                ix = ix + 1
                ReDim Preserve returnValue(ix)
            End If
        End If
    Next

    getHolidayDatesByCountry = returnValue

End Function

Which will lead to the adapted formulas for the computation of working days:

=NETWORKDAYS.INTL(DATE(YEAR([@MTDDate]),MONTH([@MTDDate]),1),[@MTDDate],1,getHolidayDatesByCountry(Holidays!$A$2:$C$53,[@Country]))
=NETWORKDAYS.INTL(DATE(YEAR([@MTDDate]),MONTH([@MTDDate]),1),DATE(YEAR([@MTDDate]),MONTH([@MTDDate])+1,1)-1,1,getHolidayDatesByCountry(Holidays!$A$2:$C$24,[@Country]))

Et voilà ! Your Estimated sales formula will be:

=[@Sales]/[@[Working days to date]]*[@[Total working days in month]]

Just paste down the formula and you will get your end of month sales estimate.