Marketing Mix Modeling Step-by-Step

Credit: Morten Hegewald

This is a rough guide on how to build your own Marketing Mix Model. We’ll touch on:

  • What is a Marketing Mix Model?
  • Data requirements
  • Marketing carry over effects
  • Diminishing return functions

In part 2 we’ll get into use cases from the output and wrap up the walkthrough by touching on:

  • Channel contributions
  • Predictive budget optimizations

What is a Marketing Mix Model?

With the number of marketing channels through which we can reach new potential clients increasing every year, it’s more important than ever to understand which channels are driving the business forward and which are not.

“Half the money I spend on advertising is wasted; the trouble is I don’t know which half.” — John Wanamaker

In the world of online click-based marketing it’s fairly straightforward to identify channel performance, since, with the right back-end implementations in place, we can track click-throughs and assign attribution accordingly.

There are however other challenges with click-based marketing such as distribution of credit (attribution) when consumers have multiple touchpoints prior to a conversion.

I have a previous article outlining this challenge with both heuristics as well as a data-driven solution to solve it: Marketing Channel Attribution with Markov Chains in Python.

For offline marketing (TV, radio, billboards, print, etc.) however, it’s a completely different ball game as we have no way of tracking individual impressions, and thus Coca-Cola wouldn’t know that client X went and bought a soft drink because they were exposed to the billboard ad in the photo above.

So, if we can’t know which clients came to us because of our offline marketing, but still need to identify the performance of our offline marketing spend, how do we approach this challenge?

While we can’t identify individual attribution in this case, we can use historical data and math/statistics to identify the total attributions over time for each offline channel.

This is where the Marketing Mix Model comes into play.


At a high level, a Marketing Mix Model (MMM) is a statistical modeling technique that seeks to identify the relationship between your marketing spend in each individual channel and a desired outcome (website visits, sales, client acquisitions, etc.).

The MMM uses historical data and regression techniques to tease out each channel’s contribution to your KPI. This is essentially done by identifying the variations in channel spend and the corresponding variations in the KPI. Therefore, it’s crucial that we actually have variation in our marketing spend, so the model can figure out how every little fluctuation impacted the output.

From a very simplistic perspective, the MMM can essentially be boiled down to the following formula:

$$S_t = \beta_0 + \sum_{i=1}^{n} \beta_i x_{ti}$$

where S_t is your total KPI (sales, website visits, client acquisitions, etc.) at time t, beta_0 to beta_n are the coefficients that this exercise seeks to identify, and x_ti is the input to factor i at time t. We say factor here and not marketing channel because your MMM is likely to also include external factors that we aren’t in control of (more on this in the next section).

Identifying the coefficients of this formula gives you an estimate of how much each incremental dollar spent on a given marketing channel will impact your KPI.
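To make the formula concrete, here is a minimal sketch in Python that recovers the coefficients with ordinary least squares. Everything in it is an assumption made up for illustration: the data is synthetic, and the channel names (`tv`, `radio`) and the "true" coefficients are not from any real campaign.

```python
import numpy as np

rng = np.random.default_rng(0)
n_weeks = 104  # two years of weekly data (assumed)

# Hypothetical weekly spend in two channels, with plenty of variation.
tv = rng.uniform(0, 100, n_weeks)
radio = rng.uniform(0, 50, n_weeks)

# Synthetic KPI built from assumed coefficients (beta_0, beta_tv, beta_radio) plus noise.
true_beta = np.array([200.0, 3.0, 1.5])
X = np.column_stack([np.ones(n_weeks), tv, radio])  # column of ones = intercept beta_0
sales = X @ true_beta + rng.normal(0, 5, n_weeks)

# Ordinary least squares: find the coefficients that best reproduce the KPI.
beta_hat, *_ = np.linalg.lstsq(X, sales, rcond=None)

for name, estimate in zip(["intercept", "tv", "radio"], beta_hat):
    print(f"{name}: {estimate:.2f}")
```

Because spend varies a lot from week to week in this toy data, the estimates land close to the coefficients used to generate it. With nearly constant spend they would be far less reliable.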

Data Requirements For a Marketing Mix Model

Sounds good so far? Great! Now the first question you should be asking is: What kind of data do I need to have/start collecting in order to put together a successful MMM?

Firstly, because the MMM works by looking at the variations in multiple inputs and a single dependent variable, it’s critical that we have enough data with sufficient variation for the model to properly identify the impact of any variation in a variable.

Secondly, the amount of data is a balancing act between having enough history for your model to accurately determine the variable relationships and the data still being representative of your business as it operates today.

Thirdly, in terms of granularity, a good thing to keep in mind is: granular data will lead to granular insights. If your desired insight from the MMM is at a store/segment/product level you’ll need to have input data at this level of granularity as well.

Finally, since your business and its sales, website visits, client acquisitions, etc. are likely to be impacted by external factors such as seasonality, economic upswings or downturns, etc., you’ll be able to reduce the noise in your model if you feed this information to it. Giving your model information on these external factors that may have impacted your KPI minimizes the chance that the model tries to map marketing expenditures to variations where there’s no actual connection.
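The effect of leaving out an external factor is easy to see on toy data. The sketch below is an illustration under assumed numbers, not a real dataset: TV spend is deliberately higher in peak season, and the KPI depends on both TV and the season. The `season` index, the helper `fit`, and all coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
weeks = np.arange(104)

# Assumed yearly seasonality index; TV spend is ramped up in peak season.
season = np.sin(2 * np.pi * weeks / 52)
tv = 50 + 30 * season + rng.normal(0, 5, weeks.size)

# Synthetic KPI: assumed true TV effect of 2.0 per dollar, plus a seasonal lift.
sales = 100 + 2.0 * tv + 40 * season + rng.normal(0, 3, weeks.size)

def fit(columns, y):
    """OLS with an intercept; returns [intercept, coef_1, coef_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs

without_control = fit([tv], sales)
with_control = fit([tv, season], sales)

print("TV coefficient without seasonality:", round(float(without_control[1]), 2))
print("TV coefficient with seasonality:   ", round(float(with_control[1]), 2))
```

Without the seasonality column, the model credits TV with the seasonal lift and overstates its coefficient. Once the control is included, the estimate moves back toward the value used to generate the data.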

Marketing Carry-Over Effects

Not all marketing sees an immediate effect. Many, if not most, consumers today go through a decision-making or consideration phase that starts when awareness is created and lasts until they decide whether or not to purchase.

Therefore, there is a time difference between when we put any marketing into the world and when we see a visit, purchase, signup, etc. This time difference is generally referred to as the carry-over effect.

Different products will have different consideration phases (you probably need less convincing to buy groceries for dinner than you do to buy a new laptop).

The consideration phase is represented through a decay rate, which is the rate at which the effect of marketing expenditure decays from one period to the next.

The carry-over effect can be represented as:

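$$A_t = x_t + \lambda A_{t-1}$$

where A_t is the actual effect of our marketing at time t, x_t is the marketing input at time t, and lambda is the decay rate, i.e. the share of the previous period’s effect that carries over into the current period.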

The decay rate is going to differ by marketing channel, so we’ll need to test which level of decay best fits our data for each channel.
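As a rough sketch of what that testing could look like, the Python below assumes the common geometric form of carry-over, in which each period’s effect is that period’s spend plus lambda times the previous period’s effect. It then tries a grid of decay values and keeps the one with the lowest squared error. The function names (`adstock`, `sse_for`), the synthetic data, and the candidate grid are all assumptions for illustration.

```python
import numpy as np

def adstock(spend, decay):
    """Geometric carry-over: effect_t = spend_t + decay * effect_(t-1)."""
    effects = np.zeros(len(spend))
    carried = 0.0
    for t, x in enumerate(spend):
        carried = x + decay * carried
        effects[t] = carried
    return effects

# A single burst of spend keeps working in later periods.
print(adstock([100.0, 0.0, 0.0, 0.0], 0.5).tolist())  # [100.0, 50.0, 25.0, 12.5]

# Synthetic channel: the KPI is generated with an assumed decay of 0.6.
rng = np.random.default_rng(2)
spend = rng.uniform(0, 100, 104)
kpi = 50 + 2.0 * adstock(spend, 0.6) + rng.normal(0, 1, 104)

def sse_for(decay):
    """Sum of squared errors of an OLS fit using this candidate decay."""
    X = np.column_stack([np.ones(len(spend)), adstock(spend, decay)])
    coefs, *_ = np.linalg.lstsq(X, kpi, rcond=None)
    residuals = kpi - X @ coefs
    return float(residuals @ residuals)

candidates = [i / 10 for i in range(10)]  # 0.0, 0.1, ..., 0.9
best_decay = min(candidates, key=sse_for)
print("best-fitting decay:", best_decay)
```

With several channels, the same search would be run per channel (or jointly over combinations), which is one reason the number of model fits grows quickly.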

Diminishing Returns

The concept of diminishing returns means that the first dollar spent is more effective than the second, and the second is more effective than the third, and so on…

[Chart: diminishing returns, expected KPI output as a function of marketing expenditures]

For this context, the concept is applied through the underlying assumption that exposure to marketing only generates awareness in the minds of consumers to a certain point after which we start to see a saturation effect — meaning our additional dollars become less and less efficient at moving the needle, i.e. additional spend in the channel does not generate additional awareness.

There are multiple ways to model these relationships of diminishing returns (negative exponential function, power function, etc.), so it’s now up to us to test which approach gives the best fit to the data.
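To give a feel for the two shapes mentioned, here is a small Python sketch of a negative exponential and a power function, printing the gain from each additional block of spend. The parameter values (`rate=0.03`, `exponent=0.5`) and the spend levels are arbitrary assumptions chosen only to show the curvature.

```python
import math

def negative_exponential(spend, rate):
    """Saturating response between 0 and 1: 1 - exp(-rate * spend)."""
    return 1.0 - math.exp(-rate * spend)

def power_curve(spend, exponent):
    """Concave response spend ** exponent; expects 0 < exponent < 1."""
    return spend ** exponent

spend_levels = [0, 20, 40, 60, 80, 100]

def marginal_gains(curve, **params):
    """Extra response gained from each successive step in spend_levels."""
    values = [curve(x, **params) for x in spend_levels]
    return [later - earlier for earlier, later in zip(values, values[1:])]

exp_gains = marginal_gains(negative_exponential, rate=0.03)
pow_gains = marginal_gains(power_curve, exponent=0.5)

print("negative exponential:", [round(g, 3) for g in exp_gains])
print("power function:      ", [round(g, 3) for g in pow_gains])
```

In both cases every additional block of spend adds less than the one before it. The negative exponential flattens toward a ceiling, while the power function keeps growing, only ever more slowly.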

Given this notion of saturation and a definition of the monetary value we’d assign to getting an additional KPI unit, we can find an optimal level of marketing expenditures for each channel where another dollar towards the channel no longer yields sufficient output — this would be the optimal level of spend (more on this in part 2).
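Here is a small worked sketch of that idea in Python, under two assumptions that are mine and not part of the model itself. First, the channel’s response follows a negative exponential, KPI(spend) = ceiling * (1 - exp(-rate * spend)). Second, "sufficient output" means a dollar of spend must return at least a dollar of value. Setting the marginal value, value_per_unit * ceiling * rate * exp(-rate * spend), equal to 1 and solving for spend gives the break-even point. All parameter names and numbers are hypothetical.

```python
import math

def marginal_value(spend, ceiling, rate, value_per_unit):
    """Value generated by the next dollar at the given spend level."""
    return value_per_unit * ceiling * rate * math.exp(-rate * spend)

def optimal_spend(ceiling, rate, value_per_unit):
    """Spend level at which the next dollar returns exactly one dollar of value."""
    first_dollar_return = value_per_unit * ceiling * rate
    if first_dollar_return <= 1.0:
        return 0.0  # even the first dollar does not pay for itself
    return math.log(first_dollar_return) / rate

best = optimal_spend(ceiling=1000, rate=0.01, value_per_unit=5.0)
print(round(best, 1))  # 391.2
print(round(marginal_value(best, 1000, 0.01, 5.0), 6))  # 1.0
```

Below this level each extra dollar brings in more than a dollar of value, and above it less, so under these assumptions this is where spend on the channel would be capped.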

As we can see from the above chart this represents a non-linear relationship between marketing expenditures and expected KPI output. What this means for the MMM formula we saw earlier is that it now becomes a sum of both linear and non-linear parts.
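One way to write this out is shown below. The notation is introduced here for illustration: f_i is the chosen diminishing-returns function for marketing channel i, A_ti is that channel’s carried-over effect at time t, z_tj are the external factors, which are assumed to enter linearly, and gamma_j are their coefficients.

```latex
S_t = \beta_0 + \sum_{i=1}^{m} \beta_i \, f_i(A_{ti}) + \sum_{j=1}^{k} \gamma_j \, z_{tj}
```

The model is still linear in the coefficients beta and gamma. The non-linearity sits inside f_i and in the decay rate behind A_ti, which is why those parameters have to be searched over or estimated separately.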

Because of the non-linear nature of the MMM, the models are usually implemented in Python or R, since these languages have excellent statistical libraries that do much of the heavy lifting.