According to Harvard Business Review, only about 30% of companies have implemented any form of analytics. What does this mean if you run a small or medium-sized business? Opportunity!
The path to getting customers keeps growing more complex, with an average of six to eight touchpoints across multiple channels and devices before a conversion. So it’s become even more important to use marketing attribution tools to get a clear view of which channels contribute to your customer conversion funnel. For self-funded businesses and small-scale entrepreneurs, it is even more imperative to understand these concepts to make the most of limited marketing budgets.
Any business that’s actively running promotional ads online should be interested in identifying and calculating what marketing channels drive the actual conversions. It is no secret that the return on investment (ROI) on your marketing efforts is a crucial KPI.
In this piece, I am looking to show every pure marketer a data science-y concept in the simplest of explanations. We are going to cover:
- Why is channel attribution important?
- The standard attribution models
- An advanced attribution model: Markov Chains (with a tiny bit of Python code)
Why is attribution important?
As the array of platforms on which businesses can market to their customers is increasing, and most customers are engaging with your content on multiple channels, it’s now more important than ever to decide how you’re going to attribute conversions to channels.
To illustrate the importance of attribution, let’s consider a simple example of a user journey leading to conversion. In this example, our user is named Talish.
Day 1:
Talish loves watching technology videos on YouTube every morning. Phone reviews are his favourite! He watches MKBHD’s iPhone 12 review, and his interest is sparked by Apple’s ad for the iPhone 12 playing right there. He visits the Apple website to browse, without any real intention of purchasing.
Day 2:
The next day, Talish is scrolling through his Facebook newsfeed, thinking about purchasing the new phone, when he sees another ad for the iPhone 12. This pushes him to return to the Apple website, and this time he completes the purchase!
Question! Which channel would you give credit for Talish’s purchase?
Traditionally, marketing attribution has been tackled by a handful of simple but powerful approaches such as First Touch, Last Touch, and Linear.
Standard Attribution Models
First Touch Attribution: The revenue generated by the purchase is attributed to the first marketing channel Talish engaged with on the journey towards the purchase. While First Touch Attribution has the advantage of simplicity, you risk oversimplifying your attribution approach. Think about it: was it the first ad alone that made Talish purchase? Although the first touch often won’t be the marketing activity that generates the purchase, it is often an important one for engaging the mind of the customer.
Last Touch Attribution: As the name suggests, Last Touch is the attribution approach where any revenue generated is attributed to the marketing channel that a user last engaged with. While this approach also has the advantage of simplicity, you again run the risk of oversimplifying your attribution.
In the above example, the last touch channel (Facebook) likely didn’t create 100% of the intent to purchase. The awareness stems from the initial spark of watching the YouTube ad.
So why use these models? They are simple, easy to use, and often a decent indicator for channel attribution. Small entrepreneurs can definitely implement these by themselves, giving them a huge advantage over those who don’t!
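To show how little it takes to implement these two models yourself, here is a minimal Python sketch applied to Talish’s journey. The function names, the data layout (a list of channels plus a revenue figure), and the revenue value itself are assumptions made for illustration, not part of any particular analytics tool.

```python
from collections import defaultdict


def first_touch(journeys):
    """Give 100% of each conversion's revenue to the first channel touched."""
    credit = defaultdict(float)
    for channels, revenue in journeys:
        credit[channels[0]] += revenue
    return dict(credit)


def last_touch(journeys):
    """Give 100% of each conversion's revenue to the last channel touched."""
    credit = defaultdict(float)
    for channels, revenue in journeys:
        credit[channels[-1]] += revenue
    return dict(credit)


# Talish's journey: YouTube ad on day 1, Facebook ad on day 2, then a purchase.
# The revenue figure is a placeholder, not a real price.
journeys = [(["YouTube", "Facebook"], 1000.0)]

print(first_touch(journeys))  # {'YouTube': 1000.0}
print(last_touch(journeys))   # {'Facebook': 1000.0}
```

The same journey produces two completely different answers depending on the model, which is exactly the oversimplification risk described above.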
Linear Attribution
In this approach, the attribution is divided evenly among all the marketing channels touched by the user on the journey leading to a purchase.
This approach is better suited to capture the multi-channel touch behaviour we’re seeing among consumers. However, it does not distinguish between the different channels, and since not all consumer engagements with marketing efforts are equal, this is a clear drawback of the model.
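As a rough sketch, linear attribution is just an even split of the revenue across the touches in a journey. The function name and data layout below are assumptions for illustration, and the revenue figure is a placeholder.

```python
from collections import defaultdict


def linear_attribution(journeys):
    """Split each conversion's revenue evenly across all touches in the journey."""
    credit = defaultdict(float)
    for channels, revenue in journeys:
        share = revenue / len(channels)
        for channel in channels:
            credit[channel] += share
    return dict(credit)


# Talish's journey again, with a placeholder revenue figure.
journeys = [(["YouTube", "Facebook"], 1000.0)]

print(linear_attribution(journeys))  # {'YouTube': 500.0, 'Facebook': 500.0}
```

Note that in this sketch a channel that appears twice in a journey is credited once per touch; whether that is desirable depends on how you want to treat repeat exposures.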
Other standard attribution approaches worth mentioning are Time Decay Attribution and Position-Based Attribution. These are great to know, but a little technical to implement.
- Time Decay: Time Decay attribution assigns more credit to touchpoints that happen right before a conversion. For example, it can be used if you have a long sales cycle that requires more hand-holding and resources to nurture a lead into a customer, as it assumes the first few touchpoints have less weight than the last few touchpoints.
- U-shaped (Position-based): U-shaped attribution gives 40% of the credit each to the first and last touchpoints, and divides the remaining 20% evenly among the touchpoints in between. It assumes that how people find out about you and what leads them to convert are the most important parts.
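These two models are a little more technical, but the weights themselves are easy to compute. The sketch below is one possible way to do it under stated assumptions: the exponential decay with a 7-day half-life is an illustrative choice (tools differ on the exact decay curve), and the handling of journeys with only one or two touches in the U-shaped model is also an assumption, since the 40/20/40 rule needs at least three touches.

```python
def time_decay_weights(days_before_conversion, half_life_days=7.0):
    """Weight each touch by 0.5 ** (age / half-life), then normalise to sum to 1.

    The half-life value is an illustrative assumption, not a standard.
    """
    raw = [0.5 ** (days / half_life_days) for days in days_before_conversion]
    total = sum(raw)
    return [weight / total for weight in raw]


def u_shaped_weights(n_touches):
    """40% to the first touch, 40% to the last, 20% shared by the middle touches."""
    if n_touches == 1:
        return [1.0]        # assumption: a single touch takes all the credit
    if n_touches == 2:
        return [0.5, 0.5]   # assumption: no middle touches, so split evenly
    middle = 0.2 / (n_touches - 2)
    return [0.4] + [middle] * (n_touches - 2) + [0.4]


# Touches 14, 7 and 0 days before the conversion: the most recent touch
# gets the largest share (roughly 0.14, 0.29 and 0.57).
print(time_decay_weights([14, 7, 0]))

print(u_shaped_weights(4))  # [0.4, 0.1, 0.1, 0.4]
```

Multiply these weights by the revenue of a conversion to get the credit for each touch in that journey.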
An advanced attribution model: Markov Chains
With the three standard attribution approaches above, we have easy-to-implement models to identify the ROI of our marketing channels.
However, the drawback is that they are oversimplified. In the age of data, not using advanced statistical and experimental methods is inexcusable! Advanced models give you results that are much ‘closer to reality’, which can be a game-changer when the difference between channels is too close to call. This oversight can be detrimental, misguiding future marketing decisions.
To overcome this oversight, we may consider employing a more advanced approach: the Markov model!
Markov chains are named after the Russian mathematician Andrey Markov, and describe a sequence of possible events in which the probability of each event depends only on the state attained in the previous event.
Example of a simple Markov chain with two events S and R
Markov chains, in the context of channel attribution, give us a framework to model user journeys and how each channel factors into users travelling from one channel to another to eventually purchase (or not).
The core concept of Markov chains is that we can use the generated data to identify the probabilities of moving from one event to another in our network of potential marketing channel events and conversion events.
Three terms one must know before understanding how Markov models work
- State Space: the set of all states in which one could potentially exist. In other words, the set of different websites/pages online.
- Transition Operator: the probability of moving from one state to another.
- Current State Probability Distribution: the probability distribution of being in one state or another at the start of the process.
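To make these terms concrete, here is a small sketch of how transition probabilities could be estimated from journey data by counting how often one state follows another. The journeys are made-up examples, and the `Start`, `Conversion` and `Null` (journey ended without a purchase) state names are labelling assumptions rather than a fixed convention.

```python
from collections import Counter, defaultdict


def transition_probabilities(journeys):
    """Estimate P(next state | current state) by counting observed transitions."""
    counts = defaultdict(Counter)
    for path in journeys:
        for current, nxt in zip(path, path[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for state, nexts in counts.items()
    }


# Hypothetical journeys, invented purely for illustration.
journeys = [
    ["Start", "YouTube", "Facebook", "Conversion"],
    ["Start", "YouTube", "Null"],
    ["Start", "Facebook", "Conversion"],
    ["Start", "Facebook", "Null"],
]

probs = transition_probabilities(journeys)
print(sorted(probs))     # the state space that has outgoing transitions
print(probs["Start"])    # {'YouTube': 0.5, 'Facebook': 0.5}
print(probs["YouTube"])  # {'Facebook': 0.5, 'Null': 0.5}
```

Here the keys of `probs` make up the state space, the nested dictionaries play the role of the transition operator, and `probs["Start"]` describes where a journey is likely to begin.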
How Markov Models work in marketing attribution
Typically, when a user buys or otherwise converts, there are multiple paths to that destination. With this approach, you simply calculate the probability of conversion along the different paths.
In order to attribute a channel, you calculate the removal effect of that channel. What does this mean? If I remove Facebook from the user’s journey, what impact would it have on my final conversions?
Now it gets slightly more math-y, but bear with me.
Step 1: Find all the different paths that lead to conversion. In the above diagram, there are two main paths:
Path 1: Start -> Instagram -> Apple Website -> Conversion
Path 2: Start -> Facebook -> Instagram -> Apple Website -> Conversion
Easy so far?
Step 2: Calculate the probability of each path using the numbers in the above diagram.
Path 1: 0.4 x 1.0 x 0.5 = 0.20
Path 2: 0.6 x 0.5 x 1.0 x 0.5 = 0.15
Now, simply add the two probabilities: 0.4 x 1.0 x 0.5 + 0.6 x 0.5 x 1.0 x 0.5 = 0.35
This means that the probability of conversion is 35% for a user. The user can follow either of the two paths. But which of the channels above is the most important?
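The path arithmetic above can be checked with a few lines of Python. This is a sketch under assumptions: the transition probabilities are the ones used in the worked example, and the `Null` state (the journey ends without a purchase) is an assumed name for the remaining probability that the example does not spell out.

```python
# Transition probabilities taken from the worked example. "Null" is an
# assumed label for journeys that end without a conversion.
TRANSITIONS = {
    "Start": {"Instagram": 0.4, "Facebook": 0.6},
    "Facebook": {"Instagram": 0.5, "Null": 0.5},
    "Instagram": {"Apple Website": 1.0},
    "Apple Website": {"Conversion": 0.5, "Null": 0.5},
}


def conversion_probability(transitions, state="Start"):
    """Sum the probabilities of every path from `state` to "Conversion".

    Assumes the graph has no cycles, as in the example.
    """
    if state == "Conversion":
        return 1.0
    if state == "Null":
        return 0.0
    return sum(
        p * conversion_probability(transitions, nxt)
        for nxt, p in transitions.get(state, {}).items()
    )


print(round(conversion_probability(TRANSITIONS), 2))  # 0.35
```

The recursion follows every branch out of `Start`, which is the same as listing the paths by hand and adding up their probabilities. Real journey data usually contains loops (a user can bounce between channels), in which case the recursion would need to be replaced by a matrix-based calculation.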
Step 3: Choose a channel you want to remove and see the impact it has on the paths. For our case, we’ll remove Facebook.
Now simply repeat the steps.
Path left: Start -> Instagram -> Apple Website -> Conversion
Probability after removal = 0.4 x 1.0 x 0.5 = 0.20
Now, to calculate the removal effect of Facebook, find the difference between the two calculated figures:
0.35 - 0.20 = 0.15
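This number, 0.15, is the drop in conversion probability if Facebook is removed from the action: 15 percentage points out of a baseline of 35. Put differently, 0.15 / 0.35, or roughly 43% of all conversions, would be lost without Facebook. Markov attribution tools commonly compute this relative effect for every channel and then rescale the effects so that they sum to 100% before assigning credit to each channel.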
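The removal calculation can be sketched in Python as follows. As before, the transition probabilities come from the worked example, while the `Null` state and the `remove_channel` helper are assumptions made for illustration: removing a channel is modelled by sending every step into that channel to `Null` instead. The last line also expresses the drop relative to the baseline conversion probability, a normalisation that Markov attribution implementations commonly use.

```python
# Transition probabilities from the worked example; "Null" is an assumed
# label for journeys that end without a conversion.
TRANSITIONS = {
    "Start": {"Instagram": 0.4, "Facebook": 0.6},
    "Facebook": {"Instagram": 0.5, "Null": 0.5},
    "Instagram": {"Apple Website": 1.0},
    "Apple Website": {"Conversion": 0.5, "Null": 0.5},
}


def conversion_probability(transitions, state="Start"):
    """Sum the probabilities of every path from `state` to "Conversion" (no cycles)."""
    if state == "Conversion":
        return 1.0
    if state == "Null":
        return 0.0
    return sum(
        p * conversion_probability(transitions, nxt)
        for nxt, p in transitions.get(state, {}).items()
    )


def remove_channel(transitions, channel):
    """Return a copy of the chain where every step into `channel` goes to "Null"."""
    pruned = {}
    for state, nexts in transitions.items():
        if state == channel:
            continue  # the removed channel no longer exists as a state
        redirected = {}
        for nxt, p in nexts.items():
            target = "Null" if nxt == channel else nxt
            redirected[target] = redirected.get(target, 0.0) + p
        pruned[state] = redirected
    return pruned


base = conversion_probability(TRANSITIONS)
without_facebook = conversion_probability(remove_channel(TRANSITIONS, "Facebook"))
drop = base - without_facebook

print(round(base, 2), round(without_facebook, 2), round(drop, 2))  # 0.35 0.2 0.15
print(round(drop / base, 2))  # 0.43, the drop as a share of baseline conversions
```

Running the same calculation for each channel in turn gives a removal effect per channel, which can then be compared across channels.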
Analyzing the output of the Markov chain model will give you a “snapshot” of marketing channel effectiveness, at a specific point in time. You might be able to gain extra insights by looking at the model output for data just before and after a new marketing campaign launch, giving you essential information on how the campaign affected the performance of each channel.
I promised some code: you can see the full Python implementation of the above concept here
It’s important to keep in mind that while the data set in this example holds a sizeable volume of data, it only includes 4 marketing channels. In a real-world scenario, we’d likely be working with several times that number of channels (significantly more if we are applying a more granular model such as a campaign-specific attribution model), which increases the complexity of the typical user journey and the need for an attribution model that can handle this level of complexity.
Conclusion
Different marketing channel attribution approaches will fit different businesses. In this article, we’ve outlined a few possible ways to evaluate the effectiveness of your marketing budget. We’ve explored simple approaches that are fixed in the sense that they are not dependent on the structure of your data, which may lead to overconfidence. On the other hand, a Markov chain approach with removal effect will look to model channel attribution by accounting for how your user journey data is structured; though this approach is slightly more complex.
Assigning accurate credit to marketing channels can be a complex but rewarding task. Using the Markov Chain approach outlined in this article will allow your attributions to more accurately reflect how your users are interacting with your marketing.
By adding even more granularity and running daily attribution models, you could evaluate the relationship between PPC or other marketing dollars spent and channel contribution using correlation models.
While adding more complexity to the approach presented in this article could increase the value of the model outputs, the real business value will come from being able to interpret these quantitative model results and combine these with domain knowledge on your business and the strategic business initiatives that have produced your data.
Combining these model results with the knowledge of your business will allow you to best incorporate the model findings into future initiatives.
Marketing channel attribution can be a complex task, especially with consumers being reached by more marketing than ever. As technology advances and more channels become available to marketers, it becomes more important to identify precisely the channels that are driving the most ROI. For small businesses and entrepreneurs to stay competitive, marketing attribution is perhaps the most valuable marketing science concept they can use.
About the Author
I am Talish, a marketer, data scientist and photographer. I enjoy exploring concepts, ideas and techniques that help businesses communicate their stories and, in other words, make better products.
Currently, I lead marketing and data science for the technology startup Ekta Flow, where I work as a Senior Data Scientist for clients from the Department of Defence, healthcare, fraternities and other non-profits.