Marginal Contribution & Shapley Values
Since our article Better Attribution: Using Clickstream Data and Shapley Analysis to Get More Accurate CPA & ROAS, we have received some questions about marginal contribution and Shapley values.
Marginal Contribution
A straightforward way to understand marginal contribution is to consider how to allocate the cost of building a new runway among four aircraft that need different runway lengths.
Aircraft | Runway Required (units) |
A | 8 |
B | 11 |
C | 13 |
D | 18 |
Runway Needs
Aircraft D is the only one that needs the last 5 runway units.
Aircraft C and D are the only ones that need the 2 units before that.
B, C and D all need the 3 units before those.
All four need the first 8 units.
One way to allocate cost is to take the marginal cost (MC) for each segment and divide it by the number of beneficiaries.
Aircraft | A | +B | +C | +D |
MC | 8 | 3 | 2 | 5 |
# Aircraft benefitting | 4 | 3 | 2 | 1 |
Cost per aircraft | 2 | 1 | 1 | 5 |
Thus, the cost/value allocated to each aircraft is as follows:
Segment | A | +B | +C | +D |
Cost to A | 2 | | | |
Cost to B | 2 | 1 | | |
Cost to C | 2 | 1 | 1 | |
Cost to D | 2 | 1 | 1 | 5 |
The total of each row in the runway problem is an equitable way to assign runway cost to each aircraft.
Segment | A | +B | +C | +D | Assigned Cost |
Cost to A | 2 | | | | 2 |
Cost to B | 2 | 1 | | | 3 |
Cost to C | 2 | 1 | 1 | | 4 |
Cost to D | 2 | 1 | 1 | 5 | 9 |
Total | | | | | 18 |
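The segment-by-segment allocation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the tables; the function name `allocate_by_segment` is made up for this example, and it assumes each segment's cost equals its length in runway units.

```python
from fractions import Fraction

# Runway units each aircraft needs (from the Runway Needs table).
required = {"A": 8, "B": 11, "C": 13, "D": 18}

def allocate_by_segment(required):
    """Split each runway segment's cost equally among the aircraft that use it."""
    cost = {name: Fraction(0) for name in required}
    built = 0  # runway length already paid for
    # Walk the aircraft from the shortest to the longest requirement.
    for name, length in sorted(required.items(), key=lambda kv: kv[1]):
        segment = length - built
        # Every aircraft needing at least this length benefits from the segment.
        users = [n for n, r in required.items() if r >= length]
        for n in users:
            cost[n] += Fraction(segment, len(users))
        built = length
    return cost

allocation = allocate_by_segment(required)
print({name: float(c) for name, c in allocation.items()})
# {'A': 2.0, 'B': 3.0, 'C': 4.0, 'D': 9.0}
```

The four shares add up to 18, the full runway length.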
Shapley Value
A Shapley value is essentially the average of a participant's marginal cost or benefit, taken over every order in which the participants could join. Shapley values are normally used in scenarios where the players can participate in different orders.
Thus in a 4-player scenario, the following 24 (4!) permutations need to be considered when calculating the Shapley value:
A | B | C | D |
A | B | D | C |
A | C | B | D |
A | C | D | B |
A | D | B | C |
A | D | C | B |
B | A | C | D |
B | A | D | C |
B | C | A | D |
B | C | D | A |
B | D | A | C |
B | D | C | A |
C | A | B | D |
C | A | D | B |
C | B | A | D |
C | B | D | A |
C | D | A | B |
C | D | B | A |
D | A | B | C |
D | A | C | B |
D | B | A | C |
D | B | C | A |
D | C | A | B |
D | C | B | A |
In the runway scenario, rearranging the aircraft makes no sense.
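The orderings have no physical meaning for a runway, but as a check on the arithmetic, averaging each aircraft's marginal cost over all 24 orderings gives the same figures as the segment-by-segment allocation. The sketch below assumes that the cost of serving a group of aircraft is the longest runway any of them needs; `runway_cost` is a name introduced here purely for illustration.

```python
from fractions import Fraction
from itertools import permutations

required = {"A": 8, "B": 11, "C": 13, "D": 18}

def runway_cost(aircraft):
    """Assumed cost of a group: the longest runway any member needs."""
    return max((required[a] for a in aircraft), default=0)

orderings = list(permutations(required))  # 4! = 24 orderings
totals = {name: Fraction(0) for name in required}
for order in orderings:
    for i, name in enumerate(order):
        # Marginal cost of adding this aircraft to those already present.
        totals[name] += runway_cost(order[:i + 1]) - runway_cost(order[:i])

# Average the marginal costs over all orderings.
shapley = {name: total / len(orderings) for name, total in totals.items()}
print({name: float(v) for name, v in shapley.items()})
# {'A': 2.0, 'B': 3.0, 'C': 4.0, 'D': 9.0}
```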
The Glove Game
Order makes sense in many other cases, however.
Consider the following scenario where order matters:
You are drawing gloves one at a time from a box that holds 1 left glove (L) and 2 right gloves (R1 and R2).
As soon as you hold a left glove and a right glove, the game is won.
To work out how much each glove contributes to the win, the order of the draws matters.
The table below lists all six draw orders and, for each, the glove that completes the pair:
Glove 1 | Glove 2 | Glove 3 | Win Credit |
---|---|---|---|
L | R1 | R2 | R1 |
L | R2 | R1 | R2 |
R1 | L | R2 | L |
R1 | R2 | L | L |
R2 | L | R1 | L |
R2 | R1 | L | L |
The Shapley value of the Left glove is ⅔ (it gets credit for 4 out of the 6 wins), the Shapley value of each Right glove is ⅙ (each gets credit for 1 out of 6 wins).
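The same count can be reproduced with a short Python sketch. It assumes the win rule described above (a left glove plus at least one right glove); `has_pair` is a helper name chosen for this example.

```python
from fractions import Fraction
from itertools import permutations

gloves = ["L", "R1", "R2"]

def has_pair(drawn):
    """A win needs the left glove plus at least one right glove."""
    return "L" in drawn and any(g.startswith("R") for g in drawn)

credit = {g: 0 for g in gloves}
orderings = list(permutations(gloves))  # 3! = 6 draw orders
for order in orderings:
    for i, glove in enumerate(order):
        # The glove that turns a loss into a win gets the credit for this order.
        if has_pair(order[:i + 1]) and not has_pair(order[:i]):
            credit[glove] += 1

shapley = {g: Fraction(wins, len(orderings)) for g, wins in credit.items()}
print(credit)  # {'L': 4, 'R1': 1, 'R2': 1}
```

Dividing each count by the 6 orderings gives 2/3 for L and 1/6 for each right glove.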
Application to Online Advertising
The application to online advertising is straightforward.
Build a Table of Media Permutations
First, build a table with all of the orderings.
If a particular media type contributes more than once, treat each occurrence as a separate media type when building the table. To find out whether a media type contributes more than once, look at the source of the first pageview of each session and build a table of all the media types that brought a particular user to your site.
Media 1 | Media 2 | Media 3 |
L | R1 | R2 |
L | R2 | R1 |
R1 | L | R2 |
R1 | R2 | L |
R2 | L | R1 |
R2 | R1 | L |
Outcome Table
Next, figure out the outcome after each step, entering a "1" for a win.
Order | Outcome @ 1 | Outcome @ 2 | Outcome @ 3 |
L, R1, R2 | 0 | 1 | 1 |
L, R2, R1 | 0 | 1 | 1 |
R1, L, R2 | 0 | 1 | 1 |
R1, R2, L | 0 | 0 | 1 |
R2, L, R1 | 0 | 1 | 1 |
R2, R1, L | 0 | 0 | 1 |
[In advertising, there will normally be wins in column #1 and the number of wins will generally increase from left to right; since we are extending the glove game example, column #1 always has zero wins and the number of wins doesn't change in column #3 if there was a win in column #2. Please see our previous article for a more realistic table.]
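As a rough sketch, the ordering and outcome tables for this glove-game stand-in could be generated in Python as follows. The `is_win` rule is an assumption that only fits this toy example; with real advertising data, the outcome at each step would come from observed conversions rather than a rule.

```python
from itertools import permutations

media = ["L", "R1", "R2"]

def is_win(seen):
    """Glove-game stand-in for a conversion: L plus at least one R."""
    return "L" in seen and any(m.startswith("R") for m in seen)

# Map each ordering to its cumulative outcome after steps 1, 2 and 3.
outcomes = {}
for order in permutations(media):
    outcomes[order] = [int(is_win(order[:i + 1])) for i in range(len(order))]

for order, row in outcomes.items():
    print(", ".join(order), row)
```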
Marginal Contribution
Next, for every column after the first, subtract the value of the preceding column from the current column; for the first column, simply carry over the value.
Order | Marginal @ 1 | Marginal @ 2 | Marginal @ 3 |
L, R1, R2 | 0 | 1 | 0 |
L, R2, R1 | 0 | 1 | 0 |
R1, L, R2 | 0 | 1 | 0 |
R1, R2, L | 0 | 0 | 1 |
R2, L, R1 | 0 | 1 | 0 |
R2, R1, L | 0 | 0 | 1 |
If you are using Google Sheets, the formula for doing this looks like this: =ARRAYFORMULA(F14:F19-I14:I19)
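Outside a spreadsheet, the same differencing step might look like this in Python. The rows are copied from the outcome table, and `marginal` is a hypothetical helper name.

```python
# Cumulative outcomes per ordering, copied from the outcome table.
outcomes = [
    [0, 1, 1],  # L, R1, R2
    [0, 1, 1],  # L, R2, R1
    [0, 1, 1],  # R1, L, R2
    [0, 0, 1],  # R1, R2, L
    [0, 1, 1],  # R2, L, R1
    [0, 0, 1],  # R2, R1, L
]

def marginal(row):
    """Carry over the first value; each later value minus its predecessor."""
    return [row[0]] + [cur - prev for prev, cur in zip(row, row[1:])]

marginals = [marginal(row) for row in outcomes]
for row in marginals:
    print(row)
```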
Figuring Out the Value of Each Media Player in Each Row
Next, for each row, grab the value associated with each media type.
L | R1 | R2 |
0 | 1 | 0 |
0 | 0 | 1 |
1 | 0 | 0 |
1 | 0 | 0 |
1 | 0 | 0 |
1 | 0 | 0 |
If you are using Google Sheets, the formula for doing this looks like this: =SUMPRODUCT(($A14:$C14=L$13)*($I14:$K14))
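A Python equivalent of this lookup, offered as a sketch rather than a reference implementation, pairs each position in an ordering with its marginal contribution. The data is copied from the tables above, and `credit_by_media` is an illustrative name.

```python
media = ["L", "R1", "R2"]

# Orderings and their marginal contributions, copied from the tables above.
orders = [
    ("L", "R1", "R2"),
    ("L", "R2", "R1"),
    ("R1", "L", "R2"),
    ("R1", "R2", "L"),
    ("R2", "L", "R1"),
    ("R2", "R1", "L"),
]
marginals = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
]

def credit_by_media(order, row):
    """Sum the marginal contributions that fall on each media type in one ordering."""
    credit = {m: 0 for m in media}
    for m, value in zip(order, row):
        credit[m] += value
    return credit

credits = [credit_by_media(o, r) for o, r in zip(orders, marginals)]
for c in credits:
    print(c)
```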
Shapley Value
Finally, sum each column and divide by the number of rows -- this is the Shapley value.
L | R1 | R2 |
0 | 1 | 0 |
0 | 0 | 1 |
1 | 0 | 0 |
1 | 0 | 0 |
1 | 0 | 0 |
1 | 0 | 0 |
0.67 | 0.17 | 0.17 |
If you are using Google Sheets, the formula for doing this looks like this: =SUM(L14:L19)/ROWS(L14:L19)
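The final averaging step is short enough to sketch directly in Python. The per-row credits are copied from the table above.

```python
# Credit per media type in each ordering, copied from the table above.
credits = [
    {"L": 0, "R1": 1, "R2": 0},
    {"L": 0, "R1": 0, "R2": 1},
    {"L": 1, "R1": 0, "R2": 0},
    {"L": 1, "R1": 0, "R2": 0},
    {"L": 1, "R1": 0, "R2": 0},
    {"L": 1, "R1": 0, "R2": 0},
]

# Column sum divided by the number of rows.
shapley = {m: sum(row[m] for row in credits) / len(credits) for m in credits[0]}
print({m: round(v, 2) for m, v in shapley.items()})
# {'L': 0.67, 'R1': 0.17, 'R2': 0.17}
```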
Conclusion
Building a table of each ordering of media that contributes to a website conversion and then computing the marginal contribution at each step allows you to calculate the Shapley value of each media type.
Normalizing the Shapley values allows you to assign win/conversion percentages to each media type.
As shown in our previous article, doing this allows you to more accurately assign cost and value to your online efforts and will allow you to make better decisions about your marketing investments.