Hi everyone! Had an A/B testing question that I’m trying to wrap my head around and could use some help.
Let’s say you’re Airbnb and you add a new feature to your mobile app search results: a plus sign next to every listing returned in search that users can click for a quick preview of the listing (e.g., ratings, top 3 reviews, etc.). To book a listing you still need to click through to the search result and go through the full booking flow.
The hypothesis is that this new feature will drive more bookings per impression (bookings/impressions = success metric).
Let’s say the variant (which got this new feature) shows a significant positive lift in the success metric.
1. Is this data sufficient to conclude that the new ‘plus sign’ feature drove incremental conversion?
2. Do you ever take the winning variant and deep-dive to figure out how many users in it actually clicked the ‘plus sign’?
Hope my question was clear. Thank you for taking the time to respond!
You’ll need a very large sample size, since your success metric sits several steps downstream of the interaction, as compared to measuring “View Listing Detail” (or whatever you call it internally), which is an immediate effect.
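To get a feel for how large, here's a rough sample-size sketch for a two-proportion test. The baseline bookings/impression rate and the target lift below are made-up numbers for illustration, not Airbnb figures:

```python
# Rough sample-size sketch for detecting a lift in bookings/impression.
# Baseline rate and target lift are illustrative assumptions only.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.02                      # assumed bookings/impression in control
treated = baseline * 1.05            # assumed 5% relative lift we want to detect

effect = proportion_effectsize(treated, baseline)   # Cohen's h
n_per_arm = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.8, alternative="two-sided"
)
# With a low baseline rate and a small lift, this lands in the
# hundreds of thousands of impressions per arm.
print(f"~{n_per_arm:,.0f} impressions per arm")
```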
For causal inference, I’d suggest running the test with a control and 2 variants:
Control A: Nothing
Variant B: Plus sign (no preview) - just track user interactions with it. Compare total bookings/impression against control. Also stratify bookings/impression by users that clicked vs. those that did not.
Variant C: Plus sign (with preview) - again, stratify bookings/impression by users that clicked vs. those that did not.
B vs A will give you the pure causal effect of just the plus sign on your success metric. The C vs B stratified comparison will tell you whether the preview leads to additional bookings. If C vs B shows that additional bookings come in due to the preview, then you know which version of the plus sign you want to launch.
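Here's a minimal sketch of what those comparisons could look like with a two-proportion z-test. All counts are placeholders, and the clicked/not-clicked split assumes you log plus-sign clicks per user:

```python
# Sketch of the B-vs-A and C-vs-B comparisons; counts are placeholders.
from statsmodels.stats.proportion import proportions_ztest

def compare(name, bookings, impressions):
    """bookings and impressions are (group1, group2) pairs."""
    stat, pval = proportions_ztest(count=bookings, nobs=impressions)
    rates = [b / n for b, n in zip(bookings, impressions)]
    print(f"{name}: {rates[0]:.4f} vs {rates[1]:.4f}, p={pval:.4f}")

# B vs A: pure effect of showing the plus sign at all
compare("B vs A", bookings=[2150, 2000], impressions=[100_000, 100_000])

# C vs B: does the preview add bookings on top of the plus sign?
compare("C vs B", bookings=[2300, 2150], impressions=[100_000, 100_000])

# Stratified view inside C: clickers vs. non-clickers. Note this slice is
# descriptive only -- clicking is self-selected, so it isn't itself causal.
compare("C clicked vs not", bookings=[800, 1500], impressions=[20_000, 80_000])
```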
Note: with 2 treatment-vs-control comparisons, your alpha should be split in half (see the Bonferroni correction for a simple approach) to keep the probability of false positives under control.
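For example (a sketch with the usual default thresholds and placeholder p-values):

```python
# Bonferroni: with 2 comparisons, test each one at alpha / 2.
alpha = 0.05
n_comparisons = 2                       # e.g., B vs A and C vs B
adjusted_alpha = alpha / n_comparisons  # 0.025 per comparison

p_values = [0.012, 0.031]               # placeholder p-values from the two tests
for i, p in enumerate(p_values, start=1):
    verdict = "significant" if p < adjusted_alpha else "not significant"
    print(f"comparison {i}: p={p} -> {verdict} at alpha={adjusted_alpha}")
```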