Tech Industry Nov 29, 2019
Amazon ffjjddb

ML design (how to select people to give a discount code?)

Failed the ML rounds with this... (got feedback that my ML design was too weak).

Q: Same as the title. You can think of any service, such as Lyft, Uber, Amazon, eBay, etc.
Limits: a fixed number of coupons or codes (10,000).
Target: users who can boost our revenue (giving a discount coupon is not only for the customer but for the company too).
Assume the service just launched, with no previous record of coupons distributed.

To briefly summarize what I said:
0. This is a binary classification (good for us, not good for us). I will get a probability score and give coupons to the top 10,000.
0-1. Generate the label: 1 for each person if their total spend > avg(total spend across users), else 0.
1. I would collect time series data (how frequently they use the service, the amount they generated per week, month, and year).
2. Look at the distribution of each feature and try log-transforming them. If they are still not close to normal, I would try a tree-based algorithm such as CatBoost, since tree-based models are strong due to their nature of selecting random combinations of features.
3. Set this as a benchmark, then try LSTM, ensembles, etc.
4. I did mention LSTM and ensemble details like a classic ML 101 (strengths, how to deal with overfitting).

Any pointers or harsh criticism? Hope I can learn and improve during the holidays! Happy Thanksgiving to all.
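For reference, the labelling rule (step 0-1) and the top-10,000 selection (step 0) described above can be sketched roughly as follows. This is a minimal illustration, not a recommended design: the user ids, spend figures, and scores are made-up toy values, and `scores` merely stands in for the output of whatever classifier is trained.

```python
from statistics import mean

def make_labels(total_spend):
    """Label a user 1 if their total spend exceeds the mean over all users, else 0."""
    avg = mean(total_spend.values())
    return {user: int(spend > avg) for user, spend in total_spend.items()}

def top_k_users(scores, k):
    """Return the k user ids with the highest predicted score, best first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]

# Hypothetical toy data: total spend per user (mean is 50.0).
total_spend = {"u1": 120.0, "u2": 15.0, "u3": 60.0, "u4": 5.0}
labels = make_labels(total_spend)

# Stand-in for model probabilities; a real system would get these from a classifier.
scores = {"u1": 0.91, "u2": 0.40, "u3": 0.75, "u4": 0.10}
selected = top_k_users(scores, k=2)  # k would be 10,000 in the interview question
```

Note that this label only captures who already spends a lot; it says nothing about whether a coupon changes behavior, which is what several of the replies below focus on.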

Cisco xhzhxh Nov 29, 2019

Coupons for everyone. Sales while supplies last. Generally this is an artificially constrained problem.

Walmart serjorah Nov 29, 2019

What’s your yoe?

Amazon ffjjddb OP Nov 29, 2019

Was a new grad and now 0.3 lol...

Microsoft rusnir7 Nov 29, 2019

Should you give coupons to your bigger spenders or your smaller spenders? One would think you want to get people hooked, so offer coupons to reel them in. How would you test the efficacy of the coupon outreach? Let's say you do this yearly. How can you design coupon distribution so that you can reliably, without bias, measure the outcome of your intervention and the variables of the user profile that contributed to it? Randomization, causality, and so on.

Other than historical interactions on the service, what other modeling, feature extraction, or data compression can you do here? Recommender-system-style ideas, maybe? Clusters? Attribute information? In the absence of historical data, how can you bootstrap with other features for making your prediction? Etc.

This is off the top of my head, a couple of minutes. IMO you made a good start, but depending on yoe you can be expected to add more nuance.
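One simple way to get an unbiased measurement, sketched here under assumptions, is to hand out at least part of the coupon budget at random and keep everyone else as a control group. The function name `assign_coupons`, the user ids, and the group sizes below are all hypothetical.

```python
import random

def assign_coupons(user_ids, n_coupons, seed=0):
    """Randomly split users into a coupon (treatment) group and a control group."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    treated = set(shuffled[:n_coupons])
    control = set(shuffled[n_coupons:])
    return treated, control

# Hypothetical population of 100 users with a budget of 20 coupons.
users = [f"user_{i}" for i in range(100)]
treated, control = assign_coupons(users, n_coupons=20)
```

Because assignment does not depend on any user attribute, differences in later spend between the two groups can be attributed to the coupon rather than to who was chosen.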

Wayfair cdpU22 Nov 29, 2019

You need to calculate the LIFT!

Wayfair cdpU22 Nov 29, 2019

lift = Pr(buy | coupon) - Pr(buy | no coupon)
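As a rough sketch, this difference can be estimated from a randomized test by comparing purchase rates in the two groups. The outcome lists below are invented toy data (1 = bought, 0 = did not buy), and the helper names are hypothetical.

```python
def purchase_rate(outcomes):
    """Fraction of users in a group who made a purchase."""
    return sum(outcomes) / len(outcomes)

def lift(treated_outcomes, control_outcomes):
    """Estimate Pr(buy | coupon) - Pr(buy | no coupon) from two groups."""
    return purchase_rate(treated_outcomes) - purchase_rate(control_outcomes)

# Toy outcomes: 7 of 10 coupon users bought, 3 of 10 control users bought.
treated_outcomes = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
control_outcomes = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
estimated_lift = lift(treated_outcomes, control_outcomes)  # about 0.4
```

The estimate is only meaningful if the two groups were assigned at random; otherwise it mixes the coupon effect with whatever drove the selection.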

Amazon ffjjddb OP Nov 29, 2019

I was wondering what that is, and thanks for the clarification. Do you suggest this as one of the features? Or as Y itself, and then set the problem up as a regression? If so, then set a threshold and give coupons to those whose lift score is above the threshold..? Totally dumb here..

Lyft snake a Nov 30, 2019

You need to talk about the signal, define your goal, and then talk about the ML model. Not build up to it, but actually talk about what's a good model based on what's used in prod nowadays (get that info from KDD talks). Then you need to talk about tuning hyperparameters via cross-validation and explain your loss function, etc. Then you probably want to talk about data normalization, and choosing and tuning features with gradient boosting or something.

At this point, if your interviewer is happy with the model, you talk about the larger architecture picture: offline and online training, how to reduce inference latency, how you store models and different functions, and how you want to avoid data drift. Afterwards, A/B testing and deployment frameworks.

Caveat: not working on ML at Lyft, but I have passed ML interviews at other companies previously.

Salesforce ;9;8382)( Nov 30, 2019

I would ask follow-up questions: what data is stored? Have coupons been used before? Knowing this will determine whether you have lots of coupon customer data to train on or not. In case coupons are a fresh idea for the company, you need to think differently. If you don't have coupon training data, can you leverage shopping data from sales seasons such as Thanksgiving to identify users with a spike in buying behavior?

Now you need to build a model that infers E(revenue to company | user X is given coupon). For this you have to factor in the number of purchases user X makes, how much money they save for themselves, the user's past purchases, etc., in addition to everything else, as features. You now have the info to build a supervised model.

You need to think about metrics: A/B testing, etc. How will the system adapt to new information? How will the system adapt to users of different types? A one-size-fits-all model rarely generalizes well, so user clustering might be a choice here.

Now you need to build a system-level ML block diagram for training. Then go into algorithmic choices. For inference, do you need real-time inference? What are the scaling challenges you might encounter here? Talk about all this, and if you have ideas on what libraries you'll use to implement it, go there as well.

Again, I don't know what the interviewer is looking for; this is just how I'd approach it..
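To make the E(revenue | coupon) and clustering ideas concrete, here is a deliberately simplified sketch that uses per-segment averages instead of a fitted supervised model. It assumes some past data exists where coupons were given at random, that revenue is already net of the discount, and that users have been placed into segments; the segment names and numbers are invented.

```python
from collections import defaultdict

def mean_revenue_by_group(records):
    """records: iterable of (segment, got_coupon, revenue).
    Returns {(segment, got_coupon): mean revenue}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for segment, got_coupon, revenue in records:
        totals[(segment, got_coupon)] += revenue
        counts[(segment, got_coupon)] += 1
    return {key: totals[key] / counts[key] for key in totals}

def incremental_revenue(records):
    """Per segment: mean revenue with a coupon minus mean revenue without one."""
    means = mean_revenue_by_group(records)
    segments = {segment for segment, _ in means}
    return {
        seg: means[(seg, True)] - means[(seg, False)]
        for seg in segments
        if (seg, True) in means and (seg, False) in means
    }

# Toy records: (segment, got_coupon, revenue net of discount).
records = [
    ("new", True, 30.0), ("new", True, 50.0),
    ("new", False, 10.0), ("new", False, 20.0),
    ("loyal", True, 100.0), ("loyal", True, 110.0),
    ("loyal", False, 100.0), ("loyal", False, 104.0),
]
uplift = incremental_revenue(records)
# Segments ordered by how much extra revenue a coupon appears to bring in.
ranked_segments = sorted(uplift, key=uplift.get, reverse=True)
```

With a fixed coupon budget, one would fill it starting from the segment with the largest estimated increment; a per-user model with richer features would replace the segment averages in a real design.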