I recently had an MLE interview (E4). For the ML design round, the feedback I got was borderline (weak, risky to hire), but I have no idea how badly I did, so I would like to ask Blinders where I can improve. Below is, in detail, how I answered during the interview (my actual answer covered everything listed here).

Question: Design a video recommendation engine.

I followed exactly this paper: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45530.pdf

I cast the design as a classification problem.

Data collection: The interviewer said there is no query data.
- Positive label: videos that the user watched or hovered over for more than half of the video's duration
- Negative label: any video that does not satisfy the positive-label condition

Feature engineering: video embedding + average / user profile (age, nationality, etc., plus some more based on Twitter engineering blogs and papers) / video genre / # of likes on the video / # of comments / video statistics (who watched "the" video in the past). I went through how each of these can be converted to feed into a model.

Model: As mentioned above, I followed the Google paper exactly (DNN).

Training: negative sampling for faster training.

Inference: local selective search (approximate KNN) to compute the similarity between the user embedding and the video embeddings, then sort by highest score for ranking. Exactly the same way as the paper.

Offline metrics:
- Accuracy metric: how many videos from the RecSys output were watched by the user
- nDCG for ranking score: I defined the relevance score as a weighted sum of # of likes / # of dislikes / # of comments, explained it in detail, and discussed the differences going from Gain to DCG to nDCG

Online metrics: A/B testing and monitoring
- User active session time
- Revenue increase from ads in videos
- Average user watch time

Feedback from the recruiter was that every aspect was not enough for E4; E4 is expected to do better:
1. Data collection / feature engineering have a gap from the real world
2. Model is not well structured and has a gap from the real world (no idea why; I memorized the paper and explained it in even more detail, e.g. how approximate KNN works)
3. Offline / online metrics have a gap from the real world

I have no idea how to improve. I think I went through more than 50% of the RecSys materials in the world, except research SOTA papers. Any comments on where I did badly and how to improve would be appreciated.

Tax: 190k
YOE: 3
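For readers following along, the Gain to DCG to nDCG progression mentioned in the post can be sketched roughly as below. This is only an illustration: the `relevance` weights are hypothetical (the post does not say which weights were used), and clipping negative relevance to zero is an assumption made here so that the ideal DCG stays well defined.

```python
import math

def relevance(likes, dislikes, comments,
              w_like=1.0, w_dislike=-1.0, w_comment=0.5):
    """Hypothetical relevance: weighted sum of engagement counts, clipped at 0."""
    return max(0.0, w_like * likes + w_dislike * dislikes + w_comment * comments)

def dcg(rels):
    """Discounted cumulative gain: each gain is discounted by log2(rank + 1)."""
    # enumerate() is 0-based, so rank = i + 1 and the discount is log2(i + 2).
    return sum(r / math.log2(i + 2) for i, r in enumerate(rels))

def ndcg(rels, k=None):
    """DCG of the ranking as served, divided by the DCG of the ideal ordering."""
    k = len(rels) if k is None else k
    ideal = dcg(sorted(rels, reverse=True)[:k])
    return dcg(rels[:k]) / ideal if ideal > 0 else 0.0

# Relevance of each recommended video, in the order the system ranked them.
ranked = [relevance(10, 1, 4), relevance(50, 2, 10), relevance(0, 3, 0)]
score = ndcg(ranked)  # below 1.0 because the most relevant video is ranked second
```

Gain is the per-item relevance, DCG adds the position discount, and nDCG normalizes by the best achievable ordering so that scores are comparable across users with different numbers of relevant videos.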
While talking about the design, did you ask the interviewer what he was looking for? I interviewed for MLE at E5. I kept asking the interviewer what he was looking for and steered my discussion in that direction. In fact, I did not draw any diagrams; we just kept talking as if we were colleagues.
Exactly the same experience. I asked for his assumptions and went through each area the interviewer wanted to touch on, only moving on when he agreed. I was attempting to draw a model and the interviewer rejected my suggestion.
Looks like you vomited your preset answer rather than asking the interviewer what he was looking for.
I wish it had been as you said. It was a conversation: I asked for his assumptions and he told me which area to focus on, like "Let's talk about data collection first", then on to feature engineering, etc. During the discussion he would ask, "Any further ideas on data collection? If not, let's move on." If I had just thrown out whatever I knew without communicating, I would accept that criticism.
What's your yoe op?
Forget about it OP. Try some other company in 6 months. Sometimes the interviews are noisy.
I think one thing you can improve on is discussing different approaches + trade-offs. There is no one right answer for recommender systems. There are many design paradigms. You can discuss the retrieval phase + ranking. Why would you use pointwise loss vs pairwise and listwise? For metrics I think you did well, I am not sure what you can add. Maybe more involved evaluation taking into account bias in data. From what I understood from this type of interview, it is all about giving a structured answer while keeping in mind all the different possibilities and trade-offs.
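To make the pointwise vs pairwise vs listwise question concrete, here is a minimal sketch of one common form of each loss. The function names are made up for illustration, and the specific choices (binary cross-entropy, a logistic pairwise loss in the style of BPR/RankNet, and softmax cross-entropy over a candidate list) are just typical examples, not the only options.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def pointwise_loss(score, label):
    """Binary cross-entropy on one (user, video) pair; label is 0 or 1."""
    p = sigmoid(score)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def pairwise_loss(pos_score, neg_score):
    """Logistic pairwise loss: only the score gap between the two items matters."""
    return -math.log(sigmoid(pos_score - neg_score))

def listwise_loss(scores, pos_index):
    """Softmax cross-entropy over a candidate list with one positive item."""
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[pos_index]

# Same model scores, three different training signals.
print(pointwise_loss(2.0, 1))              # is this video a positive?
print(pairwise_loss(2.0, 0.5))             # is the watched video above the skipped one?
print(listwise_loss([2.0, 0.5, -1.0], 0))  # is the watched video on top of the list?
```

The trade-off to discuss is roughly this: pointwise losses give calibrated per-item probabilities but ignore relative order, pairwise losses optimize order but need sampled pairs, and listwise losses match ranking metrics more closely at the cost of scoring a whole candidate list per example.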
Probably improve on positive label definition.
Did you talk about the two-stage setup: candidate generation + ranking? You can go fancy with reranking, etc.
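A rough sketch of what the two stages look like in code, under some loud assumptions: retrieval here is a brute-force dot-product top-k standing in for an approximate KNN index, `score_fn` is a hypothetical placeholder for a trained ranking model, and the video IDs and vectors are made-up toy data.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def retrieve(user_vec, video_vecs, k):
    """Stage 1 (candidate generation): cheap similarity search over all videos.
    Brute force here; a real system would use an approximate KNN index."""
    by_similarity = sorted(video_vecs,
                           key=lambda vid: dot(user_vec, video_vecs[vid]),
                           reverse=True)
    return by_similarity[:k]

def rank(candidates, score_fn):
    """Stage 2 (ranking): a heavier model scores only the retrieved candidates."""
    return sorted(candidates, key=score_fn, reverse=True)

def recommend(user_vec, video_vecs, score_fn, k_retrieve=100, k_final=10):
    return rank(retrieve(user_vec, video_vecs, k_retrieve), score_fn)[:k_final]

# Toy data: 2-d embeddings and a lookup table playing the role of the ranker.
video_vecs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}
ranker_scores = {"a": 0.2, "b": 0.9, "c": 0.8}
top = recommend([1.0, 0.0], video_vecs, ranker_scores.get,
                k_retrieve=2, k_final=1)
print(top)  # ['c']: "b" never reaches the ranker because retrieval dropped it
```

The point worth raising in an interview is the division of labor: the first stage trades precision for speed over the whole catalog, and the second stage can afford richer features because it only sees a few hundred candidates.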
And how you train the embeddings.
Move on, just bad luck this time. This is already good for E4
Mentally I have moved on, but I just want to know what I did wrong so that I can "improve and do better next time", because I hardly see any materials that cover more than my answers. Any tips would be appreciated.