Misc. Jun 26, 2018
NerdBlonde

Sharding for HOT users

Suppose we have to design Instagram, and we shard based on user ID. How do we handle HOT users like Brad Pitt with 100s of followers? PS: Practicing System Design for Google.

jtjw20 Jun 26, 2018

Projections

Yahoo 7331er Jun 26, 2018

caching

Oracle Madmaxf Jun 26, 2018

Rejected

Yahoo 7331er Jun 26, 2018

now we know why oracle is going down

Uber lLjv00 Jun 26, 2018

Google won't ask about Instagram; instead they'll ask about designing Google Plus Photos, where the hottest person on there really is Brad Pitt and he literally only has 100s (not millions) of followers on their clone app... With that out of the way, there are tons of features for this app to design, but it sounds like you are interested in alerting users whenever Brad posts a new photo. I doubt all of Brad's followers use the app all the time. Only the most active users will really care to see his posts; everyone else who follows him logs in maybe once a day, once a month, once a year, or never. On top of that, even those top users will only care about the most recent posts, so a post from a day ago becomes irrelevant if he's posted 10 times in the past hour. So one approach, in the case of a hottie like Mr. Pitt, could be to make the app really fast for top users and for the most recent photos, and slow for everyone else and for older posts. Reading from a cache is really fast, but storing data in RAM is expensive. So this becomes a problem of deciding whose news feeds are cached and whose aren't, and which posts are cached and which aren't.
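
To make the "whose feeds are cached, and which posts" question concrete, here is a rough Python sketch of one possible policy: only followers seen within an activity window get the post pushed into an in-memory feed, and each cached feed keeps just the newest few posts. The class name `FeedCache`, its methods, and the window/size parameters are all hypothetical, picked only for illustration; a real system would more likely sit on something like Redis or Memcached than on a Python dict.

```python
from collections import deque


class FeedCache:
    """Toy in-memory feed cache (hypothetical API, illustration only)."""

    def __init__(self, active_window_secs, max_posts_per_feed):
        self.active_window_secs = active_window_secs
        self.max_posts_per_feed = max_posts_per_feed
        self.last_seen = {}  # user_id -> timestamp of last login
        self.feeds = {}      # user_id -> deque of post ids, newest first

    def record_login(self, user_id, now):
        self.last_seen[user_id] = now

    def is_active(self, user_id, now):
        seen = self.last_seen.get(user_id)
        return seen is not None and now - seen <= self.active_window_secs

    def push_post(self, follower_ids, post_id, now):
        """Push a new post only into the cached feeds of active followers."""
        for uid in follower_ids:
            if self.is_active(uid, now):
                feed = self.feeds.setdefault(
                    uid, deque(maxlen=self.max_posts_per_feed)
                )
                # deque(maxlen=...) silently drops the oldest post when full.
                feed.appendleft(post_id)
            else:
                # Inactive followers lose their cached feed; it would be
                # rebuilt from the database on their next login.
                self.feeds.pop(uid, None)

    def get_feed(self, user_id):
        """Return the cached feed, or None on a cache miss."""
        feed = self.feeds.get(user_id)
        return list(feed) if feed is not None else None
```

A `None` from `get_feed` stands for the slow path: the occasional visitor pays for a database read, while the active users who actually look at Brad's latest photos are served from memory.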

Netflix yesottoman Jun 26, 2018

replicas

NerdBlonde OP Jun 26, 2018

Thanks!

Oracle nogo Jun 26, 2018

Going through company blogs helps. Also, this is a good resource: https://github.com/checkcheckzz/system-design-interview

Amazon G00G-BOS Jun 26, 2018

If Brad Pitt is a relatively infrequent poster, then I’m not sure what you need beyond a simple write-through cache for holding post metadata, with the image itself being cached at various levels (CDN, server memory, load from S3). Maybe you could have a fancier cache to always cache posts of popular users, but there’s something to be said about how popular a user really is if he can’t even stay in your LRU cache. If you need to send notifications to all his followers or something, that’s a bit harder, assuming there are maybe 2 million of them. Normally, you’d queue up a notification saying user X has a new post, and some worker picks it up and sends push notifications to all the user’s followers, but I’m not quite sure how a prod system would work. Maybe there’s yet another layer of indirection, where reasonably sized sets of notifications are bundled up and sent out to another cluster.
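
The "bundle reasonably sized sets of notifications" idea could look roughly like the Python sketch below. It is only a guess at the shape of such a system, not how any production one works: the standard-library `queue.Queue` stands in for a real message broker, and `enqueue_fanout`, `drain`, the job dictionary layout, and the `send_push` callback are all made-up names.

```python
from itertools import islice
from queue import Queue


def chunked(iterable, size):
    """Yield consecutive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def enqueue_fanout(queue, author_id, post_id, follower_ids, batch_size):
    """Split one 'new post' event into bounded jobs; return the job count."""
    jobs = 0
    for batch in chunked(follower_ids, batch_size):
        queue.put({"author": author_id, "post": post_id, "recipients": batch})
        jobs += 1
    return jobs


def drain(queue, send_push):
    """Worker loop: send one push per recipient of every queued job."""
    sent = 0
    while not queue.empty():
        job = queue.get()
        for uid in job["recipients"]:
            # send_push is a hypothetical callback wrapping the push provider.
            send_push(uid, job["author"], job["post"])
            sent += 1
        queue.task_done()
    return sent
```

With 2 million followers and a batch size of 1,000, a single post becomes 2,000 independent jobs, so no one worker has to walk the whole follower list and a failed batch can be retried without redoing the rest.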

NerdBlonde OP Jun 26, 2018

Thanks a lot everyone!