Giving back - how I cleared L6 System Design - Part 2

May 4, 2021 112 Comments

Part 3 is out: https://www.teamblind.com/post/qubF6fS2

This is part 2 of a series. Here is part 1 which has a long intro: https://www.teamblind.com/post/Giving-back---how-I-cleared-L6-System-Design---Part-1-4yufM3RY

0. Preface

Let me touch on some points from the flood of comments on Part 1:

Is Nathan Bronson a porn star? I'm not sure if Nathan Bronson of Facebook has a side gig but I meant the other non-porn star Bronson.

YOE? 10. Don't let YOE hold you back though; I've somehow always been the youngest of my peers at something. Age? Early 30s.

Did I have coding rounds? Of course. I will write about that, but it will have to wait. There are plenty of hints on Blind on how to LeetCode, but I think the shortage is in SDI content, so I want to plug that gap first.

How did I manage the time? I had a method which I'll share eventually.

---- end preface ----

1. Back of the envelope calculations (BOTEC)

You expected something else? Sit down... I thought hard about it, and I think this is something many people secretly fear and feel weak at. Since my goal in this effort is to genuinely help people, I think this is where my contribution is most valuable.

You might ask: why do we even need BOTEC? The short version: to size tiers. What? Size? Tiers? Sizing as a verb here means estimating how many machines/disks/etc. you will need. A tier in this context is the typical tier in any system design. For example, a logging/counting system can have three tiers: 1. the collection tier, 2. the aggregation tier, and 3. the storage tier.

Another way you can use BOTEC is to figure out whether something can fit on one machine or not, mostly memory-wise. Most of the time it is self-evident, but 10 seconds of BOTEC should confirm it.

Example: how do you serve the timelines of 1 million people efficiently? Well, how many posts do you expect to have readily available per person? Let's say 10. OK. If we store the IDs only (say, 64-bit), we can use a native Redis list to store each timeline's post IDs:

8 bytes/post * 10 posts/timeline = 80 bytes/timeline (post goes away)
1 million timelines * 80 bytes/timeline = 10^6 * 8 * 10^1 = 8 * 10^7 bytes (timeline goes away)

So 8 * 10^7 is 80 million bytes, i.e. 80 MB; that easily fits in one machine's RAM.

This was an easy example, but note two things. First, always drag the units with you (bytes/post, posts/timeline) and cancel them when possible; it's extremely helpful for not getting lost. Second, always work in powers of 10 so you don't miss a zero here and a zero there.
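To make the unit-dragging concrete, here's the same arithmetic as a tiny Python sketch (the constants mirror the example above; the unit cancellation lives in the comments):

```python
# BOTEC: can 1 million timelines fit in one machine's RAM?
BYTES_PER_POST = 8           # 64-bit post ID -> 8 bytes/post
POSTS_PER_TIMELINE = 10      # posts/timeline
TIMELINES = 10**6            # timelines

# bytes/post * posts/timeline -> bytes/timeline (post goes away)
bytes_per_timeline = BYTES_PER_POST * POSTS_PER_TIMELINE   # 80 bytes/timeline

# timelines * bytes/timeline -> bytes (timeline goes away)
total_bytes = TIMELINES * bytes_per_timeline               # 8 * 10^7 bytes

print(f"{total_bytes / 10**6:.0f} MB")   # 80 MB: fits easily in RAM
```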

Here's the real-life version of that:
- 64-bit ID per post - 8 bytes/post
- 800 posts/timeline
- 500 million timelines
- replication factor of 3

800 posts/timeline * 8 bytes/post = (post goes away) 8 * 10^2 * 8 bytes/timeline = 64 * 10^2 bytes/timeline

500 million timelines * 64 * 10^2 bytes/timeline = (timeline goes away) 5 * 10^8 * 64 * 10^2 bytes = 5 * 64 * 10^(8+2) bytes = 320 * 10^10 bytes = 3.2 * 10^12 bytes ≈ 3 TB

3 TB * 3 (unitless replication factor) ≈ 10 TB. Considering 64 GB of RAM per machine, of which about 50 GB can be considered usable, you get 10 TB / (50 GB/machine) = 10 * 10^12 bytes / (5 * 10^10 bytes/machine) = 2 * 10^2 = 200 machines (bytes goes away; 1 over 1-over-machine becomes machine).
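The same sizing sketched in Python as a sanity check (64 GB machines with ~50 GB usable, as above; the exact quotient is 192, which the powers-of-10 rounding turns into ~200):

```python
import math

BYTES_PER_POST = 8                  # 64-bit ID
POSTS_PER_TIMELINE = 800
TIMELINES = 500 * 10**6
REPLICATION_FACTOR = 3              # unitless
USABLE_RAM = 50 * 10**9             # ~50 GB usable out of 64 GB/machine

raw_bytes = BYTES_PER_POST * POSTS_PER_TIMELINE * TIMELINES   # 3.2 * 10^12 ~ 3 TB
replicated = raw_bytes * REPLICATION_FACTOR                   # 9.6 * 10^12 ~ 10 TB
machines = math.ceil(replicated / USABLE_RAM)                 # 192, call it ~200

print(raw_bytes, replicated, machines)
```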

The temptation here is to stop being pedantic with units but I suggest you don't. These can get messy so stick to it.

This tier can be considered to be memory-bound.

Look up Raffi Krikorian's "Timelines at Scale" talk on InfoQ to hear him discuss this at more length.

Usually you have different tiers with different scaling mechanisms. Do your calculations early to get a feel for what the bounding factor will be, then plan ahead to scale it accordingly.

Here's an example of a QPS-bound tier. You're told you'll have to log 100 billion events per day. OK. That sounds like a lot. Let's turn that into per-second numbers. Is that a calculator you took out? *takes it and throws it out the window*

100 * 10^9 events/day / 86,400 seconds/day
Round to a convenient number:
100 * 10^9 events/day / 80,000 seconds/day =
10^11 / (8 * 10^4) events/second (day goes away) = 1/8 * 10^7 events/sec = 10/8 million QPS ≈ 1.25 million QPS

This looks to be a QPS bound tier. To size it, divide it by some (numerically convenient) number that can be handled by one machine and you get the number of machines.
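The same conversion and sizing in code. The 10k QPS/machine capacity here is my illustrative assumption, not a number from the post; plug in whatever one machine in your stack can actually handle:

```python
import math

EVENTS_PER_DAY = 100 * 10**9
SECONDS_PER_DAY = 86_400          # rounded to 80,000 above for mental math

qps = EVENTS_PER_DAY / SECONDS_PER_DAY   # ~1.16M exact; ~1.25M with the rounding
QPS_PER_MACHINE = 10_000                 # assumed single-machine capacity
machines = math.ceil(qps / QPS_PER_MACHINE)

print(f"{qps / 10**6:.2f}M QPS -> {machines} machines")
```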

The same goes for storage sizing. Keep in mind the replication factor and the amount of time you will be storing data for. Look up DataStax's capacity planning docs for Cassandra numbers; they're super useful.
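A storage-sizing sketch with retention and replication folded in. The event size, retention window, and ~1.5 TB/node density are all illustrative assumptions for the logging example, not numbers from the post:

```python
import math

QPS = 1.25 * 10**6              # from the logging example above
BYTES_PER_EVENT = 100           # assumed average event size
RETENTION_DAYS = 30             # assumed retention window
REPLICATION_FACTOR = 3
SECONDS_PER_DAY = 86_400
BYTES_PER_NODE = 1.5 * 10**12   # ~1.5 TB/node, a typical Cassandra density

total_bytes = (QPS * BYTES_PER_EVENT * SECONDS_PER_DAY
               * RETENTION_DAYS * REPLICATION_FACTOR)
nodes = math.ceil(total_bytes / BYTES_PER_NODE)

print(f"{total_bytes / 10**15:.2f} PB -> {nodes} nodes")
```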

And last but not least, here are some numbers I used to reference quite often. They're taken from various sources, and I used them without causing any dropped jaws, so they should be safe.

- Compress 1 KB with Zippy - 0.003 ms
- Send 1 KB over a 1 Gbps link - 0.01 ms
- Read 4 MB sequentially from memory - 1 ms
- Read 4 MB sequentially from SSD - 20 ms
- Read 4 MB sequentially from HDD - 80 ms
- One disk seek - 10 ms
- Inter-datacenter round trip - 150 ms
- Inter-datacenter bandwidth - ~100 Gbps
- Video is roughly 10 MB/minute
- A server can have 100-150 GB of usable RAM
- Cassandra is best used with 1-2 TB of storage per node
- A 1 Gbps link can handle a max of 125 MB/s
- A Cassandra cluster can handle 10,000-15,000 reqs/s per node and grows linearly with the number of nodes
- 720p video - 1.5 Mbps
- 360p video - 150 Kbps
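A couple of quick figures derived from the cheat sheet above, as a sketch (the helper names and functions are mine, not from any library):

```python
# Derived from the cheat-sheet numbers above
LINK_MBPS = 1000                 # 1 Gbps
STREAM_720P_MBPS = 1.5
VIDEO_MB_PER_MINUTE = 10

def max_720p_streams(link_mbps: float = LINK_MBPS) -> int:
    """Upper bound on concurrent 720p streams one link can carry."""
    return int(link_mbps / STREAM_720P_MBPS)

def video_storage_mb(minutes: float) -> float:
    """Rough storage for `minutes` of video at ~10 MB/minute."""
    return minutes * VIDEO_MB_PER_MINUTE

print(max_720p_streams())        # ~666 streams per 1 Gbps link
print(video_storage_mb(60))      # 600 MB per hour of video
```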

End of part 2
#SystemDesign #E6 #L6

Comments

  • Facebook
    BillYindzn
    Thanks so much for this OP. The value of what you're sharing here can't be overstated; it's going to help so many people.

    I would love to know a bit more about your study / retention strategies. You mentioned using Evernote in the last post; how did you organize your knowledge?

    Did you review your notes frequently?

    Were you practicing somehow to keep the ideas fresh?
    May 4, 2021 2
  • I think there is a typo in the calculation; there must be 200 machines:
    10 TB / (50GB/machine) = 10* 10^12 Bytes /(50*10^9) Bytes/machine= 2 *10^2 =200 Machines
    May 4, 2021 3
  • Amazon
    IAmZoned
    Thanks a lot for putting both the posts together. Would you mind sharing a structured (non-negotiable, not-to-be-skipped) list for your approach? The reason I am asking is that reading about SDI seems to be an endless exercise. For example, from the limited knowledge I have, the way I see it there are multiple aspects to it:

    * Distributed systems basics (most of which might be covered in DDIA) - anything else you'd suggest?
    * Advanced concepts for distributed systems (e.g. how exactly consensus works under the covers) - what exactly should we look at? Any relevant blog links?
    * Algorithms which are pivotal to specific systems (Quadtree/Geohash for Uber, Google Maps, Tinder, etc.; Tries for typeahead suggestions; leaky bucket for rate limiting, etc.) - was there any list which you went through that you can share?
    * White papers which explain famous technologies under the hood (e.g. GFS, BigTable, Kafka, Paxos, Dynamo, etc.) - can you share your list?
    * If we break down a simple system into multiple parts, each aspect has a lot of technologies that can come in: client-server (Netty/Poly..); load balancing (NGINX/Netscaler..); data processing (Spark, Flink..); storage (Cassandra, DynamoDB, etc.) - again, did you have a list that you went through or can share?

    I feel overwhelmed with the amount of breadth this topic has and feel lost at times. Any guidance would help!
    May 4, 2021 1
    • What about reading one book per technology? Like Kafka, etcd, ZooKeeper, etc.? And I don't think DDIA has all the info either. I skimmed the book last week, and it just introduces a lot of content. Of course I did not know a lot of it before, but it didn't go in depth on each topic, and it was mostly generalist info. It didn't cover domain-specific info like geo data vs. video data vs. healthcare data, etc.

      Also, I think data modeling is an important skill to learn. There is a book called SQL Antipatterns which I thought was very interesting, and another book about tree hierarchies in SQL, i.e. how you would model tree data structures in SQL (they call the methods closure tables, nested sets, path enumeration, or some fancy words like that 😂)
      May 5, 2021
  • Amazon / Eng
    ALLisWell😎
    Good one. I have a memory, RPS, and QPS calculation template which I always remember. When assuming, instead of asking how many users or how many requests, I kind of propose a number: "Can I assume 1 million requests/sec or 500M customers?" And I try to fit all the questions around that range, keeping it the same and simple for all interviews.
    May 4, 2021 7
    • Amazon
      drytexvno
      Great stuff all around.

      It's sort of hard to wrap one's head around recommendation systems for global services like YouTube. For instance, I may watch videos uploaded in Europe or Japan from the States. I can't even comprehend how they maintain their global graph database to map user recommendations.
      May 4, 2021
    • Ford
      EvCA25
      Amazing post and great follow ups
      May 4, 2021
  • Thanks for being nice and contributing to this community.
    May 4, 2021 0