Giving back - how I cleared L6 System Design - Part 2
Part 3 is out: https://www.teamblind.com/post/qubF6fS2
This is part 2 of a series. Here is part 1 which has a long intro: https://www.teamblind.com/post/Giving-back---how-I-cleared-L6-System-Design---Part-1-4yufM3RY
0. Preface
Let me touch on some points from the Part 1 flood of comments:
Is Nathan Bronson a porn star? I'm not sure if Nathan Bronson of Facebook has a side gig but I meant the other non-porn star Bronson.
YOE? 10. Don't let YOE hold you back though. I've always been somehow the youngest of my peers at something. Age? Early 30s.
Did I have coding rounds? Of course. I will write about that, but it will have to wait. There are plenty of hints on Blind on how to LeetCode, but I think the shortage is in SDI content, so I want to plug that gap first.
How did I manage the time? I had a method which I'll share eventually.
---- end preface ----
1. Back-of-the-envelope calculations (BOTEC)
You expected something else? Sit down... I thought hard and I think this is something many people secretly fear and feel weak on. So since my goal in this effort is to genuinely help people, I think this is where my contribution is most valuable.
You might wonder: why do we even need BOTEC? The short version: to size tiers. What? Size? Tiers? "Sizing" as a verb here means estimating how many machines/disks/etc. you will need. "Tiers" in this context means the typical tiers in any system design. For example, a logging/counting system can have three tiers: 1. the collection tier, 2. the aggregation tier, and 3. the storage tier.
Another way you can use BOTEC is to figure out whether something can fit on one machine or not, mostly memory-wise. Most of the time it is self-evident, but 10 seconds of BOTEC should confirm it.
Example: how do you serve the timelines of 1 million people efficiently? Well, how many posts do you expect to have readily available per person? Let's say 10. OK. If we store only the IDs (say 64-bit), we can use a native Redis list to store each list of posts:
8 bytes/post * 10 posts/timeline = 80 bytes/timeline (post goes away)
1 million timelines * 80 bytes/timeline = 10^6 * 8 * 10^1 = 8 * 10^(6+1) bytes (timeline goes away)
So 8 * 10^7 = 80 million bytes = 80 MB, which easily fits in one machine's RAM.
This was an easy example, but note two things. First, always drag the units with you (bytes/post, posts/timeline) and cancel them when possible; it's extremely helpful for not getting lost. Second, always use powers of 10 so you don't miss a zero here and a zero there.
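The arithmetic above can be sanity-checked in a few lines. This is my own illustration (none of these variable names come from the post); the point is to keep the units in comments so they cancel just like on paper:

```python
# BOTEC sanity check (illustrative names, not from the post).
# Keep the units in comments so they cancel just like on paper.
BYTES_PER_POST = 8          # 64-bit ID -> 8 bytes/post
POSTS_PER_TIMELINE = 10     # posts/timeline
TIMELINES = 10**6           # 1 million timelines

bytes_per_timeline = BYTES_PER_POST * POSTS_PER_TIMELINE  # bytes/timeline (post goes away)
total_bytes = TIMELINES * bytes_per_timeline              # bytes (timeline goes away)

print(total_bytes)                # 80000000
print(total_bytes / 10**6, "MB")  # 80.0 MB -> fits in one machine's RAM
```

Writing it out like this is overkill for 80 MB, but the same skeleton scales to the real-life numbers.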
Here's the real life version of that.
-64 bit ID per post - 8 bytes/post
-800 posts/timeline
-500 million timelines
-replication factor of 3
800 posts/timeline * 8 bytes/post = (post goes away) 8 * 10^2 * 8 = 64 * 10^2 bytes/timeline
500 million timelines * 64 * 10^2 bytes/timeline = (timeline goes away) 5 * 10^2 * 10^6 * 64 * 10^2 bytes = 320 * 10^(2+6+2) bytes ≈ 3 * 10^12 bytes = 3 TB
3 TB * 3 (unitless replication factor) ≈ 10 TB. Considering 64 GB of RAM per machine, of which 50 GB can be considered usable, you have 10 TB / (50 GB/machine) = 10 * 10^12 bytes / (50 * 10^9 bytes/machine) = 2 * 10^2 = 200 machines (bytes goes away, 1 over 1-over-machine becomes machine).
The temptation here is to stop being pedantic with units but I suggest you don't. These can get messy so stick to it.
This tier can be considered to be memory-bound.
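The same drill in code, with my own variable names. Note that 50 GB/machine is 50 * 10^9 bytes, so 10 TB works out to roughly 200 machines (the exact quotient is 192; round up):

```python
# Real-life timeline sizing, same numbers as above (sketch).
BYTES_PER_POST = 8            # bytes/post
POSTS_PER_TIMELINE = 800      # posts/timeline
TIMELINES = 500 * 10**6       # timelines
REPLICATION = 3               # unitless
USABLE_RAM = 50 * 10**9       # bytes/machine (50 GB usable out of 64 GB)

total_bytes = BYTES_PER_POST * POSTS_PER_TIMELINE * TIMELINES * REPLICATION
print(total_bytes / 10**12, "TB")  # 9.6 TB, call it ~10 TB

machines = total_bytes / USABLE_RAM  # bytes / (bytes/machine) -> machines
print(machines)                      # 192.0 -> round up to ~200 machines
```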
Look up Raffi Krikorian's "Timelines at Scale" talk on InfoQ to hear him discuss this at more length.
Usually you have different tiers with different scaling mechanisms. Do your calculations early to get a feeling of what is going to be the bounding factor, then plan ahead to scale that accordingly.
Here's an example of a QPS-bound tier. You're told you'll have to log 100 billion events per day. Ok. That sounds like a lot. Let's turn that into per second. Is that a calculator you took out? *takes it and throws out the window*
100 * 10^9 events/day [divided by] 86,400 seconds/day
Round to a convenient number:
100 * 10^9 events/day [divided by] 80,000 seconds/day =
10^11 / (8 * 10^4) events/second (1 over day goes away) = 1/8 * 10^7 events/sec = 10/8 million QPS ≈ 1.2 million QPS
This looks to be a QPS bound tier. To size it, divide it by some (numerically convenient) number that can be handled by one machine and you get the number of machines.
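The per-day-to-QPS conversion, as a quick script. The 10,000 QPS-per-machine figure is an assumption I picked purely for illustration; substitute whatever one machine in your tier can actually handle:

```python
# Events/day -> QPS, no calculator needed (sketch).
EVENTS_PER_DAY = 100 * 10**9   # events/day
SECONDS_PER_DAY = 80_000       # 86,400 rounded to a convenient number

qps = EVENTS_PER_DAY / SECONDS_PER_DAY  # events/second (1/day goes away)
print(qps / 10**6, "million QPS")       # 1.25 -> ~1.2 million QPS

# Hypothetical per-machine capacity; substitute your own.
PER_MACHINE_QPS = 10_000
print(qps / PER_MACHINE_QPS)            # 125.0 machines for this tier
```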
The same goes for storage sizing. Keep in mind the replication factor and the length of time you will be storing data for. Look up DataStax capacity planning for Cassandra numbers; they're super useful.
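Here is a storage-bound sizing sketch under made-up assumptions (100 bytes/event, 30 days of retention, replication factor 3, and ~2 TB of storage per node, a Cassandra-style figure); none of these specific values come from the post:

```python
# Storage-bound sizing sketch. Event size, retention, and node capacity
# are assumptions made up for illustration.
EVENTS_PER_DAY = 100 * 10**9   # events/day
BYTES_PER_EVENT = 100          # assumed bytes/event
RETENTION_DAYS = 30            # days
REPLICATION = 3                # unitless
NODE_CAPACITY = 2 * 10**12     # ~2 TB/node, a Cassandra-style figure

raw_bytes = EVENTS_PER_DAY * BYTES_PER_EVENT * RETENTION_DAYS * REPLICATION
print(raw_bytes / 10**15, "PB")   # 0.9 PB
print(raw_bytes / NODE_CAPACITY)  # 450.0 nodes
```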
And last but not least here are some numbers I used to reference quite often. They're taken from various sources and I used them without causing dropped jaws, so it should be safe.
- Compress 1 KB with Zippy: 0.003 ms
- Send 1 KB over 1 Gbps: 0.01 ms
- Read 4 MB sequentially from memory: 1 ms
- Read 4 MB sequentially from SSD: 20 ms
- Read 4 MB sequentially from HDD: 80 ms
- One disk seek: 10 ms
- Inter-datacenter round trip: 150 ms
- Inter-datacenter bandwidth: ~100 Gbps
- Video: roughly 10 MB/minute
- A server can have 100-150 GB of usable RAM
- Cassandra is best used with 1-2 TB of storage per node
- A 1 Gbps link can handle a max of 125 MB/s
- A Cassandra cluster can handle 10,000-15,000 reqs/s/node and grows linearly with the number of nodes
- 720p video: 1.5 Mbps
- 240p video: 150 Kbps
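If it helps, here are the millisecond figures in code form for quick BOTEC, with a worked example (my own): sequentially reading 1 GB from SSD at 4 MB per 20 ms.

```python
# The cheat-sheet numbers above, in code form for quick BOTEC
# (values copied from the list; all entries are milliseconds).
NUMBERS_MS = {
    "compress_1kb_zippy": 0.003,
    "send_1kb_over_1gbps": 0.01,
    "read_4mb_memory": 1,
    "read_4mb_ssd": 20,
    "read_4mb_hdd": 80,
    "disk_seek": 10,
    "inter_dc_roundtrip": 150,
}

# Worked example: sequentially read 1 GB from SSD.
chunks = 1000 / 4  # 1 GB / (4 MB/chunk) = 250 chunks
print(chunks * NUMBERS_MS["read_4mb_ssd"] / 1000, "s")  # 5.0 s
```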
End of part 2
#SystemDesign #E6 #L6
comments
I would love to know a bit more about your study / retention strategies. You mentioned using Evernote in the last post; how did you organize your knowledge?
Did you review your notes frequently?
Were you practicing somehow to keep the ideas fresh?
10 TB / (50 GB/machine) = 10 * 10^12 bytes / (50 * 10^9 bytes/machine) = 2 * 10^2 = 200 machines
* Distributed systems basics (most of which might be covered in DDIA) - anything else you'd suggest?
* Advanced concepts for distributed systems (e.g. how exactly consensus work under the covers) - what exactly should we look at? Any relevant blog links?
* Algorithms which are pivotal to specific systems (Quadtree/Geohash for Uber, Google Maps, Tinder etc; Tries for Typeahead suggestions; Leaky bucket for rate limiting etc) - Was there any list which you went through which you can share?
* White papers which explain famous technologies under the hood (e.g. GFS, BigTable, Kafka, Paxos, Dynamo etc) - Can you share your list ?
* If we break down a simple system into multiple parts, each aspect has a lot of technologies that can come in : Client-Server (Netty/Poly..) ; Load Balancing (NGINX /Netscaler..), Data Processing (Spark, Flink..); Storage (Cassandra, DynamoDB etc. ) - Again did you have a list that you went through or you can share?
I feel overwhelmed with the amount of breadth this topic has and feel lost at times. Any guidance would help!
Also, I think data modeling is an important skill to learn. There is a book called SQL Antipatterns which I thought was very interesting, and another book about tree hierarchies in SQL: how would you model tree data structures in SQL, etc. (they call the methods closure tables, nested sets, path enumeration, or some fancy words like that 😂)
It's sorta hard to wrap one's head around recommendation systems for global services like YouTube. For instance, I may watch videos uploaded in Europe or Japan from the States. I can't even comprehend how they maintain their global graph database to map user recommendations.