So, I'm trying to get *really* good at system design interviews, and I've recently run into an issue. There's plenty of content and whitepapers on how consensus algorithms (like Raft and Paxos) work, but not much on how to actually use them, particularly as a component within a system design interview.

- Grokking's Ticketmaster solution either settled for 2PC or kept the problem scale small enough to fit in a single unsharded PostgreSQL instance
- Alex Xu's 2nd book touches on Raft in one chapter in particular (Digital Wallet), but it's totally unclear what exact data is stored in the nodes or what's really going on inside them. There's no DB schema in there or anything like that, and it seems like the thing that's running Raft is being used as the data store itself (wtf)
- DDIA mentions that ZooKeeper can be used as an "outsourced" consensus solution (p. 375), but of course there are no diagrams of how that would look within a bigger, non-abstract system diagram

What I'm looking for:
- A diagram where ZooKeeper is used for "outsourced" consensus, similar to how it's described in DDIA
- What is the request and response pattern for this? What thing calls what other thing, in what order?
- How do I do host count estimates for ZooKeeper? (e.g. say the application service is handling 100,000 TPS)

TC: 370k
What sources are you using to study?
Currently:
- DDIA (90% done)
- Both of Alex Xu's books (50%+ skimmed to some extent)
- Donne Martin's system design primer (PDF printed at Kinkos)
- Both Grokking system design courses (pirated & printed at Kinkos)
- Database Internals by Alex Petrov (super deep, 10% covered; probably don't even need it for Google staff interviews)
- The Google SRE book and workbook

Next:
- Whitepapers
- InfoQ, @Scale, etc. on YouTube
- More engineering blogs (tried setting up an RSS reader on my phone, but my initial experience is that half the posts are irrelevant or just not interesting)

I do a weekly "stream"/"discussion" of a different system design problem every Saturday evening on my Discord channel, which I've been running for at least a few months: https://discord.gg/WM6BBX7kjx (I somewhat view this as free mock interviews in front of a full panel of engineers.)

(This particular Blind post was created as a follow-up from trying to force myself to finally pick up the Paxos/Raft stuff last night...)

All my resources above, and perhaps a bit more, are listed in the #read-me channel. We're probably not as great as Alex Xu or the DDIA guy, but it's a group that is eager to learn about system design, and you are more than welcome to join us :)
Hey @re:DESU any relevant chapters to focus on in the google sre books?
Study the code of any system that's dependent upon ZooKeeper/etcd; it's relatively straightforward:

1. HBase -> ZK
2. Kafka -> ZK
3. k8s -> etcd

That should serve you well enough. If you're interested in the internals, read the source for ZK or etcd. If you're interested in estimating the latency of a round of consensus, then find a sequence diagram of Paxos/Raft, note the number of RTs, and multiply by the estimated latency between your data centers.
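That RT-counting estimate is simple enough to sketch as back-of-envelope arithmetic. A minimal sketch in Python, where the RT counts and RTT figures are illustrative assumptions you'd replace with numbers read off an actual sequence diagram and measurements from your own network:

```python
# Back-of-envelope consensus latency estimate: count the network
# round trips (RTs) in the protocol's sequence diagram, then
# multiply by the round-trip time between the replicas.
# All numeric figures below are assumed, not measured.

def consensus_latency_ms(round_trips: int, rtt_ms: float) -> float:
    """Estimated wall-clock time for one round of consensus."""
    return round_trips * rtt_ms

inter_dc_rtt_ms = 30.0  # assumed cross-region RTT
intra_dc_rtt_ms = 0.5   # assumed same-datacenter RTT

# e.g. a classic Paxos round ~ 2 RTs (prepare + accept);
# Raft log replication with a stable leader ~ 1 RT.
print(consensus_latency_ms(2, inter_dc_rtt_ms))  # 60.0 ms cross-region
print(consensus_latency_ms(1, intra_dc_rtt_ms))  # 0.5 ms within a DC
```

This also makes it obvious why people co-locate quorums within a region when they can: the same protocol is ~100x slower once each RT crosses datacenters.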
Thanks, I'll try that out. :) I greatly appreciate your input. Quick question: what are RTs? Is that the number of replicas/nodes running ZooKeeper, or whatever the coordinator service is?
No problem. RTs = network round trips. It's the same concept that lets you show the latency difference between a connectionless UDP exchange and TCP's 3-way handshake.
So without knowing all this, it seems it's possible to make it to Apple?
If by "all these", you mean the consensus algorithms, then yeah. I actually briefly touched on 2PC in my interview and scored my current role. That's also "good enough" for Amazon Sr SDEs. Not sure about google L5 though.
Kafka uses ZooKeeper for consensus; a Solr cluster does too. Try following the tutorials, set one up locally on your laptop, perform a failover, and watch the master step down. etcd in Kubernetes also accomplishes this; look up Kubernetes leader election.
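If you want to see the "master step down" behavior before standing up a real cluster, here's a toy in-memory simulation of ZooKeeper's ephemeral-sequential leader election recipe (all names are invented for illustration; a real client would use a library like kazoo or Curator against an actual ZK ensemble):

```python
# Toy simulation of ZooKeeper-style leader election: each candidate
# creates an ephemeral sequential znode under an election path, and
# the client owning the LOWEST sequence number is the leader. When
# the leader's session dies, its ephemeral znode vanishes and the
# next-lowest automatically becomes the new leader.

class ToyElection:
    def __init__(self):
        self._seq = 0
        self._nodes = {}  # znode name -> owning client id

    def join(self, client_id):
        # Sequential: ZK appends a monotonically increasing counter.
        name = f"/election/n_{self._seq:010d}"
        self._seq += 1
        self._nodes[name] = client_id
        return name

    def leader(self):
        if not self._nodes:
            return None
        # Lowest sequence number (lexicographic min of padded names) wins.
        return self._nodes[min(self._nodes)]

    def session_expired(self, client_id):
        # Ephemeral: znodes are deleted when the owning session dies.
        self._nodes = {n: c for n, c in self._nodes.items() if c != client_id}

e = ToyElection()
for c in ("kafka-broker-1", "kafka-broker-2", "kafka-broker-3"):
    e.join(c)
print(e.leader())                    # kafka-broker-1
e.session_expired("kafka-broker-1")  # kill the current master
print(e.leader())                    # kafka-broker-2 steps up
```

The real recipe adds one refinement this toy skips: each client watches only the znode immediately before its own, so a leader failure wakes exactly one successor instead of stampeding the whole herd.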
Well, I'm really just interested in seeing how tf to use ZooKeeper for distributed transactions, so I think I might as well just set up a pair of toy PostgreSQL instances + ZooKeeper and try to implement what I have in the diagram in my OP at that point.
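Before wiring up real Postgres + ZooKeeper, it can help to see the coordination logic in isolation. Here's a toy in-memory two-phase commit coordinator (all names are invented; no real ZK or Postgres involved). In a real setup the participants would be the Postgres instances (via PREPARE TRANSACTION / COMMIT PREPARED), and the coordinator's commit-or-abort decision is the piece you'd record durably via ZooKeeper so a crashed coordinator can't leave participants stuck in doubt:

```python
# Toy two-phase commit: the coordinator asks every participant to
# prepare (phase 1, "vote"); only if ALL vote yes does it tell them
# to commit (phase 2), otherwise it aborts everywhere.

class Participant:
    def __init__(self, name, will_prepare=True):
        self.name = name
        self.will_prepare = will_prepare  # simulate a prepare failure
        self.state = "idle"

    def prepare(self):
        # Phase 1: durably stage the transaction and vote yes/no.
        self.state = "prepared" if self.will_prepare else "aborted"
        return self.will_prepare

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    if all(p.prepare() for p in participants):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"

happy = [Participant("pg-1"), Participant("pg-2")]
print(two_phase_commit(happy))  # committed

sad = [Participant("pg-1"), Participant("pg-2", will_prepare=False)]
print(two_phase_commit(sad))    # aborted
```

The toy glosses over exactly the part consensus is for: if the coordinator crashes between phases, prepared participants block forever holding locks. Storing the decision in a ZK/etcd quorum (or replicating the coordinator itself) is what lets a recovering node learn the outcome.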
What’s your YoE?
While this is good to understand, and I have studied this level of detail for the Paxos/Raft algorithms in the past, know that it won't explicitly be asked in system design interviews at the senior/staff level, unless the team/role is specifically in that domain. Let me retrieve some of those links for you later.
I didn't ask this question to have it hand-waved away as "too deep"; links would be appreciated. I'm currently at the senior level with a goal of eventually hitting staff/principal. This dude detailed the Paxos workflow and then did latency estimates during an interview: https://us.teamblind.com/s/4yufM3RY I know he probably went a little above and beyond, but I kinda want to get there as well, so that's what this post is about.
There is a Google talk (can't find the link) on how to calculate Paxos latency, I believe. All he did was basically steal the idea from the talk and pretend he knew a ton about it. Basically, don't index too much on that... it pretty much came down to: Paxos needs 3 RT connections, so to calculate latency just do 3 × intra-DC network call...
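With illustrative numbers plugged in (the 3-RT count is the talk's rule of thumb as relayed above; the 0.5 ms intra-DC RTT is my own assumed figure, measure your own):

```python
# Rule-of-thumb Paxos latency: 3 round trips x intra-DC RTT.
# The 0.5 ms RTT is an assumed figure for same-DC hosts.
paxos_round_trips = 3
intra_dc_rtt_ms = 0.5
print(paxos_round_trips * intra_dc_rtt_ms)  # 1.5 (ms per consensus round)
```

Which is the kind of one-liner you can state out loud in an interview without pretending it's deeper than it is.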