How is a complete (world-wide?) system outage handled so efficiently? I think all servers were back up within 30 minutes. I've heard of chaos engineering from Netflix, but haven't done much research yet. Any examples of what would cause such events? Bonus, for the conspiracists: what are the odds of $GME spiking and Reddit crashing soon after? Any possible correlation? If anyone can point me to reading material too, that'd be great, thanks! #engineering #software
Really speaks volumes about the technical chops of Reddit engineering: a robust system, great on-call engineers, or both. Shooting for the bonus: it is likely related, but not in a conspiratorial way. Reddit did not go down because some “high power” tapped them on the shoulder and asked for it. It [likely] went down because of increased traffic to the popular subreddit associated with the whole $GME craze. That subreddit has nearly 10 million users, and many of them likely swarmed it during today’s price surge.
Also, I’m going off a complete hunch here: the error I got was CDN-related. Maybe the surge in the GME price drove more traffic to r/wsb. More traffic means more posting and commenting. More new content means the CDN’s cached data goes stale faster. Once the cached data is stale, people’s content queries fall through to Reddit’s own hosts. That surge in content-query traffic overwhelms the hosts. Again, this is a hunch. If true, it could be alleviated by sharding requests by subreddit, so that only traffic to “hot” subreddits is affected.
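To make the sharding idea a bit more concrete, here's a rough Python sketch. To be clear, this is not how Reddit actually routes traffic (I have no idea how they do it). The shard count, the `shard_for` and `simulate` helpers, and the traffic numbers are all made up for illustration. It just shows the general technique: hash the subreddit name to pick a backend pool, so a flood of requests for one subreddit lands on one pool instead of spreading across all of them.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 8  # hypothetical number of backend pools, not a real Reddit figure


def shard_for(subreddit: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a subreddit name to a shard index using a stable hash.

    hashlib is used instead of the built-in hash() because hash() is
    randomized per process for strings, which would break routing
    consistency across servers.
    """
    digest = hashlib.sha256(subreddit.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def simulate(requests, num_shards: int = NUM_SHARDS) -> Counter:
    """Count how many requests each shard would receive."""
    load = Counter()
    for subreddit in requests:
        load[shard_for(subreddit, num_shards)] += 1
    return load


# Invented traffic mix: one "hot" subreddit dominating everything else.
requests = (
    ["wallstreetbets"] * 9000
    + ["python"] * 300
    + ["aww"] * 400
    + ["askscience"] * 300
)

load = simulate(requests)
hot = shard_for("wallstreetbets")

for shard in range(NUM_SHARDS):
    marker = "  <- hot subreddit lands here" if shard == hot else ""
    print(f"shard {shard}: {load[shard]} requests{marker}")
```

One caveat with plain hashing like this: any other subreddit that happens to hash to the same shard as the hot one still suffers with it, and a single shard may simply not be big enough for a 10-million-user subreddit. So in practice you'd probably want something extra for known-hot subreddits, but that's beyond my hunch.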