Does anyone have links, docs, videos, or books I can follow to understand how to "scale" a system, or how to build an architecture that can handle data at very large scale? Let's say, what would my stack be if I want 3M hits per second, petabytes of stored data, search results returned in milliseconds, and strong security? Please don't suggest using AWS, GCP, Azure, etc. What if I want to get there with only an open-source tech stack?
Are you really working at Amazon? Seriously, you expect an answer with just these details? What about the data model, consistency guarantees, write/read patterns, and many more such relevant details? We can suggest that you not use any publicly available cloud, but that would mean you now have to solve problems like compute/resource management, autoscaling, zones, etc. yourself.
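To illustrate why those details matter, here is a rough back-of-envelope sketch. Only the 3M hits per second comes from the question; the cache hit percentages, the per-node capacity, and the target utilization are assumptions picked for illustration, not measurements of any real system.

```python
def backend_load(total_rps: int, cache_hit_percent: int) -> int:
    """Requests per second that miss the cache and reach the database."""
    return total_rps * (100 - cache_hit_percent) // 100


def nodes_needed(load_rps: int, per_node_rps: int,
                 target_utilization_percent: int = 50) -> int:
    """Nodes required when each node runs at only a fraction of its capacity."""
    usable_rps = per_node_rps * target_utilization_percent // 100
    return -(-load_rps // usable_rps)  # integer ceiling division


TOTAL_RPS = 3_000_000      # from the question
ASSUMED_NODE_RPS = 20_000  # hypothetical per-node capacity, an assumption

for hit_percent in (0, 90, 99):
    load = backend_load(TOTAL_RPS, hit_percent)
    nodes = nodes_needed(load, ASSUMED_NODE_RPS)
    print(f"cache hit {hit_percent}%: {load:,} rps reach the database, about {nodes} nodes")
```

The point is not the specific numbers but how much the read pattern moves them: the same front-door traffic can mean a very different database tier depending on how cacheable the workload is.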
Time for everyone to follow promotion-oriented architecture and write their own DB based on the same fucking paper from 20 years ago.
For that type of performance, you’ll need a cache; I like Redis for caching. Put Nginx in front as a proxy server, or Apache if that’s your thing. For search, something like Elasticsearch or Solr (both built on Lucene), or Lucene directly. For data storage, there’s sharded PostgreSQL or MySQL if you’re in the traditional RDBMS camp, CockroachDB for a more Spanner-esque approach, and then the plethora of NoSQL stores like Cassandra and others. You can put a system like Kafka in between to move data from one system to another. You’ll also need some type of application server to serve up the response. Go might be a good choice if you need performance. But the reality is that every system scales differently and has different pain points, depending on everything from how important ACID compliance is, to how you manage connections (e.g. WebSockets, long polling, or a bunch of tiny requests), to the size of the data you’re serving (video? Log streaming? ML training sets?).
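As a minimal sketch of how the cache and the sharded store fit together, here is a cache-aside read in Python. `TTLCache` is an in-memory stand-in for something like Redis and `ShardedStore` is a list of dicts standing in for database shards; both classes, their method names, and the hash-modulo routing are assumptions made for illustration, not the API of any real client library.

```python
import hashlib
import time


class TTLCache:
    """In-memory stand-in for a cache such as Redis (hypothetical, not a real client)."""

    def __init__(self, ttl_seconds: float = 60.0):
        self._ttl = ttl_seconds
        self._items = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._items[key]  # drop the expired entry
            return None
        return value

    def set(self, key, value):
        self._items[key] = (time.monotonic() + self._ttl, value)


class ShardedStore:
    """Stand-in for a sharded database: each dict plays the role of one shard."""

    def __init__(self, shard_count: int):
        self._shards = [{} for _ in range(shard_count)]
        self.reads = 0  # counts how often the "database" is actually hit

    def shard_index(self, key: str) -> int:
        # A stable hash keeps the key-to-shard mapping identical across processes.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._shards)

    def put(self, key: str, value):
        self._shards[self.shard_index(key)][key] = value

    def get(self, key: str):
        self.reads += 1
        return self._shards[self.shard_index(key)].get(key)


def read_through(cache: TTLCache, store: ShardedStore, key: str):
    """Cache-aside read: try the cache, fall back to the store, then fill the cache."""
    value = cache.get(key)
    if value is not None:
        return value
    value = store.get(key)
    if value is not None:
        cache.set(key, value)
    return value


store = ShardedStore(shard_count=4)
store.put("user:42", {"name": "example"})
cache = TTLCache()
read_through(cache, store, "user:42")  # miss: goes to the store, fills the cache
read_through(cache, store, "user:42")  # hit: served from the cache
print("store reads:", store.reads)
```

One design caveat: plain hash-modulo routing remaps most keys whenever the shard count changes, which is why real deployments often use consistent hashing or a lookup table of key ranges instead.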