I've been confused about when to switch from a relational DB to a NoSQL DB in terms of performance requirements (QPS, etc.). This pertains to system design interviews but can apply more generally. Assume no joins or transactional guarantees are needed. Edit: assume the query patterns can be supported by either option. Edit 2: assume the problem statement is building a Facebook/Twitter news feed.
It depends 🤷‍♂️
OP, it's not just about the level of scaling; ACID vs. BASE characteristics and database design (relational vs. non-relational) also drive the decision. There is plenty of good content available on the net!
The question makes no sense. If you don't need transactional guarantees and joins, then why are you using an RDBMS in the first place? NoSQL is not the only option when you scale. It depends on your requirements. Try figuring out your access patterns and the overall structure/layout of your data, and then make an informed decision about SQL vs. NoSQL.
Then the question is about indexes and schema (flexible or not). Postgres comes with both benefits.
You are missing the point. It comes down to your data. If you don't need joins or full ACID, and your data doesn't lend itself to being definable by a schema, then use NoSQL. Both solutions can give you solid performance.
there's always a schema. you don't get to choose that. you only choose if you want your database engine to be aware of it, or to leave it implicit in how your producers and consumers use the data. you can choose not to think about it and pretend it isn't there, but there's always a schema.
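To make the "there's always a schema" point concrete, here is a minimal sketch in Python. It uses the standard-library `sqlite3` module as a stand-in for a schema-aware engine and a plain dict of JSON strings as a stand-in for a schemaless store. The `posts` table, the `doc_store` dict, and all function names are hypothetical and only for illustration.

```python
import json
import sqlite3

# Explicit schema: the engine knows the shape and rejects rows that violate it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY, author TEXT NOT NULL, body TEXT NOT NULL)"
)

def insert_explicit(author, body):
    """Return True if the engine accepted the row, False if it refused it."""
    try:
        conn.execute(
            "INSERT INTO posts (author, body) VALUES (?, ?)", (author, body)
        )
        return True
    except sqlite3.IntegrityError:
        return False

# Implicit schema: the store accepts any document, so the schema lives
# only in how producers write and consumers read.
doc_store = {}

def insert_implicit(key, doc):
    """The store does no validation at write time."""
    doc_store[key] = json.dumps(doc)

def read_post(key):
    """Consumer-side assumption: every post has 'author' and 'body'."""
    doc = json.loads(doc_store[key])
    return doc["author"], doc["body"]
```

In this sketch a post with a missing body is refused at write time by the schema-aware engine, while the document store accepts it and the failure only shows up later as a `KeyError` in `read_post`. The schema did not go away; it just moved into the reader.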
Are you scaling vertically or horizontally? What are your use cases and business needs?
No relation to OP's question, but isn't it easier to get solid performance from NoSQL (simply add nodes) compared to an RDBMS? Or is it similar: just add more read replicas for the RDBMS?
Maybe. NoSQL is simpler and likely faster for simple key-value lookups. But successful systems often survive and evolve with more requirements, and you suddenly find yourself wishing for something beyond simple key-value lookups. Also, sometimes an org has operational expertise in, say, MySQL, and they can tune a single MySQL shard better than some NoSQL shard. It's also unclear whether the dollar cost savings per CPU core from adopting NoSQL matter much to such an org.
How are you at Google?
If you want to do complex joins, a SQL DB could likely be useful and more future-proof (unless you enjoy rolling your own join and locking logic). With the appropriate partitioning and work distribution layer above it, either SQL or NoSQL is fine. Plenty of large companies go the SQL path: lots of people use Vitess on top of MySQL, FB uses TAO above MySQL, Pinterest uses some static partitioning on top of MySQL, and Instagram used something similar on top of Postgres for user and photo metadata. The list goes on.
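A rough sketch of what a static partitioning layer above SQL can look like, in Python. This is not how Vitess, TAO, or any of the systems named above actually work; it only illustrates routing by a stable hash of the key. The shard count, the `feed_items` table, and the function names are all assumptions, and in-memory `sqlite3` databases stand in for separate SQL shards.

```python
import hashlib
import sqlite3

NUM_SHARDS = 4  # hypothetical fixed shard count (static partitioning)

# Each in-memory database stands in for one independent SQL shard.
shards = [sqlite3.connect(":memory:") for _ in range(NUM_SHARDS)]
for db in shards:
    db.execute("CREATE TABLE feed_items (user_id TEXT, item TEXT)")

def shard_for(user_id: str) -> int:
    """Map a user to a shard with a hash that is stable across processes."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

def add_item(user_id: str, item: str) -> None:
    """Write goes to exactly one shard, chosen by the user's key."""
    shards[shard_for(user_id)].execute(
        "INSERT INTO feed_items (user_id, item) VALUES (?, ?)", (user_id, item)
    )

def get_feed(user_id: str) -> list:
    """Read touches only the shard that owns this user, in insertion order."""
    rows = shards[shard_for(user_id)].execute(
        "SELECT item FROM feed_items WHERE user_id = ? ORDER BY rowid",
        (user_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

The catch with a fixed modulus is that changing `NUM_SHARDS` remaps most keys, which is part of the operational burden of resharding a layer like this.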
When would you ever prefer NoSQL, since relational DBs partitioned appropriately would work anyway and support more diverse use cases? You could do application-side joins for large-scale apps, but that's an option either way.
Cost, simplicity, a smaller set of failure cases, and possibly reduced latency. If you are absolutely sure you will never need the full toolset offered by MySQL, then it's a good option. For example, lots of folks use Redis because it is fast and simple. Could MySQL give you the same things? Yes, but it may not do it as cheaply per CPU core as Redis. But maybe you don't care about cost and latency.
That's a common misconception. You can scale both SQL and NoSQL to the highest levels. It's about the trade-offs between the two.
What would be the trade-off? Would we have to make our schema relational?
Is your data model constantly evolving and changing? Or are your constraints mature and static? Do you need to partition this across other regions? If so, do you care about the additional operational burden of running something like Vitess for horizontal sharding? Do you need data to be consistent across all partitions as soon as it changes, or can you wait a few minutes? TL;DR: CAP theorem.
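For anyone unsure what an application-side join means here, a minimal Python sketch: fetch from one store, collect the foreign keys, batch-fetch from the other store, and stitch the results together in the app. The `USERS` and `POSTS` data and the fetch functions are hypothetical stand-ins for two separate stores (or two shards) that cannot join for you.

```python
# Hypothetical data living in two separate stores.
USERS = {1: {"name": "ana"}, 2: {"name": "bo"}}
POSTS = [
    {"id": 10, "author_id": 1, "text": "hi"},
    {"id": 11, "author_id": 2, "text": "yo"},
    {"id": 12, "author_id": 1, "text": "again"},
]

def fetch_posts():
    """Stand-in for a query against the posts store."""
    return list(POSTS)

def fetch_users(user_ids):
    """Stand-in for one batched lookup against the users store."""
    return {uid: USERS[uid] for uid in user_ids if uid in USERS}

def posts_with_authors():
    """Join posts to authors in application code instead of in the database."""
    posts = fetch_posts()
    # Deduplicate keys so each author is fetched once, in a single batch.
    author_ids = {post["author_id"] for post in posts}
    users = fetch_users(author_ids)
    # Inner-join semantics: posts whose author is missing are dropped.
    return [
        {"text": post["text"], "author": users[post["author_id"]]["name"]}
        for post in posts
        if post["author_id"] in users
    ]
```

The pattern is the same whether the stores are SQL shards or a NoSQL cluster, which is why it is an option either way.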