You're often asked: how much storage would you need per year for this service? So you go through the capacity estimate: 10M daily active users * X average actions per user * Y KB per action = Z terabytes of data per year. OK, so what? What are you supposed to do with that information? Are you supposed to use it to inform your database choice? Like, if size > X terabytes per year use Oracle/MySQL, else use a KV store, else use the file system, etc.? If so, what are some rough rules governing that? Or is the capacity estimate just an isolated part of system design, and you don't necessarily need that information to dictate other parts of your design?
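For concreteness, the arithmetic in the question can be sketched like this (X and Y are placeholders in the original, so the values below are purely hypothetical):

```python
# Back-of-envelope storage estimate. All inputs are assumed/hypothetical,
# standing in for the X and Y placeholders in the question.
DAU = 10_000_000          # daily active users
actions_per_user = 5      # X: average stored actions per user per day (assumed)
item_size_kb = 2          # Y: size of each stored item in KB (assumed)

daily_kb = DAU * actions_per_user * item_size_kb      # KB written per day
yearly_tb = daily_kb * 365 / 1024**3                  # KB -> TB over a year

print(f"{yearly_tb:.1f} TB/year")  # ~34.0 TB/year with these inputs
```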
COGS estimate
😼
Just make up some calculations and then say that since the volume is large, we need to scale horizontally.
Lol This ^ 😂
Why are these storage estimation questions asked to PMs?
There are various dimensions to a capacity estimate. The storage dimension can help you decide whether a single database server can handle the required storage. For example, if only 4 TB disks can be attached, then you need a partitioned DB. Similarly, estimates for throughput and RPS are all necessary for scaling out a system.
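The 4 TB-disk example above reduces to a one-liner: divide the estimated total by the per-node disk limit to get a minimum partition count. A minimal sketch, with assumed numbers:

```python
import math

# Hypothetical inputs: total comes from the capacity estimate,
# the per-node limit from whatever disks the db servers can attach.
total_storage_tb = 34      # estimated total storage (assumed)
max_disk_per_node_tb = 4   # largest disk attachable to one db server (assumed)

# Minimum number of partitions/shards needed to hold the data
partitions = math.ceil(total_storage_tb / max_disk_per_node_tb)
print(partitions)  # 9 -> a single node won't fit it; shard across >= 9 nodes
```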
Thanks. What are some rules of thumb for storage and RPS in terms of how they govern your design choices?
It depends. For RPS it generally boils down to how many CPUs your box has and whether it's throttling the CPU; roughly 50k RPS on a 16-core box is achievable. Throughput depends on your network bandwidth: go beyond it and you'll hit TCP congestion control if you use TCP, or lose packets if you use UDP. For storage, it again depends. If you're using a distributed storage engine (DynamoDB, Cassandra), do you have a good distribution of your keys? If you do, then the storage added per node per day is your throughput/s/node * 86400. If you have a TTL, you can recycle some storage. If you have an uneven traffic pattern, then your design is wrong and you need a new schema.
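The throughput/s/node * 86400 rule above can be sketched numerically. All the inputs are assumptions for illustration (the reply doesn't fix any of them):

```python
# Storage accrued per node per day, assuming an even key distribution
# across the cluster. Every number here is hypothetical.
writes_per_sec_cluster = 50_000   # cluster-wide write RPS (assumed)
nodes = 10                        # nodes in the ring (assumed)
avg_item_kb = 2                   # average item size in KB (assumed)
SECONDS_PER_DAY = 86_400

# throughput/s/node * item size * seconds per day = KB added per node per day
per_node_kb_day = writes_per_sec_cluster / nodes * avg_item_kb * SECONDS_PER_DAY
per_node_gb_day = per_node_kb_day / 1024**2   # KB -> GB

print(f"{per_node_gb_day:.1f} GB/node/day")  # ~824.0 GB/node/day here
```

With a TTL of N days, steady-state storage per node is roughly this daily figure times N, since expired data gets recycled.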