I was reading the Grokking material on designing Dropbox, and it mentions that each file is broken into 4 MB chunks and the SHA-256 of each chunk is calculated to check whether that chunk is already present on the server, for deduplication.
I was wondering: if two different chunks can have the same SHA-256, how is this reliable?
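To put a rough number on the worry, here is a small sketch that estimates the chance of any two chunks colliding, using the standard birthday-bound approximation. It assumes SHA-256 behaves like a uniformly random 256-bit function, which is an idealization, and the function name `collision_probability` is made up for illustration.

```python
import math


def collision_probability(num_chunks: int, hash_bits: int = 256) -> float:
    """Approximate chance that at least two of num_chunks distinct chunks
    share a hash, using the birthday bound:
        p ~= 1 - exp(-n * (n - 1) / 2^(bits + 1))
    Assumes the hash output is uniformly distributed (an idealization).
    """
    exponent = -(num_chunks * (num_chunks - 1)) / (2 ** (hash_bits + 1))
    # -expm1(x) equals 1 - exp(x) but keeps precision when x is tiny.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # A hypothetical, very generous chunk count for a storage service.
    print(collision_probability(10 ** 18))
    # For contrast, a 32-bit hash with only 100,000 chunks.
    print(collision_probability(100_000, hash_bits=32))
```

Under that assumption the accidental collision risk for a 256-bit hash is negligible at any realistic chunk count, while a short hash such as 32 bits would collide quickly. This says nothing about deliberately constructed collisions, which is a separate question.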
comments
A better approach may be to make chunks immutable and compare chunk IDs...
It says that when a user uploads a file, the system calculates its SHA-256 and checks whether it's already present.
If it's present, then you don't actually need to store the file again, since it's already there (this assumes the same hash means the same chunk, which is not strictly true).
So you need two different chunk IDs in the metadata, but they both point to the same chunk in the object store.
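A minimal sketch of that layout, assuming an in-memory dictionary stands in for the object store and another for the metadata table. The class `DedupStore` and its method names are hypothetical, not Dropbox's actual design. It also compares bytes when a hash matches, which is one way to avoid trusting "same hash means same chunk" blindly.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB, the chunk size mentioned in the post


class DedupStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self, chunk_size: int = CHUNK_SIZE) -> None:
        self.chunk_size = chunk_size
        self.blobs: Dict[str, bytes] = {}     # object store: digest -> bytes
        self.chunk_meta: Dict[int, str] = {}  # metadata: chunk ID -> digest
        self._next_id = 0

    def put_file(self, data: bytes) -> List[int]:
        """Split data into chunks, store unseen ones, return new chunk IDs."""
        ids: List[int] = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            existing = self.blobs.get(digest)
            if existing is None:
                self.blobs[digest] = chunk  # first time we see this content
            elif existing != chunk:
                # Same digest, different bytes: a real collision.
                raise RuntimeError("SHA-256 collision detected")
            # Every upload gets its own chunk ID, even for duplicate content.
            chunk_id = self._next_id
            self._next_id += 1
            self.chunk_meta[chunk_id] = digest
            ids.append(chunk_id)
        return ids

    def get_file(self, ids: List[int]) -> bytes:
        """Reassemble a file from its chunk IDs."""
        return b"".join(self.blobs[self.chunk_meta[i]] for i in ids)


if __name__ == "__main__":
    store = DedupStore(chunk_size=4)  # tiny chunks to keep the demo readable
    first = store.put_file(b"aaaabbbb")
    second = store.put_file(b"aaaacccc")
    print(first, second, len(store.blobs))
```

In the demo, the two files share their first chunk, so the metadata holds two distinct chunk IDs for it while the object store keeps a single copy. The byte comparison requires the server to see the chunk contents, so it trades away some of the upload savings in exchange for certainty.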