No need to go into specifics, just a high-level overview. I'm mainly interested in Facebook, Google and Pinterest, but anyone at other internet companies with large amounts of data is free to chime in.

Say you want to read some data and gain insights from it. Do you have a central platform where teams can dump their data for everyone in the company to query/process, or are teams responsible for building their own data platforms?

Once the data is in the data platform, how is processing done? Are people writing Spark apps to read and process the data, or do you have a web UI for that? Is the web UI a drag-and-drop thing where you create a DAG of operations, and the processing is done in the background by translating the DAG to Spark or Hadoop or something?

I'm interested in learning more because I'm interning on a data-heavy team this summer and want to know how other internet companies approach the problem of making big data available and easy to query and understand.

TC: $7725/mo
Additional questions: Do you build your own platform from open source projects, build on top of cloud-specific services (e.g. BigQuery or Redshift), or just buy software from vendors like Databricks? Why?
I'd imagine the companies I mentioned above build their own. At least Amazon does.
Amazon owns AWS, so I’m guessing retail and everything else can just use AWS services for free?