I’m preparing for a Meta on-site and trying to come up with data models for the scenarios mentioned in the subject. I feel a regular dimensional model is not going to handle the insert/update volume easily. Can anyone share some good examples? Any ideas on a spider schema? #meta #dataengineerinterview #eventdriven #eventdatamodel #tech
It usually deals with creating minutely aggregates, so your millions of events per second turn into the total number of unique entities liked in that minute, which would be in the 1000s; that’s how you minimise the number of writes. Now the 1000s of minutely aggregates can be broadcast across datacenters to achieve global consistency after a minute, or you could do database replication. The raw events can be persisted at their own cadence in a NoSQL store with fast write capabilities.
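To make that concrete, here’s a minimal sketch of the minute-level rollup (event shape, entity IDs, and user IDs are made up for illustration, not anything Meta-specific): raw like events collapse into one row per (minute, entity) holding the unique-liker count, so thousands of aggregate rows get written instead of millions of raw events.

```python
from collections import defaultdict

# Hypothetical raw like events: (entity_id, user_id, epoch_seconds).
events = [
    ("post_1", "u1", 1_700_000_005),
    ("post_1", "u2", 1_700_000_010),
    ("post_2", "u1", 1_700_000_050),  # falls in the next minute bucket
    ("post_1", "u1", 1_700_000_070),  # same minute bucket as the line above
]

def minute_bucket(epoch_seconds: int) -> int:
    """Truncate an epoch timestamp to its minute index."""
    return epoch_seconds // 60

# One aggregate cell per (minute, entity): the set of unique likers seen.
uniques: dict[tuple[int, str], set] = defaultdict(set)
for entity_id, user_id, ts in events:
    uniques[(minute_bucket(ts), entity_id)].add(user_id)

# The rows you actually write/broadcast: (minute, entity) -> unique-liker count.
aggregates = {key: len(users) for key, users in uniques.items()}
```

In a real pipeline the per-minute sets would live in a stream processor’s windowed state (or an approximate sketch like HyperLogLog for very hot entities), and only the counts would be flushed downstream each minute.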
Thanks for the reply. What you described makes sense for operational efficiency, but my use case is more geared towards analytical data models and how efficiently we can store them. I’m looking for a simple example of how normalized source-system-level information capture can be transformed into a target schema. Either way, I learned something from your reply. Thank you.
Given the scale of data flowing in, for analytical use cases you have to define your dimensions and the cardinality of each dimension to come up with a data schema. Say you want to know which cities, genders, categories, and age groups have the most pictures uploaded: the dimensions become the things you want to query on, and cardinality is the number of distinct values each dimension can take. Again, this is not flexible, since you need to come up with the dimensions well in advance, but that’s the trade-off you have to make.
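A rough sketch of that trade-off (dimension names and values below are hypothetical): upload events are rolled up into a cube keyed by the pre-chosen dimensions, so storage and query cost are bounded by the product of the dimensions’ cardinalities rather than by raw event volume, and a question outside those dimensions simply can’t be answered.

```python
from collections import Counter

# Hypothetical upload events, already joined with user attributes.
uploads = [
    {"city": "SF",  "gender": "F", "category": "food",   "age_group": "18-24"},
    {"city": "SF",  "gender": "M", "category": "travel", "age_group": "25-34"},
    {"city": "NYC", "gender": "F", "category": "food",   "age_group": "18-24"},
    {"city": "SF",  "gender": "F", "category": "food",   "age_group": "18-24"},
]

# Dimensions must be fixed up front; their combined cardinality bounds the cube size.
DIMENSIONS = ("city", "gender", "category", "age_group")

# Pre-aggregate: one count cell per combination of dimension values.
cube = Counter(tuple(e[d] for d in DIMENSIONS) for e in uploads)

# Query: which (city, gender, category, age_group) has the most uploads?
top_cell, top_count = cube.most_common(1)[0]
```

This is essentially what rollup in an OLAP store does at ingest time: raw events are dropped and only the dimension-keyed counts survive.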
We have OLAP stores like Druid and Pinot to address that use case; you may want to take a look at those. A lot also depends on whether you need real-time querying capability or batch is enough.
Need this too.