I’m not enjoying being a data scientist anymore but excited to learn more about data engineering. What would help my chances of getting hired given that I’m currently a data scientist? #dataengineer
Focus on these few things and you will be in a good place for DE interviews.

1) SQL, especially DML, less so DDL. Start with the basics and work up to complex window functions with frames over preceding and following rows. Also practice recursive CTEs. These are more edge-case questions, but there is no harm in prepping them.
2) Python is tricky because expectations vary by interviewer. At places like Meta, TikTok, and Amazon a lot of data engineering is done in SQL, so the Python portion will be light. This is especially true for product data engineering. Mid-size companies, startups, and more code-heavy roles might want you to know Big O and data structures and techniques such as linked lists, stacks, queues, and recursion. Don't spend too much time on this extra part, though: it takes time away from more important topics and most likely won't be asked.
3) Data modeling, the bread and butter of any DE. Know dimensional modeling. Read chapters 1 and 2 of Kimball's The Data Warehouse Toolkit (3rd edition) if possible. If not, at least learn what dimensional data modeling, star schemas, and snowflake schemas are, along with their pros and cons. Look for data modeling examples on YouTube.
4) ETL pipeline design: what is a batch process, what is a stream process, how do you design a batch ETL job, how do you design a streaming ETL job, and which tools would you use in which situations. Know what ETL, ELT, EtLT, and CDC are. Know the difference between SQL and NoSQL databases and when to use each, plus the different file formats such as Parquet and ORC.
5) Data storage: how do you manage it, and how do you employ indexes, partitions, data distribution, storage types, compression, etc.
6) Not mandatory, but good to know: how Hadoop architecture works, what HDFS, Hive, MapReduce, and Spark are, how Spark architecture works, what YARN queues are, and how everything fits together in the Hadoop ecosystem. All of this will be valuable.
7) Product sense. As a data scientist you might already know this.
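To make the SQL point concrete, here is a small practice sketch of the two topics named there: a window frame over preceding and following rows, and a recursive CTE. It is only an illustration, not a real interview question. The table names (`daily_sales`, `employees`), the sample rows, and the function name `run_examples` are all made up, and it assumes Python's built-in `sqlite3` is linked against SQLite 3.25 or newer, which is when window functions were added.

```python
import sqlite3


def run_examples():
    """Run a window-frame query and a recursive CTE on made-up sample data."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables and rows, used only for practice.
    conn.executescript("""
        CREATE TABLE daily_sales (day INTEGER, amount INTEGER);
        INSERT INTO daily_sales VALUES (1, 10), (2, 20), (3, 30), (4, 40);

        CREATE TABLE employees (id INTEGER, name TEXT, manager_id INTEGER);
        INSERT INTO employees VALUES
            (1, 'Ana', NULL), (2, 'Ben', 1), (3, 'Cy', 2);
    """)

    # Window frame: sum of the previous row, the current row, and the next row.
    moving_sums = conn.execute("""
        SELECT day,
               SUM(amount) OVER (
                   ORDER BY day
                   ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
               ) AS window_sum
        FROM daily_sales
        ORDER BY day
    """).fetchall()

    # Recursive CTE: walk the reporting chain down from the top-level manager.
    hierarchy = conn.execute("""
        WITH RECURSIVE chain(id, name, depth) AS (
            SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
            UNION ALL
            SELECT e.id, e.name, c.depth + 1
            FROM employees AS e
            JOIN chain AS c ON e.manager_id = c.id
        )
        SELECT name, depth FROM chain ORDER BY depth
    """).fetchall()

    conn.close()
    return moving_sums, hierarchy


if __name__ == "__main__":
    sums, tree = run_examples()
    print(sums)  # [(1, 30), (2, 60), (3, 90), (4, 70)]
    print(tree)  # [('Ana', 0), ('Ben', 1), ('Cy', 2)]
```

Try changing the frame to `ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW` to get a running total, or add a second branch to the hierarchy, and predict the output before running it. That is roughly the kind of reasoning these questions tend to probe.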