Any advice on how to go from being a data scientist to a data engineer?

Question

I’m no longer enjoying being a data scientist, but I’m excited to learn more about data engineering. What would improve my chances of getting hired, given that I’m currently a data scientist? #dataengineer

datageekbd · Answer

Focus on these few things and you will be in a good place for DE interviews.

1) SQL, especially DML, less so DDL. Start with the basics and work up to complex window functions with frames over PRECEDING and FOLLOWING rows. Also practice recursive CTEs. Those are more edge-case questions, but there is no harm in prepping for them.
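
To make point 1 concrete, here is a minimal sketch of both ideas, run through Python's built-in sqlite3 module so it needs no database setup. The tables (daily_sales, employees) and their rows are made up for illustration, and it assumes your Python ships with SQLite 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample tables, invented for this example
    CREATE TABLE daily_sales (day INTEGER, amount INTEGER);
    INSERT INTO daily_sales VALUES (1, 10), (2, 20), (3, 30), (4, 40);

    CREATE TABLE employees (id INTEGER, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES (1, 'Ana', NULL), (2, 'Ben', 1), (3, 'Cy', 2);
""")

# Window function with an explicit frame: each row sums the previous,
# current, and next day's amount.
moving = conn.execute("""
    SELECT day,
           SUM(amount) OVER (
               ORDER BY day
               ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
           ) AS window_sum
    FROM daily_sales
    ORDER BY day
""").fetchall()

# Recursive CTE: start at the employee with no manager, then repeatedly
# join in direct reports, tracking how deep each person sits.
hierarchy = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, c.depth + 1
        FROM employees e
        JOIN chain c ON e.manager_id = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth
""").fetchall()

print(moving)     # [(1, 30), (2, 60), (3, 90), (4, 70)]
print(hierarchy)  # [('Ana', 0), ('Ben', 1), ('Cy', 2)]
```

The same queries should carry over to most warehouse SQL dialects with small syntax differences, but check the dialect you are interviewing in.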

2) Python is tricky, and what gets asked varies a lot by interviewer. At places like Meta, TikTok, and Amazon, a lot of data engineering is done in SQL, so the Python portion will be light. This is especially true for product data engineering.

But mid-size companies, startups, and more code-heavy roles might want you to know Big O and data structures and techniques like linked lists, stacks, queues, and recursion. Don't spend too much time on this extra part, though: it takes time away from more important topics, and most likely you won't be asked about it.
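
For a feel of the level these questions usually sit at, here is a rough sketch of two classic warm-ups: a stack used to check balanced brackets, and reversing a singly linked list. The function and class names are my own, not from any particular interview.

```python
def is_balanced(text: str) -> bool:
    """Check bracket balance with a stack. O(n) time, O(n) extra space."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in text:
        if ch in pairs.values():
            stack.append(ch)              # opening bracket: push
        elif ch in pairs:
            # closing bracket: must match the most recent opener
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack                      # leftovers mean unclosed brackets


class Node:
    """One node of a singly linked list."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def reverse(head):
    """Reverse a linked list in place. O(n) time, O(1) extra space."""
    prev = None
    while head is not None:
        nxt = head.next    # remember the rest of the list
        head.next = prev   # point the current node backwards
        prev = head
        head = nxt
    return prev


def to_list(head):
    """Collect node values into a Python list for easy inspection."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


print(is_balanced("f(a[0]) + {b}"))                    # True
print(is_balanced("(]"))                               # False
print(to_list(reverse(Node(1, Node(2, Node(3))))))     # [3, 2, 1]
```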

3) Data modeling, the bread and butter of any DE. Know dimensional modeling. Read chapters 1 and 2 of Kimball's The Data Warehouse Toolkit (3rd edition) if possible. If not, at least learn what dimensional data modeling, star schemas, and snowflake schemas are, along with their pros and cons. Look for data modeling examples on YouTube.
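
If a star schema is new to you, this small sketch may help: one fact table holding measures, surrounded by dimension tables holding descriptive attributes. The table names, columns, and rows are invented for illustration, and sqlite3 is used only so the example runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes, one row per entity
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY, name TEXT, category TEXT
    );
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT
    );

    -- Fact table: numeric measures plus a foreign key to each dimension
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        date_key    INTEGER REFERENCES dim_date(date_key),
        quantity    INTEGER,
        revenue     REAL
    );

    INSERT INTO dim_product VALUES
        (1, 'Laptop', 'Electronics'),
        (2, 'Phone',  'Electronics'),
        (3, 'Desk',   'Furniture');
    INSERT INTO dim_date VALUES
        (20240101, '2024-01-01', '2024-01'),
        (20240201, '2024-02-01', '2024-02');
    INSERT INTO fact_sales VALUES
        (1, 20240101, 2, 2000.0),
        (2, 20240101, 1,  500.0),
        (3, 20240201, 4,  800.0),
        (1, 20240201, 1, 1000.0);
""")

# Typical star-schema query: join the fact to its dimensions, then
# aggregate the measure by dimension attributes.
revenue_by_category = conn.execute("""
    SELECT p.category, d.month, SUM(f.revenue) AS revenue
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_date d    ON d.date_key = f.date_key
    GROUP BY p.category, d.month
    ORDER BY p.category, d.month
""").fetchall()

print(revenue_by_category)
# [('Electronics', '2024-01', 2500.0), ('Electronics', '2024-02', 1000.0),
#  ('Furniture', '2024-02', 800.0)]
```

A snowflake schema would normalize further, for example by moving category out of dim_product into its own table. That reduces redundancy at the cost of one more join per query.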

4) ETL pipeline design: What is a batch process, and what is a stream process? How do you design a batch ETL job, and how do you design a stream ETL job? What tools would you use in which situations? What are ETL, ELT, EtLT, and CDC? Also know the difference between SQL and NoSQL databases and when to use each, plus the different file formats like Parquet, ORC, etc.
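
CDC (change data capture) tends to be the least familiar term in that list, so here is a toy sketch of the core idea: instead of reloading a whole table, you replay a log of insert/update/delete events against the target. The event shape (op, id, row) is an assumption made up for this example; real CDC tools each define their own event formats.

```python
def apply_changes(target: dict, events: list) -> dict:
    """Replay change events, in order, against a target keyed by primary key."""
    for event in events:
        op, key = event["op"], event["id"]
        if op in ("insert", "update"):
            target[key] = event["row"]    # upsert the latest row image
        elif op == "delete":
            target.pop(key, None)         # tolerate deletes of missing keys
        else:
            raise ValueError(f"unknown operation: {op}")
    return target


# Current state of the target table (hypothetical data)
target = {1: {"name": "Ana"}, 2: {"name": "Ben"}}

# Changes captured from the source since the last sync
events = [
    {"op": "update", "id": 1, "row": {"name": "Ana M."}},
    {"op": "delete", "id": 2},
    {"op": "insert", "id": 3, "row": {"name": "Cy"}},
]

result = apply_changes(target, events)
print(result)  # {1: {'name': 'Ana M.'}, 3: {'name': 'Cy'}}
```

A batch job would apply a whole file of such events on a schedule, while a streaming job would apply them one at a time as they arrive. The merge logic is the same in both cases.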

5) Data storage: how do you manage it, and how do you use indexes, partitions, data distribution, data storage types, compression, etc.?
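
As a rough illustration of two of those ideas, the sketch below partitions rows by date, so a query for one day only touches that day's rows, and distributes rows across buckets by hashing a key. The row fields and function names are hypothetical, and real engines do this at the file or node level rather than in Python dictionaries.

```python
import zlib
from collections import defaultdict


def partition_by_date(rows):
    """Group rows by event_date, like date-partitioned folders in a data lake."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["event_date"]].append(row)
    return partitions


def read_partition(partitions, event_date):
    """Partition pruning: read only the requested date, skip everything else."""
    return partitions.get(event_date, [])


def bucket_for(user_id: str, num_buckets: int) -> int:
    """Hash distribution: the same key always maps to the same bucket."""
    return zlib.crc32(user_id.encode("utf-8")) % num_buckets


rows = [
    {"event_date": "2024-01-01", "user_id": "u1"},
    {"event_date": "2024-01-01", "user_id": "u2"},
    {"event_date": "2024-01-02", "user_id": "u1"},
]

partitions = partition_by_date(rows)
jan_first = read_partition(partitions, "2024-01-01")
print(len(jan_first))  # 2

# A stable hash keeps all rows for one user in one bucket, which is what
# lets joins and aggregations on user_id avoid moving data around.
print(bucket_for("u1", 4) == bucket_for("u1", 4))  # True
```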

6) This is not mandatory, but it is good to know: how the Hadoop architecture works, what HDFS, Hive, MapReduce, and Spark are, how the Spark architecture works, what YARN queues are, and how everything fits together in the Hadoop ecosystem. It will be valuable.
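
If MapReduce is just a name to you, this single-machine sketch of the classic word count shows its three phases. It only mimics the programming model; the function names are my own, and a real Hadoop or Spark job runs the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line: str):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


lines = ["spark runs on yarn", "hive runs on hadoop"]

# Map every line, flatten the pairs, group them by word, then reduce.
mapped = [pair for line in lines for pair in map_phase(line)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

print(word_counts["runs"])   # 2
print(word_counts["spark"])  # 1
```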

7) Product sense. As a data scientist, you probably already know this.
