Tech Industry Aug 23, 2019
Hotwire Victor333

Career prospects for Data Engineering?

I find data engineering pretty interesting and fun. You don't need to write complex distributed services or web apps; mostly you make use of existing data infrastructure and systems to build pipelines. However, it seems to be a relatively new field, and maybe the market pay is also lower than for backend engineers or other software engineering roles (Data Scientist, Machine Learning Engineer, etc.). Is data engineering looked down upon in comparison to other SWE roles? What are the career prospects?

LinkedIn kVBR87 Aug 23, 2019

From an outsider's view, I feel it is becoming high in demand. At least 1-2 years ago, I feel people were deluded by the DS bubble, and every company, regardless of how bad its data pipeline was, felt it needed a data scientist. But more and more are realizing that before they can even start thinking about hiring DS or analytics, they need to consolidate their data pipeline. The pay is hard to say, though.

Hotwire Victor333 OP Aug 23, 2019

Yeah, the role seems to be getting popular, but I'm not sure it offers growth too. Once you build a pipeline, that's it. There's not much more to do than build and maintain more and more pipelines.

LinkedIn kVBR87 Aug 23, 2019

That’s very true. The growth potential is a bit uncertain or even on the negative end.

Snapchat jduegwozhf Aug 23, 2019

Snap stopped hiring data engineers and requires Data Scientists to build pipelines using platforms that Data Platform SWEs build.

Hotwire Victor333 OP Aug 23, 2019

Why? Did they let the data engineers go, or ask them to automate their jobs by building the platform?

LangeSohne Aug 23, 2019

They probably did it because the cost of the two distinct roles didn't justify the amount of work each was doing individually. Frankly, this is happening more and more often. A lot of new data scientists think they'll be primarily working on statistics and modeling, but most of their time is usually spent working on data pipeline stuff and cleaning/normalizing data for others.

Humana edjuh Aug 23, 2019

Data Engineering isn’t new. Data has needed to be integrated for as long as data has been created. There are just a lot of cool new technologies that are overkill for the regular use cases, which haven’t changed much.

This comment was deleted by the original commenter.
Hotwire Victor333 OP Aug 23, 2019

Why?

LangeSohne Aug 23, 2019

Because a data scientist is (contrary to popular belief) typically not working on hard statistical modeling that will directly inform product development. They manipulate and massage data and they also (increasingly) work on maintaining and tweaking the data pipeline. But they're strictly less versatile than most software engineers at a large tech company. They're not working on machine learning models or doing research, or doing anything much more advanced than timeseries analysis and linear regression. So they end up being more of a cost center than generic software engineers and have less growth potential overall. This is why (for example) most teams of data scientists at Facebook are paid less than software engineering teams at the same level.

SolarWinds AnEngineer Aug 23, 2019

In some places a DE just works with SQL and some Python and is closer to the Data Science team. At other places (such as my current employer) the Data Engineers are really distributed systems engineers, and the position is for the most senior SWEs (with more junior folks working on the front and back ends of the web apps). It's really not a well-defined title.

Hotwire Victor333 OP Aug 23, 2019

Yeah, the role can vary. Concurrent processing on distributed systems at scale with robust fault tolerance is perhaps the juice of it.

Salesforce sdes Aug 23, 2019

I work in data engineering. The real skill is building a platform that scales and is highly metadata-driven. For example, I have built platforms with Java, Python, Hive, Oozie, Spark and Postgres. We designed each component so that it can be generalized across all pipelines. Once the platform is built, it’s all about writing ETL scripts and workflows (Oozie, Airflow, etc.).
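
To make "metadata-driven" a bit more concrete, here is a rough sketch in plain Python of what the idea can look like. It is only an illustration, not how any of the platforms mentioned above actually work: the step names, the spec layout, and `run_pipeline` are all made up for the example. The point is that each pipeline is just data (a spec), and the generic steps are written once and reused.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[List[Row], dict], List[Row]]


def filter_rows(rows: List[Row], params: dict) -> List[Row]:
    """Keep only rows whose `column` equals the `equals` value."""
    column, value = params["column"], params["equals"]
    return [r for r in rows if r.get(column) == value]


def rename_column(rows: List[Row], params: dict) -> List[Row]:
    """Rename one column, leaving all other columns untouched."""
    old, new = params["from"], params["to"]
    return [{(new if k == old else k): v for k, v in r.items()} for r in rows]


def select_columns(rows: List[Row], params: dict) -> List[Row]:
    """Project each row down to the listed columns."""
    cols = params["columns"]
    return [{c: r.get(c) for c in cols} for r in rows]


# Hypothetical registry: generic components are written once, keyed by name.
STEP_REGISTRY: Dict[str, Step] = {
    "filter": filter_rows,
    "rename": rename_column,
    "select": select_columns,
}


def run_pipeline(spec: dict, rows: List[Row]) -> List[Row]:
    """Apply the steps listed in the spec, in order, to the input rows."""
    for step in spec["steps"]:
        handler = STEP_REGISTRY[step["type"]]  # KeyError on an unknown step type
        rows = handler(rows, step.get("params", {}))
    return rows


# A pipeline is only metadata; in practice this might live in YAML or a table.
ORDERS_PIPELINE = {
    "name": "us_orders",
    "steps": [
        {"type": "filter", "params": {"column": "country", "equals": "US"}},
        {"type": "rename", "params": {"from": "amt", "to": "amount"}},
        {"type": "select", "params": {"columns": ["order_id", "amount"]}},
    ],
}

SAMPLE_ROWS = [
    {"order_id": 1, "country": "US", "amt": 10.0},
    {"order_id": 2, "country": "DE", "amt": 20.0},
    {"order_id": 3, "country": "US", "amt": 5.5},
]

if __name__ == "__main__":
    for row in run_pipeline(ORDERS_PIPELINE, SAMPLE_ROWS):
        print(row)
```

Adding a new pipeline then means writing another spec rather than new code, which is roughly the "generalized for all pipelines" part. A real platform would swap the in-memory lists for Spark or Hive jobs and hand scheduling to something like Oozie or Airflow.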