So what should I learn? I started my career as an Informatica developer 2 years ago and have been working on different contracts, mostly around Informatica Cloud, SSIS, SSRS and T-SQL. I want to move deeper into data engineering and I have no clue what to learn. Could someone help out with ideas? I have heard people suggest learning GCP or AWS. What would you guys suggest?
First off, smoke a ton of weed. The best advances in parallel and distributed computing have come from people when they're at their most baked. If you can't do that for religious or personal reasons, data engineering probably isn't going to be the right path for you. Second off, dig into Spark and some other common ETL frameworks and get a feel for writing programs on EMR and on virtual instances. You can do really small, low-budget stuff if you keep your data sizes tiny, but use a couple of machines at least. Do a few parallelized ML algorithms and consider reading the Spark data engineer book. Visualize some of the stuff you do. Build a tiny GitHub resume of Spark apps made for funsies. After that, you should just focus on getting your piss clear from the weed so you can try your hand at a couple of interviews.
Learn a programming language; that's needed for post-cloud ETL.
Learn a programming language and try to implement the ETL modules yourself. Then try a different architecture like ELT or streaming, and look into dimensional modeling, analytical work, and data visualization. The field is massive.
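To make "implement the ETL modules yourself" a bit more concrete, here is a rough sketch of what a hand-rolled extract/transform/load could look like in plain Python with only the standard library. Everything in it is made up for illustration: the `RAW_CSV` sample data, the `orders` table, and the function names are assumptions, not anything from a real project, and an actual pipeline would read from files or an API and write to a proper warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would come from a file or an API.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,alice,4.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and cast column types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"], float(row["amount"]))
        )
    return cleaned


def load(conn, records):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline():
    """Run extract -> transform -> load, then query a per-customer total."""
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing on disk
    load(conn, transform(extract(RAW_CSV)))
    query = (
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping the three stages as separate functions is just one way to do it, but it makes it easy to swap pieces later, for example loading the raw rows first and transforming them in SQL if you want to try ELT instead.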