Just curious - why are they asking LeetCode questions for a non-SWE data engineering role? I don’t think I’ve ever used a linked list to build a data pipeline, nor do I play with stacks. Any insight? I’m also looking for some prep material. Thanks!
There are people who think data engineering is statistics, so they ask data science and stats questions. Then there are people who think it is engineering, so they ask SWE questions. Then there are companies who think it is functional programming, so they ask category theory. It’s funny how 90% of data engineering is actually SQL. Just plain, simple SQL. Nothing else.
Yeah, I agree. I mean, I’m fine with learning some of the concepts, but it’s stupid to do it just for the interview haha. I can do plenty of stuff with SQL, Python, and Bash. Those cover 99% of ETL stuff (and yeah, SQL is most of it). I just don’t know of any actual ETL/reporting pipeline work that uses stacks or linked lists or anything like that. I understand it from the SWE angle, of course.
Yeah, I think there are a few common problems where you need a little more knowledge. One is distributed systems, where a problem statement gets made overly complicated and then needs a ton of unnecessary shit to solve it. Fuzzy matching, entity resolution, nearest-location lookups in geospatial data, and some other weird cross-join scenarios are the only ones I consider DE where SQL is not always enough. But I have never seen an interview ask about these. Most companies ask weird shit in interviews to raise the hiring bar. You have to play by the rules, because that’s the only thing that exists.
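For anyone wondering what the fuzzy matching case looks like once plain SQL runs out, here is a minimal sketch using only Python's standard-library `difflib`. The function names, the sample company names, and the 0.8 threshold are all made up for illustration. It compares every pair, so it is effectively the cross join mentioned above and will not scale without some kind of blocking step first.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def fuzzy_match(left, right, threshold=0.8):
    """Pair each name in `left` with its best-scoring name in `right`.

    A pair is kept only if its score reaches `threshold` (an arbitrary,
    illustrative cutoff). This is O(len(left) * len(right)).
    """
    matches = []
    for name in left:
        # Pick the candidate with the highest similarity to this name.
        best = max(right, key=lambda candidate: similarity(name, candidate), default=None)
        if best is not None and similarity(name, best) >= threshold:
            matches.append((name, best))
    return matches


if __name__ == "__main__":
    # Hypothetical vendor names from two source systems.
    print(fuzzy_match(["Acme Corp.", "Globex Inc"], ["ACME Corp", "Initech"]))
```

In a real pipeline you would probably reach for a dedicated entity-resolution library or a database extension, but the shape of the problem is the same: score candidate pairs, then keep the ones above a cutoff.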
They’re all LC-easy from what I remember. I took an Amazon DE position over FB because Amazon’s interview was more DE-centric, so I thought this place had its shit more put together. A mistake, in hindsight.
Why a mistake?
The internal tooling here is atrocious. AWS services are fine, but all the other homegrown stuff (including the primary data lake and ETL tool) is non-performant. You’re stuck with it though, as most teams will ONLY share data that way. A lot of days you’re working around tool and service availability, and it frankly just feels like shit. The silver lining is that I get to play with AWS services on the side, which obviously is great for when I transfer out in a few weeks.