FB Data Engineering Interview

Dec 10, 2019 8 Comments

Just curious - why are they asking Leetcode questions for a non-SWE data engineering role?

I don’t think I’ve ever used a linked list when building a data pipeline, nor do I work with stacks.

Any insight? Looking for some prep material.



TOP 8 Comments
  • Spotify nCnR52
    There are people who think data engineering is statistics and ask data science and stats questions.
    Then there are people who think it is engineering. So they ask swe questions.
    Then there are companies who think this is functional programming, so they ask category theory.

    It’s funny how 90% of data engineering is actually SQL. Just plain, simple SQL. Nothing else.
    Dec 10, 2019 2
    • Cisco fieiwnsow
      Yeah, I agree. I mean, I’m fine with learning some of the concepts, but it’s stupid to do it just for the interview, haha. I can do plenty of stuff with SQL, Python, and Bash. That covers 99% of ETL stuff (and yeah, SQL is most of it).

      I just don’t know of any actual work that uses stacks or linked lists or anything like that from an ETL/reporting pipeline standpoint. I understand it from the SWE angle, of course.
      Dec 10, 2019
    • Spotify nCnR52
      Yeah, I think there are a few common problems where you need a little more info.
      Distributed systems is one, though that often means making a problem statement overly complicated and then needing a ton of unnecessary shit to solve it.

      Fuzzy matching, entity resolution, nearest-location lookups in geospatial data, and some other weird cross-join scenarios are the only ones I consider DE where SQL is not always enough. But I have never seen an interview ask about these.

      Most companies ask weird shit in interviews to raise the hiring bar. You have to play by the rules, because that’s the only thing that exists.
      Dec 10, 2019
  • Amazon mcQw31
    They’re all LC-easy from what I remember. I took an Amazon DE position over FB because Amazon’s interview was more DE-centric, so I thought this place had its shit more together. A mistake in hindsight.
    Dec 10, 2019 2
    • Axtria BezoKaBaap
      Why a mistake?
      Dec 10, 2019
    • Amazon mcQw31
      The internal tooling here is atrocious. AWS services are fine, but all the other homegrown stuff (including the primary data lake and ETL tool) is non-performant. You’re stuck with it though, as most teams will ONLY share data that way. A lot of days you’re working around tool and service availability, and it frankly just feels like shit. The silver lining is that I get to play with AWS services on the side, which obviously is great for when I transfer out in a few weeks.
      Dec 10, 2019
  • Spotify nCnR52
    So yeah, to conclude: I do understand the need to know graphs and other algorithms for DE, but you can’t ask about all of them, so you end up asking about linked lists. But then, few DEs are actually SWEs.
    Dec 10, 2019 1
    • Philips new098754
      OP can you refer me at Spotify?
      Dec 11, 2019

