Back in the old days, Oracle and Microsoft were the only major players in database tech. I'm interested to know which data-engineering/data-management tech to focus on now, especially with the surge of newer players that have made their mark, like MongoDB, Snowflake, Hadoop, AWS, GCP, etc. Please share your thoughts, and also which other tech to adopt to complement it. Experience: 10+ years
Google Cloud Dataflow is widely used at Google for MapReduce-style (MR) pipelines. Its open-source API is Apache Beam. Before joining Google I had no idea it existed, since I was focused on AWS offerings.
Python, SQL, Airflow, and Spark. However, these things change every few years.
ClickHouse
I think that, more than any individual piece of technology, it’s important to understand the ideas behind these systems and the trade-offs each system has made relative to the others. If you have a good understanding of these concepts, then picking up a particular implementation is not much harder.