There are many open source and paid tools out there to get the job done. A little bit of wrapper code here and there should suffice, as long as the right tools are chosen and properly architected for the use cases. I get it, big tech approaches infrastructure as a software engineering problem: they want to build their own tools, build their own platforms, and some even build their own cloud. They re-invent the wheel to implement some feature or to speed something up, only to have to keep maintaining that tool. Then they build another tool, and another, and yet another, re-inventing the wheel in the name of "we are SWE-SRE and we take a product approach." Who will maintain all these tools? What is the lifespan of these tools? What do they cost? Imagine an engineering team, or a group of engineering teams, building tools to reduce or measure costs, but unable to build a tool to measure the cost of re-inventing the wheel, or the cost of building all these tools and then failing to maintain or manage them. If you work in SRE at Uber, DoorDash, Twitter (probably reduced close to zero now), Dropbox, or any other FAANG-like/wannabe company, would you mind sharing your thoughts on decision making in your SRE team or org? Do you agree with the decisions they make? Do you like the approach they use to solve infrastructure engineering problems? Comment to discuss; it is OK to agree to disagree. TC $280k YOE 12 #uber #doordash #twitter #dropbox #sre
Not SRE, but from my experience: new tools are built at Big Tech because existing open source tools cannot work at the scale these companies operate at, and we need to replace them often because existing tools do not keep up with the rate of growth of these companies.
This re-inventing-the-wheel approach is why engineering teams are way overstaffed: it costs so much money to run and maintain this culture. Just take a look at Twitter recently. This is why 75% of Twitter could be cut and they can still run the company. I would bet money they will end the re-inventing-the-wheel culture right away in order to keep running with such a low headcount. Only tools that make sense and need to be built will get built, instead of crap that never even makes it to production or ends up only half used.
The Twitter example you've cited is a bad one. No one needs the fire department until a fire breaks out. The same goes for Twitter.
SRE teams at Google are the worst when it comes to aligning on a single decision. Even if the decision was made directly by the CEO, it does not matter. SRE teams take pride in questioning. 90% of the time the question is naive, lacks context, and only demonstrates individual preference, but it is part of the culture and very well received by tenured SREs. New SREs are almost always caught off guard and feel threatened. To your second part about Infrastructure as Code, it is important to understand that Google/Facebook/Microsoft pay their employees a good amount because they consider humans their raw material. Most banks and other companies that rely on tools are making the software vendors rich while paying peanuts to their employees. Just look at the license cost of Splunk and you will know why you are paid so little. There is no quality SRE tool stack in the open source world. Google SRE has a list of 50+ tools. Building the whole stack from open source tools alone is going to create a horrible mess of disjointed components that will need a full-fledged SRE team of its own.
Have you thought about the cost of maintaining this culture? Has it ever occurred to you to measure the cost? How many of these tools have never been used? How many really make it to production? How many have been used in production for more than 2 years?
All of them. At any time, SRE teams at Google use more than 50 tools in production. Only a few of them have an open source counterpart. The vast majority of them are not even known outside. An SRE's core existence relies on optimising systems. Systems are continuously built and deprecated. If you don't do this, you are saying Tool A is the utopia of function A and nothing could be better. With continuous development of new ideas, one out of 10 will be groundbreaking and change the world.
The scale of the problem leads to a whole new set of problems that others could hardly imagine. Security and privacy requirements for infra tools are much stricter, especially at FAANGs. Not reinventing the wheel only makes sense in a limited context. With microservices, for example, it just increases the amount of coupling and could create a SPOF if you are not careful.
I get re-inventing the wheel when there is a real need for it and a plan for the future of that tool in terms of maintainability. The issue is when there are so many tools, and the cost to manage and maintain them becomes so high, that the purpose the tool was built for becomes useless and outdated. A tool is only as good as its use in real life and the maintenance and care it gets while it is in use.
The biggest problem I’ve seen with buying instead of building is that, a lot of the time, you’re still paying an SRE to maintain the tool you bought, configure it, etc. So you either have a product you don’t like and can’t control or, for the same cost, a product you can control and that, when you hire top-end talent, should in theory turn out pretty good in most cases.
For the same cost you build the tool? So you think you can build a logging platform for less money than it costs to buy Splunk Enterprise? Just one example: Splunk Enterprise is managed by Splunk, and they worry about maintaining it. Imagine how many tools and services an engineering team will build to provide logging. They want to build new tools for every component of log searching. They may even re-invent a new time-series database, re-invent a new blob store for long-term object storage, and re-invent a better search algorithm. Imagine how many engineers will need to be paid $500k per year for this, and how they will need to maintain and manage all of it. Do you calculate the cost of this culture at all???
Do I really need every single element of Splunk and every integration possible? Or just some of it? They may maintain it, but who’s going to integrate it and maintain the integration? That’s a big cost. Who will manage the relationship with Splunk? And will it be just one person? What happens when Splunk jacks up the price because otherwise you’ve got nothing (as happens to many companies)? What happens when you have no in-house knowledge of Splunk and they tell you that you need product X when you don’t? What happens when something core to your business doesn’t integrate with Splunk and Splunk says they don’t give a fuck? And if Splunk is maintaining it, they’re also paying many engineers 500k to manage it.
??
Reinventing the wheel is a core part of building a business. How else would innovations happen? Honestly, to me this is a dumb question. Explain how you’re gonna get someone to pay for something different and more expensive. "Yeah, well, sir, my car has wooden spoke wheels. Those newfangled rubber tires are just a meme."
True, innovations happen from re-inventing the wheel, not going to disagree at all. Golang came out of Google, React out of Facebook, and so on. I get it, but like I said, it starts to get abused when everyone wants to build some new service rather than add to an existing one. What I am focusing on is the abuse and the bad decision making around it. I mean, when these companies do mass layoffs and finally wake up to reality now that the Fed isn't printing money into oblivion anymore, are we really going to get mad at them?
It just happens organically once systems start being layered on top of each other. You have a monitoring system that produces too many metrics for an off-the-shelf TSDB to deal with, so you write your own. Now you write a spiffy alerting system for it, now a notification system, etc. Nothing commercial would tie into any of it without heavy and expensive customization. The only downside I see is promo-based design, where people don't adopt other existing tooling inside the company because it doesn't quite fit the use case, and contributing code to the existing project isn't as sexy as making a new tool. If your company can't afford to write internal tooling, go ahead and use open source or commercial. If your scale and integration needs aren't high, it will likely work out.
It's all a function of your revenue and profits. There are no hard engineering rules or practices for this that apply to all companies at all stages.
When companies get bigger this usually happens. Anyone at a big company knows about the waste of money and the work on projects that never see the light of day. All of that could instead go toward what is actually needed and toward improving what currently exists. I have been there, and we have all been there. But like I said, as 2023 and later years come and times get tougher, companies will cut back this waste culture, as Twitter did. And then what next? Are we going to go back to it when the market booms again?
The only sure waste in the Twitter deal was paying $44B for it. The other decisions are yet to show their results.
It depends on each individual use case. We do use external or open source tools wherever necessary. But there comes a point where building and maintaining the wrapper becomes as big a task as building and maintaining a new service.
Agreed, but people need to be conscious of it. When that time comes, there should be guard rails to make sure it is not abused by senior engineers who want to make staff/principal and have to be biased towards re-inventing so they can get that promotion or perform at their level. I think there should be less pressure on staff/principal engineers to always be re-inventing. Achieving 4 and 5 9s should be praised too. Reducing downtime to near zero, reducing how many builds fail in pipelines, and things like that should be praised too, not just building a new distributed key-value store in the name of "we need one specific feature." DoorDash, from what I have seen on their engineering blog, is doing OK, but I am afraid that as the team gets bigger they may start to abuse it like others do.
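For a sense of what "4 and 5 9s" means as a downtime budget, here is a rough Python sketch. It assumes a 365-day year, and the function name is made up for illustration, not taken from any tool mentioned in this thread.

```python
# Yearly downtime budget for a given number of "nines" of availability.
# Assumption: a 365-day year. The function name is illustrative only.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds


def allowed_downtime_seconds(nines: int) -> float:
    """Return the allowed downtime per year, in seconds, for `nines` nines."""
    if nines < 1:
        raise ValueError("nines must be >= 1")
    unavailability = 10 ** (-nines)  # e.g. 4 nines (99.99%) -> 0.0001
    return SECONDS_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (3, 4, 5):
        minutes = allowed_downtime_seconds(n) / 60
        print(f"{n} nines: about {minutes:.1f} minutes of downtime per year")
```

Under that assumption, four nines leaves under an hour of downtime per year and five nines leaves only a few minutes, which is why hitting those targets is real engineering work in its own right.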
I think this also has to do with gaining better control over the tools being used. Imagine using a tool like CircleCI that is working just fine, but the vendor doesn't want to innovate further. Asking them to add features or provide upgrades is a pain. I remember we were migrating off a third-party tool whose vendor had decided to shut it down 😜 because they had pivoted their business model. Especially for a FAANG like Amazon, it won't make sense to use a tool developed by a company that runs on AWS, because the outside world pays the real AWS costs, and the vendor's subscription charges reflect those costs. Internally, AWS is subsidised by 50-80 percent. So for this reason, plus control over maintenance, it would make sense to build a tool instead of using one that is available.
I don't have much of a problem with building tools from scratch, but I think it should be a last resort and not the first go-to. The issue is that this re-inventing-the-wheel culture promotes it as the first approach. They even promote "don't be afraid to take risks," which in practice means everyone needs to ship something. And then what happens? Too much stuff to maintain, rather than staying focused on managing the core tools that are actually being used.