https://engineering.instagram.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172 They disabled GC (via horrible hacks in Python) to slightly increase performance, and when a process runs out of memory it just crashes and they spin up a new one... Maybe I'm too old, or don't know Python at all, but this looks like a piece of shit, not an engineering solution; some ad hoc hack. And the way they did it was via blind production experiments... I would expect something like this from a student who just started studying CS, but at FB, for the whole of Instagram... and they published it on their blog... I don't get it... is this really OK?
Disagree. Skimmed the article. Sounds like they found an inefficiency that causes copy-on-read in a multi-process environment (root-caused to GC) and were able to avoid it, generating a 10% increase in performance. That is a huge increase, especially when you consider the scale, and thus the cost savings. There's no real downside to this, because it just means workers die after some time and are replaced; it's therefore quite clean at the higher level.

Disabling GC is not uncommon. In transaction-based systems where throughput is important and/or you can't afford random delays, a GC pause is a really bad thing. A good edge example is trading systems (very low latency). While typically written in C++ (i.e. no GC, full control), sometimes they are written in Java. By writing the code in such a way that you rarely need GC, and then disabling it, you gain a lot (random delays can be very costly!). Also, replacing a process with a fresh instance rather than taking the GC hit can be more efficient, and is very cheap in a stateless system.

Of course you wouldn't do this in a run-of-the-mill system, but it is absolutely a good solution in some cases.
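For reference, the pattern being discussed can be sketched in a few lines. This is a hedged sketch, not Instagram's actual code (the `prepare_for_fork` name is made up), though `gc.collect`, `gc.set_threshold`, and `gc.freeze` are real stdlib calls:

```python
import gc

def prepare_for_fork():
    """Finish startup, then stop the cyclic collector so its bookkeeping
    no longer dirties copy-on-write pages shared with forked workers.
    (Illustrative helper; the function name is hypothetical.)"""
    gc.collect()          # collect any existing cycles once, up front
    gc.set_threshold(0)   # threshold 0 disables automatic cyclic GC
    if hasattr(gc, "freeze"):
        gc.freeze()       # Python 3.7+: park survivors in a permanent
                          # generation so manual collections skip them too

prepare_for_fork()
# ...now fork worker processes; they inherit a GC-quiet heap.
```

After this runs, plain reference counting still frees objects; only the cycle detector is off, which is why leaked cycles accumulate until the worker is recycled.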
Smart. It's a little-known but smart trick in languages with garbage collection, and a basic principle in biology and systems engineering, also known as the "Ship of Theseus": some of the most stable systems are built from parts that replace themselves when they fail. Biological organisms work that way too; your cells get replaced when they fail. That's the beauty of a RESTful web server, it can work the same way as a human body. The processes that serve requests die whenever, but the server appears to live on.
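The recycling idea above is what worker supervisors already do; gunicorn, for instance, exposes a `--max-requests` option for exactly this. A minimal sketch of the pattern (the `worker_loop` and `handle_request` names here are illustrative, not any real server's API):

```python
import sys

def worker_loop(handle_request, max_requests=1000):
    """Serve a bounded number of requests, then exit. Slow leaks and
    uncollected cycles never accumulate past the budget, because the
    supervisor forks a fresh worker to replace this one."""
    for _ in range(max_requests):
        handle_request()
    sys.exit(0)  # die clean; the master process spawns a replacement
```

The state that matters lives outside the worker (database, cache), so killing and replacing a process is cheap and invisible to clients.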
For me it was funny that they actually published that they "googled" for answers.
While this is a huge win for IG, shit like this optimizes the tech into local maxima. It's also very common that things like this happen without any long-term vision for how to address the core problem: Python is slow.
Other Python-based companies have done this too.
Oh yeah, this is crap. You should never do it or save anything. Just use the 10-or-whatever percent for what it was meant for and it's all good. 😀
Long live the process-per-request model
I was thinking the same thing.
They disabled GC, not all memory management. Reference counting still runs; if you don't create cycles in the object graph, you're fine without GC in Python.
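This distinction is easy to demonstrate in CPython (a small sketch; `Node` is just a throwaway class for illustration). Acyclic objects are freed immediately by refcounting even with the collector off; only a cycle stays pinned until `gc.collect()` runs:

```python
import gc
import weakref

class Node:
    pass

gc.disable()  # turn off the cyclic collector, not refcounting

# Acyclic object: freed the instant the last reference dies.
a = Node()
ref_a = weakref.ref(a)
del a
assert ref_a() is None          # reclaimed by refcounting alone

# Cyclic pair: refcounts never reach zero, so with GC off it leaks
# until a manual collection (or the process exits and is replaced).
b, c = Node(), Node()
b.other, c.other = c, b
ref_b = weakref.ref(b)
del b, c
assert ref_b() is not None      # still alive: the cycle pins it
gc.collect()
assert ref_b() is None          # the collector breaks the cycle
gc.enable()
```

Which is exactly why the "no cycles and you're fine" claim holds, and why recycling workers is the backstop for the cycles that slip through.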
I agree, this sounds stupid as hell.
No it's not. Read the comments below. This architecture forces you to write fault-tolerant code that withstands crashed processes. Think of ACID principles applied to request processing.
Why not just write fault-tolerant code without adding additional faults? The article begins with "we fork a Python process dozens of times, and it makes things slow and uses a bunch of memory." Then it's "here's a hack to fix something; here's what the hack broke," repeated until the broken bit is acceptable (for now). It all sounds like exactly the kind of hack that will come back later to bite you in the ass.