Don't blame the model. The training data is biased.
It’s not always biased training data either. Sometimes the data is representative, but just not what we want it to be.
There aren’t many female software engineers. It’s sad, but it’s also true. And you can’t blame tech companies for it either. Sure, there are some sexist people in tech, but it’s pretty rare. The real reason there aren’t many women in tech... is that there aren’t many women GOING into tech. It’s that simple. Since there are far fewer women in tech, there’s far less training data for these models to train on. There are standard techniques for handling class imbalance (sketch below), but in this particular case I think the entire premise of scanning resumes with ML is ludicrous. It was always destined to fail.
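For the curious, a minimal sketch of one such technique, class weighting in scikit-learn. The data here is a made-up toy set, not anything from the article:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))            # hypothetical resume features
    y = (rng.random(1000) < 0.1).astype(int)  # ~10% positive class: imbalanced

    # class_weight="balanced" reweights each class inversely to its frequency,
    # so the minority class isn't drowned out during training.
    clf = LogisticRegression(class_weight="balanced").fit(X, y)
    print(clf.score(X, y))

Oversampling the minority class is the other usual fix, but none of that saves you if the premise itself is broken.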
It wasn’t intentional; the training data introduced the bias. It’s foolish to expect AI to be perfectly balanced when its creators are not.
Aaaand it took no time at all for the dumb boy nerds to jump on the idiot train.
At least it's something the general public can understand. It's not as embarrassing as some of the other SNAFUs: Microsoft's Twitter bot didn't take long to learn and start spouting racist tweets, and Google Photos once labeled photos of Black people as gorillas.
Microsoft Tay is, by far, the most hilarious thing to ever happen. Still brings me to tears 😂
The dataset was overwhelmingly male applicants, but the developers should have handled that instead of letting the machine score women lower. Tellingly, the team that developed this has since been dissolved.
Was there any real evidence that the model scored women lower in general? I know it scored “women’s studies” majors lower, but seriously, that’s not a very useful major for engineering. Or, well... anything. That doesn’t make the model sexist. Plenty of women study majors such as CS.
Precisely. Another stupid article. If they removed names from the resumes, and they should, so the screener can't use 'Karen' as a feature for example, there's no way the ML algo can use gender directly. Of course it will still pick more guys, since the distribution of features (majors at school) will differ by gender, simply because not enough women do the right major. If anything, they could generate new test data, feed it the same resume under a male and a female name, and prove the algo isn't biased. Something like the sketch below.
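Roughly like this; score_resume here is a hypothetical stand-in for whatever trained model you'd actually be auditing:

    # Name-swap test: score the same resume under a male and a female name.
    def score_resume(text: str) -> float:
        # Hypothetical scorer standing in for the real model.
        return 0.5 + (0.1 if "captain" in text else 0.0)

    template = "{name}. BSc Computer Science. Chess club captain. 3 yrs Java."
    for name in ("John Smith", "Jane Smith"):
        print(name, score_resume(template.format(name=name)))

    # If the two scores differ, something in the pipeline is picking up gender.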
If the rejection rate for males and females in the training data was the same, it's unlikely the model would use (inferred) gender as a strong signal. If the rates were different, then it obviously would.
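That's easy enough to check in the training data itself. A quick sketch with pandas (the column names and values are made up):

    import pandas as pd

    df = pd.DataFrame({
        "gender":   ["M", "M", "M", "M", "F", "F"],
        "rejected": [0,    1,   0,   0,   1,   1],
    })
    rates = df.groupby("gender")["rejected"].mean()
    print(rates)
    print("gap:", abs(rates["M"] - rates["F"]))

    # A big gap means any feature correlated with gender becomes a useful
    # signal, even with names stripped from the resumes.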
They should have hard-coded quotas into the system, just like companies have to do now.
Should hard-code an auto-reject for trumptard applicants.
We are so sorry you could not crack the interview.