-
The bias isn't in the people who choose the training data; it's in the historical choices that created the current status quo and thus determine what the training data looks like.
E.g., if I build my training data from all staff rated 3.5/5 or above, and up until now HR have been biased towards hiring old white men, then that bias will come through in the training data.
The way you avoid that bias is to remove ethnicity, age, and sex as parameters in the training data, but that should be fucking obvious to any data scientist with even the slightest hint of commercial awareness.
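A minimal sketch of that filtering step, in Python. The field names, the 3.5 threshold, and the `build_training_rows` helper are all illustrative assumptions, not anyone's actual pipeline:

```python
# Hypothetical staff-review records; field names are illustrative only.
STAFF = [
    {"name": "A", "ethnicity": "white", "age": 55, "sex": "m", "review_score": 4.2},
    {"name": "B", "ethnicity": "white", "age": 48, "sex": "m", "review_score": 3.4},
    {"name": "C", "ethnicity": "asian", "age": 30, "sex": "f", "review_score": 4.6},
]

# The attributes the post says should never reach the model.
PROTECTED = {"ethnicity", "age", "sex"}

def build_training_rows(records, threshold=3.5):
    """Keep staff rated at or above the threshold, then strip the
    protected attributes so the model can't condition on them directly."""
    return [
        {k: v for k, v in row.items() if k not in PROTECTED}
        for row in records
        if row["review_score"] >= threshold
    ]

rows = build_training_rows(STAFF)
# Each surviving row now carries only "name" and "review_score".
```

Dropping the columns before training (rather than after) is the point: the model never sees the protected attributes at all.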
-
Yep, no disagreement with that. I read @branwen's "codify and extend the biases of the people who write that code" to mean that personal bias on the part of the coder was involved.
I don't think it has much to do with the bias of the coder. As I understand it, most issues currently come from the AI trying to replicate the existing profile of successful applicants, so it just continues to feed square white men into the interview pipeline.