The tech headlines in June were all about how inaccurate facial recognition software is if you’re not a white male.¹ For white males, facial recognition identified gender correctly 99% of the time; for dark-skinned females, accuracy was 65%. Why is this happening in 2018? We know how: the data sets used for facial recognition were 80% male and 75% white. Facial recognition technology is proposed for passports, security systems, etc. So why would US-based companies, in a country that is 49% male and 51% female, 63% white and 37% non-white, use poor data?
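To make the mismatch concrete, here is a minimal sketch of the arithmetic, using only the percentages quoted above. The names `dataset`, `population`, and `representation_gap` are made up for illustration; this is not how any facial recognition vendor measures its data.

```python
# Shares quoted in the paragraph above, in percent.
dataset = {"male": 80, "white": 75}      # facial recognition data sets
population = {"male": 49, "white": 63}   # US population


def representation_gap(dataset_share, population_share):
    """Return dataset share minus population share, in percentage points."""
    return dataset_share - population_share


# Positive values mean the group is over-represented in the data.
gaps = {
    group: representation_gap(dataset[group], population[group])
    for group in dataset
}

for group, gap in gaps.items():
    print(f"{group}: over-represented by {gap} percentage points")
```

By this rough measure, men are over-represented in the data by 31 percentage points and white people by 12.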
Garbage in, garbage out is an apt description. It applies not just to computing and technology. A builder using poor materials and shoddy workmanship is going to give you a poor building. People unprepared and ill-suited for a job give you disappointing results. It is not going to work. You need good data and good models for solutions to work. Business decisions made from bad data cost roughly 3.1 trillion dollars a year.² Good models and good data are good sense. So why don’t we do it? Why don’t we recognize it, when it is literally staring us in the face?
The current US Congress, the most diverse Congress ever, is 80% white and 80% male. These are the decision makers and the policy influencers. This is the governing body for a country that is 49% male and 51% female, with a racial mix that is 63% white and 37% non-white. From a data perspective, the results from this governing body do not bode well for the population as a whole. In March of 2015, the US Census Bureau reported that by 2020 more than half of the country’s children will belong to a minority race, and that this shift will take place for the population as a whole in 2044.
History has shown us that out-of-alignment congressional and judicial models have costs. There is always a group that pays, starting with the seizure of land from indigenous people and continuing with enslaved people. This misalignment produced laws that banned Chinese immigration, and it imprisoned American citizens of Japanese heritage and seized their property. This misalignment made it illegal for women to vote. The congressional data model is skewed. By numbers alone, a demographic shift should be reflected in the legislature with a slight lag. Historically, that has not happened. The 19th Amendment for women’s right to vote went to Congress in 1918 and was ratified two years later. Almost a hundred years later, in 2018, Congress is 80% men in a country that is 51% women.
It’s 2018, and the software design for facial recognition in the US literally does not recognize dark-skinned females because the developers used a dataset that was 80% male and 75% white. With opportunities for Supreme Court justices, where is the conversation about the makeup of the court tracking somewhat with demographics? The US Congress is currently 80% white and 80% male for a country that is 63% white and 49% male. Doesn’t critical thinking sound the alarm? Does this not parallel the facial recognition dilemma? Joy Buolamwini, a Rhodes scholar at the MIT Media Lab, found that facial recognition systems did not work as well for her as they did for others.
“Technology,” Ms. Buolamwini said, “should be more attuned to the people who use it and the people it’s used on.” “You can’t have ethical A.I. that’s not inclusive,” she said. “And whoever is creating the technology is setting the standards.”
Moving forward, can the same be said of government? Government should be more attuned to the people who use it and the people it is used on. You can’t have ethical government that’s not inclusive, and whoever is creating the laws is setting the standards. If those creating technologies³ continue to fail at diversity until called out, is government any different? I can’t help but ponder: while the numbers may be for us, is the force against us?
¹ Facial recognition error rates for gender:
- light-skinned males – 1% error
- light-skinned females – 7% error
- dark-skinned males – 12% error
- dark-skinned females – 35% error
² Erroneous decisions made from bad data are not only inconvenient, but also extremely costly. IBM looked at poor data quality costs in the United States and estimated that decisions made from bad data cost the US economy roughly $3.1 trillion each year.
³ https://bousbous.com/2018/08/13/the-diversity-issue/, https://bousbous.com/2018/03/05/2983/
Interesting points. I think that another factor to bear in mind is the people who volunteer to test it. It seems like for some reason the majority of volunteers in the science community, not just technical testers but also medical testers, are Caucasian or near to it (I don’t have any data on hand to prove this, so I may be incorrect). It doesn’t seem to be some sort of racism at work, at least not intentionally, because, as you so correctly pointed out, the people creating these programs need to get their data as broad and as accurate as possible. Therefore, in addition to better programming, there seems to be a need for better outreach to potential volunteers, both minority and majority.