AI Recognition of Sexuality??

The latter half of this week has seen a lot of media reporting on a new study into AI recognising sexuality from photos of faces.

Obviously, all sorts of concerns are raised here, some of which are even noticed by the authors and the media: the bi-erasure, the exclusion of any People of Colour, the ethics of using photos the way they have, the use of this type of technology. But there are some issues I haven't seen discussed.

I'm going to start with the "headline claim". This is the claim I'm seeing in headlines and in the body of the reports: that the machine can correctly identify your sexuality from your photo 81% of the time for men and 71% of the time for women (compared to 61% and 54% for human guessers). In the Article's abstract, it is described as:

Given a single facial image, a classifier could correctly distinguish between gay and heterosexual men in 81% of cases, and in 71% of cases for women. Human judges achieved much lower accuracy: 61% for men and 54% for women.

The first problem is that this is plain wrong. At the very least it is horrifically misleading. The abstract extract there is misleading, and the media have gone with that misdirect:

The Guardian - The study from Stanford University – which found that a computer algorithm could correctly distinguish between gay and straight men 81% of the time, and 74% for women

Pink News - Incredibly, the model accurately predicted the person’s sexuality in 81% of cases.

To be fair to The Economist, they did describe it right:

The Economist - When shown one photo each of a gay and straight man, both chosen at random, the model distinguished between them correctly 81% of the time.

Yep. When you read the article, it becomes clear that the result they're reporting does not come from taking a single image and predicting gay or straight. They're taking two images, one from a set of straight pictures and the other from a set of gay pictures, and saying which has the higher chance of being gay. They get that right 81% of the time (for white men). The "single image" bit in the abstract actually refers to two single images rather than two sets of five images (where the probability of correctly identifying the "most gay" jumps to 91% for white men). Most of the coverage I've seen has been inaccurate on this.
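To make the difference concrete, here's a rough sketch in Python. The scores are made up by me for illustration (they are not from the paper), and the function names are my own. It compares the "pick the gay photo out of a pair" figure with what happens if you flag single photos above a cut-off, first with roughly equal numbers of photos and then with ten times as many straight photos.

```python
# Hypothetical classifier scores, invented for illustration (not from the paper).
gay_scores = [0.9, 0.8, 0.6, 0.4]
straight_scores = [0.7, 0.5, 0.3, 0.2, 0.1]


def pairwise_accuracy(gay, straight):
    """Share of (gay, straight) pairs where the gay-set photo gets the higher score."""
    wins = sum(1 for g in gay for s in straight if g > s)
    return wins / (len(gay) * len(straight))


def share_of_flagged_from_gay_set(gay, straight, threshold):
    """Of the photos scoring above the threshold, what share came from the gay set?"""
    flagged_gay = sum(1 for g in gay if g > threshold)
    flagged_straight = sum(1 for s in straight if s > threshold)
    return flagged_gay / (flagged_gay + flagged_straight)


# Same score distribution, but ten straight photos for every one in the list above.
many_straight = straight_scores * 10

balanced_pairwise = pairwise_accuracy(gay_scores, straight_scores)
skewed_pairwise = pairwise_accuracy(gay_scores, many_straight)
balanced_flagged = share_of_flagged_from_gay_set(gay_scores, straight_scores, 0.5)
skewed_flagged = share_of_flagged_from_gay_set(gay_scores, many_straight, 0.5)

print(balanced_pairwise, skewed_pairwise)  # pairwise figure is the same in both
print(balanced_flagged, skewed_flagged)    # single-image figure drops sharply
```

With these made-up scores the pairwise figure is 85% either way, while the share of flagged photos that really come from the gay set falls from 75% to about 23% once straight photos heavily outnumber gay ones. The pairwise number simply doesn't tell you how a single-photo verdict would perform on a real population.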

Some things this paper is not doing:

  • taking a single image and pronouncing whether the image shows a gay or straight person
  • providing a "threshold" figure above which a person can be assumed to be gay with certain confidence

However, this paper is also a shitshow of bollocks in other methodological ways.

So, for the first batch of "studies", the researchers put together a data set of "public" photos taken from a US-based dating website. They claim that they selected:

  • white people
  • aged 18-40

They categorised them as "gay" or "straight" based on the gender their profile said they were seeking.

So, how many problems can you spot here?

Only selecting out gay people
So, yeah, the first thing to note is that this selection is limited to those willing to be out on this particular dating website(s).
There are some "Authors' Notes" addressing some criticisms and this is what it says about this: "We could not think of an ethically sound approach to collecting a large number of facial images of non-openly gay people." - this is not really an answer, it's an excuse. The rest of the response actually isn't relevant to the criticism (it talks about what they did to make sure it was the face).
Only using one dating site
They state that they used "a dating website" but do not give any further information or methodology. It is unclear whether they mean one dating site or a network of dating sites. Why is this important? Well, let's say you choose 100 photos from TwinksTwinksTwinks and 100 photos from AllTheBears: are you going to see the same thing? Which is more representative of Gay People{TM}? Without knowing the sites and/or methodology here, we do not know what the make-up of the training sample is. Given that the paper specifically mentions facial hair as a feature used in the final algorithm (gay men have less: whether it is nature or nurture... or a mixture is an actual discussion in the paper), I think it is important to know what lengths they went to in order to produce an accurate sample of Gay Men for this study. In fact, do we even have statistics about "tribe" membership? Or do we have to add another limitation: White, 18-40, US-based, Out, Twink? (The author's response to being asked about the site on Twitter was: "Come on, do you really want me to share this information on Twitter?")
Not considering whether dating profiles lie
Oh, actually, they do. In a discussion at the end they do add a sentence: "However, we believe that people voluntarily seeking partners on the dating website have little incentive to misrepresent their sexual orientation". Basically, they can't imagine why anyone would lie on a dating profile: a dating profile that friends and family they are not out to might see; a dating profile which might only represent one part of their sexuality or gender because shitty dating sites are tied to a gender (or sexuality) binary. Other reasons for lying on a dating profile can be sent to Stanford University...

They do another study within the paper to look at whether the same differences can be found in non-dating site photos. For this they selected a load of photos from Facebook of what they considered to be gay men. In order to win this particular lottery, you had to:

  • be white,
  • male,
  • 18-40,
  • have your "interested in" field on your Facebook Profile set to "Men", and
  • like at least two pages from the "50 Facebook pages most popular among gay men" (the examples given are: "I love being gay", "Manhunt", "Gay and Fabulous" and "Gay Times Magazine").

That last bullet point in particular is rather irksome. In the UK, the Home Office is responsible for assessing asylum claims and we regularly hear horror stories of decisions made by the department that someone isn't gay because of utter shite like not subscribing to gay media. Seeing this attitude replicated here is, frankly, disgusting. I suggest looking at the recent report from Stonewall and UKLGIG - particularly Chapter 6.

However, this bit of discussion does feature, possibly, my favourite line of the whole article:

Unfortunately, we were not able to reliably identify heterosexual Facebook users.

Go on, let that sink in. Wallow in it...

More seriously, this is what heteronormativity means. It means that Facebook Pages are not labelled as "straight": that's the default. "Gay" pages are found, labelled, corralled. Straight pages just are.

This study found that the Facebook photos were harder to guess correctly (74% rather than 81%) and that they were not distinguishable from the gay photos from dating sites.

I mentioned earlier that the paper doesn't look at a "threshold" score for a photo to be classified as gay. They do discuss this somewhat. They took a sample of 1000 photos (930 straight, 70 gay - 7%) and ranked them according to the score. Looking at the top 1% (10 photos), 9 were from the gay set and 1 was from the straight set. The top 3% (30 photos) gave 23 gay and 7 straight - this means that if you were to choose this score as a threshold, 23% of the people you flagged would be false positives (7 of 30) and you'd miss 67% of your "targets" (47 of 70). That doesn't seem a very good test to me.
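For anyone who wants to check the sums, here's a quick sketch in Python using only the figures quoted above (1000 photos, 70 from the gay set, 930 from the straight set). The function and key names are mine, not the paper's. Note that the 23% is a share of the flagged photos, not a share of all the straight photos.

```python
# Figures quoted above: 1000 photos, 70 from the gay set, 930 from the straight set.
total_gay = 70
total_straight = 930


def threshold_stats(flagged_gay, flagged_straight):
    """Summarise what happens if everything above the cut-off is labelled gay."""
    flagged = flagged_gay + flagged_straight
    return {
        # share of flagged photos that are actually from the straight set
        "wrong_among_flagged": flagged_straight / flagged,
        # share of the gay set that the cut-off fails to flag
        "missed_targets": (total_gay - flagged_gay) / total_gay,
        # share of the straight set that gets flagged anyway
        "straight_flagged": flagged_straight / total_straight,
    }


top_1_percent = threshold_stats(flagged_gay=9, flagged_straight=1)
top_3_percent = threshold_stats(flagged_gay=23, flagged_straight=7)

for name, stats in [("top 1%", top_1_percent), ("top 3%", top_3_percent)]:
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

At the top 3% cut-off that gives 7 of 30 flagged photos wrong (about 23%) and 47 of 70 gay photos missed (about 67%). The stricter top 1% cut-off gets only 1 in 10 wrong, but misses 61 of the 70.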

The paper also totally ignores bisexuals. The only mention of bisexuality is acknowledging that some profiles might actually be of bi people - but this is immediately before they dismiss the idea that anyone could ever possibly want to lie on their dating profile.

Of course, they believe that the findings are applicable on a wider basis than they have studied because: hey, why not. They believe the Prenatal Hormone Theory for development of sexuality.

The claimed reason for doing this research is to identify how accurate this type of technology would be if it were to be used by states etc. on their population. They also cite concerns about just how available we make ourselves online and that this exposure may be eroding our privacy in unexpected ways - as humans are bad at judging based on facial features but machines are supposedly much better. Unfortunately, the media is misreporting what the study does and the author is not doing a whole lot to correct the misrepresentation as far as I can see. Instead, the media is pushing the idea of 81% accuracy (91% with 5 photos) in identifying gay men from facial features. There is some mention in most reporting of the bi and white limitations, but nothing of the problems with the dataset used to train the AI or the assumptions used.

The paper is a preprint - this means that it is being made available prior to being formally published in a journal. It has, apparently, been peer-reviewed for that publication. The authors are a "CS at Stanford" (Wang, helped with the analysis) and "Professor at Stanford University Graduate School of Business. Computational Psychologist and Big Data Scientist" (Kosinski - wrote the report). Neither of these appears to be anywhere close to Gender Studies or Queer Studies and I would strongly urge them to speak to people in those departments before doing anything of this nature again - particularly about how to identify and categorise queer people and obtain representative samples.

Conversations on Twitter also seem to suggest the paper was originally stronger on privacy and that the links to Prenatal Hormone Theory were added later. Kosinski still appears to feel the main point of the article is the privacy worries. Unfortunately, I think he's fucked that up by linking to PHT.