Against the “extreme male brain” model of autism

Simon Baron-Cohen has been pushing the extreme male brain theory of autism for a while. It asserts:

[…] ‘Empathising’ is the drive to identify another person’s emotions and thoughts, and to respond to these with an appropriate emotion. […] ‘Systemising’ is the drive to analyse the variables in a system, to derive the underlying rules that govern the behaviour of a system. […]


I will be arguing that systemising and empathising are two key dimensions in defining the male and female brain. […]

[…] According to the ‘extreme male brain’ theory of autism, people with autism or AS should always fall in the [extreme systematizing range]. […]


So in other words, it is the proposal that systematizing, maleness, and autism are near-identical, and that empathising, femaleness, and non-autism are near-identical.

The main problem with this theory is that it is obviously empirically disproven. For instance, one study that is often cited to support it, “Testing the Empathizing-Systemizing theory of sex differences and the Extreme Male Brain theory of autism in half a million people”, gets the result:

And sure – there was a difference between the groups in the direction predicted by the theory. But look at the magnitude of the difference. It’s nowhere near as big as the theory claims; there’s tons and tons of overlap between the groups.
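To make “tons of overlap” concrete, here is a minimal sketch of how much two equal-variance normal distributions overlap for a given standardized mean difference d. The d values below are hypothetical and chosen only for illustration; they are not taken from the study, and the normality assumption is mine.

```python
from statistics import NormalDist


def overlap_coefficient(d: float) -> float:
    """Shared area under two unit-variance normal curves whose means differ by d."""
    return 2 * NormalDist().cdf(-abs(d) / 2)


# Hypothetical effect sizes, not values reported in the study.
for d in (0.2, 0.5, 1.0, 2.0, 4.0):
    print(f"d = {d}: overlap = {overlap_coefficient(d):.1%}")
```

Under these assumptions, the overlap only becomes small once the group means are several standard deviations apart, which is the kind of separation a “core defining feature” claim would seem to need.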

We can also take a look at table 3 in the study. The extreme male brain theory predicts that autistic people should always fall in the “extreme systematizing” area, yet only ~10% of autistic people end up there, with almost half of all autistic people ending up outside of the systematizing-skewed region.

In conclusion, the E-S EMB theory of autism does not add up.

EMB Motte-Bailey

If it were any other theory, the effect sizes might be considered respectable. There’s no rule of science that an effect has to fully separate the groups under investigation to be relevant. Some discussions of science get hijacked by identity politics, where people refuse to acknowledge an effect, just because it’s not “big enough” according to their subjective judgement.

However, EMB isn’t just the theory that autistic people are more prone to systematizing than non-autistic people. Rather, it is a theory that this empathizing-systematizing shift is the core defining feature of autism. That’s a perfectly valid theory to have, but for it to be true, autism and the systematizing skew would have to be highly correlated. After all, if autism is the same thing as a systematizing skew, then how can people be autistic without having this skew, or be non-autistic while having the skew?

Sometimes EMB proponents say that this isn’t really what the EMB theory says. Instead, they make up some weaker predictions, that the theory merely asserts differences “on average”. This seems like a motte-bailey strategy; they want to talk big about how empathizing-systematizing is the explanation for autism, but they don’t want to actually commit to the theory (because it is wrong). If the EMB theory had instead been named the “sometimes autistic people are kinda nerdy” theory, then it would be a lot more justified by the evidence – but also not look nearly as deep or insightful, which is presumably why it wasn’t named as such.

The Blanchardian fallacy

The Blanchardian fallacy is the assumption that humans vary across exactly one binary axis, being either autogynephilic or homosexual transsexuals. Obviously I am being somewhat cheeky here – nobody really believes this. But while nobody believes this, Blanchardians do seem to have a tendency to assume that things are linked to autogynephilia when this isn’t really justified.

To give some examples – one person once asked me, is there a connection between being philosophically oriented and being autogynephilic? Anecdotally, a lot of autogynephilic transsexuals seem to be very philosophical, so could it be…? Nah (at least if the personality data I’ve collected is right); and actually it is inappropriate to generalize from autogynephilic transsexuals to autogynephiles in general, because autogynephilic transsexuals can differ from ordinary men due to other factors than autogynephilia.

Is there a link between autogynephilia and dissociation/optical illusions? Not according to my data, despite there seemingly being a link between transsexuality and dissociation/optical illusions. The assumption that there must be is the Blanchardian fallacy again. How about autogynephilia/nerdiness? Anecdotally a lot of trans women seem nerdy. It’s hard to say for sure due to potential collider bias, but so far I haven’t seen support for links. (Recently I’ve been wondering if it might be an artifact of women’s dating preferences – the same sorts of women who are attracted to nerdy men are also more likely to be attracted to MtF transsexuals, so it would make sense that nerdy males would be more likely to transition, as it shrinks their dating pool less. But this is speculative. And this is itself assuming that transsexuality is linked with nerdiness; maybe it is not and my anecdotes are misleading.)

Some people argue that the trans activists that attacked Michael Bailey are narcissistic, and take this as an indication that autogynephiles are narcissistic. But narcissism can vary independently of autogynephilia (and indeed it doesn’t appear to be correlated with autogynephilia).

I’ve been guilty of the Blanchardian fallacy myself too. My impression is that the notion of AGPTS/HSTS split makes it very easy to naively seek out correlations and inappropriately generalize them. In recognizing the Blanchardian fallacy, I’ve started becoming very cautious about what sorts of inferences I make. One of the most important aspects of this is rigorous distinctions between autogynephilic transsexuals (AGPTSs) and autogynephiles in general (AGPs). There may be many factors that lead to transition beyond autogynephilia, and which end up distinguishing AGPTSs from AGPs.

Another thing that is important is to be hyper-aware of what sorts of selection biases you face. In learning about autogynephiles, you will encounter information about various autogynephiles that exist. But this information will be filtered through various processes, and depending on the process you can end up with arbitrarily skewed ideas about what autogynephiles are like.

Most likely, a similar post could be made that focuses on HSTS – but I do not have as many examples in mind there.

Investigating the effect of stress on gender dysphoria

Some people report that they feel stress contributes to their gender dysphoria, being more gender dysphoric when they are more stressed. I’ve been skeptical of this, but Pasha did a survey where he found that a lot felt it applied to them:

Pasha’s takeaway from this seemed to be to take it at face value; some felt more gender dysphoric due to stress, while others did not. However, even with this survey, I was still pretty skeptical. Causal inference is hard, and it doesn’t seem super logical that stress would make you dysphoric. Couldn’t it just be that there was some sort of confounding, perhaps with there being a specific kind of stressful context where one is more dysphoric?

But then I got an idea. Stress levels vary a lot over time due to knowable exogenous factors. For instance, social tension is pretty stressful. Thus, if we investigated the gender dysphoria associated with those exogenous factors, we could perhaps untangle the effect of stress from such confounding.

If this was to be done properly, then it would probably involve some sort of experience sampling method. But that’s very invasive and a lot of work, so I hacked it by describing 7 stressful and 7 non-stressful situations1, and asking people to say how gender dysphoric they would feel during those situations. I posted this survey to /r/Blanchardianism, /r/TGandSissyRecovery, and /r/detrans and got 54 responses.


Before we go into the results regarding reactions to situations, it is worth first looking into whether I replicated Pasha’s result of respondents feeling that stress contributes to gender dysphoria. I had three questions asking about this, which I analyzed with a latent class model in order to summarize the responses. 40% of my respondents felt that they were more gender dysphoric when stressed. Of these 40%, half felt less gender dysphoric when relaxing, while the other half felt that it was complicated. 20% of the respondents reported that they generally felt no gender dysphoria, while 30% of respondents reported that they felt equally gender dysphoric regardless of stress, and 10% reported that they felt more gender dysphoric when relaxing. I will get back to the results by different subgroups later in the post, but first let’s consider the overall average results in the survey.

Here is a scatterplot with the different situations, as well as how stressful and gender dysphoria inducing they were perceived to be:

All of the stressful situations were on average perceived to be more stressful than all of the relaxing situations. Further, overall the ranking in terms of stressfulness seems to make a lot of sense; situations that are intuitively more stressful are also quantitatively placed in more stressful spots in this plot. However, there was no correlation between stressfulness and gender dysphoria. Case closed, there’s no contribution of stress to gender dysphoria, end of investigation?

Method flaw: focus

Definitely not. I had multiple spots for comments in the survey, and some respondents pointed out some things that may be problematic for the previous investigation:

If being distracted alleviates someone’s dysphoria, it follows that being stressed could do the opposite.

When I have a urgent problem that needs focus to be resolved, I tend to not think about my body and dysphoria related things in that moment, but I still feel that I want to be male. I just have less focus on dysphoric feelings.

Time to think when lying in bed can open up opportunities for thoughts to wonder to stressful topics

If I’m walking on my own I’ll be left with my thoughts so it comes up.

That is, a topic that came up several times was a feeling that situations that require more focus can help distract from the dysphoria, and therefore temporarily reduce distress. This… actually seems like a really compelling point? Just eyeing the scatterplot, it vaguely seems like focus could account for it.

In order to investigate this, I asked Pasha to collect ratings from /r/SampleSize on how much focus each of the situations require, so that this information could be added to the model. The ranking of focus requirements, from most to least, was as follows:

  • High-stakes situations
  • Struggling with duties
  • Hanging out with friends
  • Social tension, loud/noisy places, and doing something creative
  • Spiders and other creep
  • An obstacle blocking your task
  • Reading news about problems in the world
  • Eating food
  • Listening to music
  • Taking a walk
  • Going to bed on a Saturday evening
  • Knowing that all of your chores are completed

This seems like a pretty reasonable ranking to me. So what happened when I used both focus requirement and stress to predict dysphoria? Nothing; an R^2 of 0.0026. To get an overview of what is going on, let’s take a look at a scatterplot, with the amount of gender dysphoria interpolated2 between the different situations:

Scatterplot showing situations placed according to their stressfulness and focus requirements. In order to emphasize the overall structure, I colored regions according to the amount of gender dysphoria associated with situations in that region, using machine learning to interpolate. Red = more gender dysphoric than average, blue = less gender dysphoric than average.

This looks like… pure noise probably. Given that the different scenarios vary quite a lot in how dysphoria-inducing they are, it seems like there must be something that can explain it. But stress + focus requirement does not seem to be it.
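For readers who want to see what “using both focus requirement and stress to predict dysphoria” amounts to, here is a minimal sketch of a two-predictor least-squares fit and its R^2. The function name and the six data points are hypothetical, and I am assuming one averaged rating per situation; the real analysis may have been set up differently.

```python
def r_squared(stress, focus, dysphoria):
    """R^2 of the least-squares fit dysphoria ~ intercept + stress + focus."""
    n = len(dysphoria)
    # Centering every variable absorbs the intercept.
    s = [v - sum(stress) / n for v in stress]
    f = [v - sum(focus) / n for v in focus]
    y = [v - sum(dysphoria) / n for v in dysphoria]
    ss, ff = sum(a * a for a in s), sum(b * b for b in f)
    sf = sum(a * b for a, b in zip(s, f))
    sy = sum(a * c for a, c in zip(s, y))
    fy = sum(b * c for b, c in zip(f, y))
    # Solve the 2x2 normal equations; det is zero only if the
    # two predictors are perfectly collinear.
    det = ss * ff - sf * sf
    b_s = (sy * ff - fy * sf) / det
    b_f = (fy * ss - sy * sf) / det
    residual = sum((c - b_s * a - b_f * b) ** 2 for a, b, c in zip(s, f, y))
    total = sum(c * c for c in y)
    return 1 - residual / total


# Hypothetical per-situation averages, NOT the survey data.
stress = [4.5, 4.0, 1.5, 3.5, 1.0, 2.0]
focus = [4.8, 4.2, 3.5, 3.0, 1.2, 2.0]
dysphoria = [2.1, 2.6, 1.9, 2.8, 2.3, 2.0]
print(round(r_squared(stress, focus, dysphoria), 3))
```

An R^2 near zero, as in the post, means the fitted plane explains essentially none of the variation in dysphoria between situations.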

Effect of situation on dysphoria is consistent across groups?

Here’s something to consider: some participants felt that stress contributed to gender dysphoria, while others didn’t. This raises the question that perhaps the ambiguous results are just due to individual differences in how different situations relate to gender dysphoria.

To investigate this, I broke the results down by subgroup. It turns out that the different subgroups have very high agreement about which situations are gender dysphoria inducing:

The main disagreement seems to be about high-stakes situations and struggling with duties, where those who feel that stress contributes to dysphoria feel more dysphoric, and news about problems, where those who feel that stress doesn’t contribute to dysphoria feel more dysphoric. Speculatively, I’d guess that it is perhaps more a question of whether not living up to responsibilities causes dysphoria? Or maybe related to self-esteem rather than stress? Not sure.

But to me, a much more noteworthy observation is that there appear to be large situational differences in how gender dysphoric people feel, and that these situational differences are basically agreed on by people who otherwise seem to interpret the factors in their dysphoria very differently. To me, this suggests that it may be fruitful to scale up these investigations, by collecting data on the relationship between dysphoria and a broader range of situations, as well as taking a much larger number of ways that the dimensions differ into account.

If any readers want to suggest situations or situational factors that should be taken into account in further research on this, I would encourage you to do so in comments or through any other method.

If you want the dataset, send me a message or join the discord. There’s more that can be done with this dataset, and I hope to eventually get around to some further investigations to blog about later, but the post must end at some point, and this is that point.

1. The descriptions given in the blog post above are brief titles rather than full descriptions. The full descriptions in the survey can be seen below:

  • Obstacle blocking your task: When there is an obstacle blocking your task (e.g. you need to use the sink but someone has left dishes in the sink; or similar), to what degree do you feel distressed about being male, wishing instead to be female?
  • Going to bed on a Saturday evening: When you go to bed on a Saturday evening, to what degree do you feel distressed about being male, wishing instead to be female?
  • Social tension: When two or more other people around you have a conflict or otherwise social tension, to what degree do you feel distressed about being male, wishing instead to be female?
  • Eating food: When you are eating food, to what degree do you feel distressed about being male, wishing instead to be female?
  • Loud/noisy places: When you are in loud/noisy places, to what degree do you feel distressed about being male, wishing instead to be female?
  • Hanging out with friends: When you are hanging out with friends, to what degree do you feel distressed about being male, wishing instead to be female?
  • Struggling with duties: When you have some task that someone is expecting you to handle (such as a task at work that your boss expects you to deal with), but you are struggling with it and are about to meet with the person who has given you the task, to what degree do you feel distressed about being male, wishing instead to be female?
  • Taking a walk: When you go for a walk (not to get to some specific place, just for leisure/exercise), to what degree do you feel distressed about being male, wishing instead to be female?
  • Spiders and other creep: When there’s a spider, a moth, or some other creep in your room, to what degree do you feel distressed about being male, wishing instead to be female?
  • Knowing that all of your chores are completed: When you have completed all of your chores and are free for the immediate future, to what degree do you feel distressed about being male, wishing instead to be female?
  • High-stakes situations: When you have to deal with high-stakes situations, such as a test or needing to make a good impression to someone you only meet briefly but who has a big effect on your future, to what degree do you feel distressed about being male, wishing instead to be female?
  • Listening to music: When you listen to music, to what degree do you feel distressed about being male, wishing instead to be female?
  • Reading news about problems in the world: When you get news about problems or potential problems in the world (e.g. war, pandemics, economic trouble, political scandals, etc.) to what degree do you feel distressed about being male, wishing instead to be female?
  • Doing something creative: When you do something creative, to what degree do you feel distressed about being male, wishing instead to be female?

2. More specifically, I used a support vector machine for regression with a radial basis function kernel. I wouldn’t make too much of the interpolation; it’s just to get a big picture overview of what’s going on.

Contra James Cantor on desistance

In 2016, James Cantor wrote a blog post about desistance of gender dysphoria, arguing:

Despite the differences in country, culture, decade, and follow-up length and method, all the studies have come to a remarkably similar conclusion: Only very few trans- kids still want to transition by the time they are adults. Instead, they generally turn out to be regular gay or lesbian folks.

However, while this is an accurate summary of the results in males, it is not an accurate summary of the results in females. Out of the 11 studies he cited, 7 studies dealt with males only, 1 study had only a single female participant, and only 3 studies had multiple female participants. Among the desisters, the studies with females got the following results:

Thus, looking over the studies that Cantor cited, there were 41 female desisters, of which 28-31 appear to be heterosexual, 3-5 appear to be bisexual, and 2 appear to be lesbian. This does not seem to match the idea that they generally turn out to be regular lesbian folks; rather they seem to generally turn out to be heterosexual.

Across these studies, there were no persisters in the first study, 9 persisters in the second study, 3 persisters in the third study, and 24 persisters in the fourth study, for a total of 36. In the latter three studies, the numbers of persisters who were exclusively gynephilic appear to be 7, 2, 13-14 respectively, for a total of perhaps 23. (It is worth noting that while 23 is much smaller than 36, most of this gap is due to participants with unknown sexual orientation.)
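The persister tallies above can be checked with a few lines of arithmetic. This is only a sketch that transcribes the counts from the paragraph; the study labels are placeholders, and the low/high values reflect the stated 13-14 ambiguity in the fourth study.

```python
# Persister counts as stated in the text (the first study had none).
persisters = {"second": 9, "third": 3, "fourth": 24}

# Exclusively gynephilic persisters; the fourth study is given as 13-14.
gynephilic_low = {"second": 7, "third": 2, "fourth": 13}
gynephilic_high = {"second": 7, "third": 2, "fourth": 14}

total = sum(persisters.values())
low = sum(gynephilic_low.values())
high = sum(gynephilic_high.values())

print(f"persisters: {total}")
print(f"exclusively gynephilic: {low}-{high}")
print(f"share of all persisters: {low / total:.0%}-{high / total:.0%}")
```

The share is computed against all persisters, including those whose sexual orientation is unknown, so it understates the share among persisters with known orientation.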

Overall, persistence among natal females appears to be near-perfectly correlated with sexual orientation. What to make of this correlation is unclear2, but certainly it seems to make Cantor’s characterization inaccurate.

1. It’s a bit unclear to me from the study text, but it seems possible that this study has sample overlap with the other Dutch study. The “psychosexual outcome” paper got its sample from the clinic between 1989 and 2005, while this study got its sample from the same clinic between 2000 and 2008. This doesn’t seem to change the substantive conclusion of this post, but it may be worth keeping in mind.

2. There seem to be two general approaches; either sexual orientation gets assumed to affect desistance, or desistance gets assumed to affect sexual orientation. For instance, I’d be inclined to think that for someone to desist, some sort of factor must change that makes it more advantageous to live like one’s assigned sex, compared to transitioning. A heterosexual orientation surfacing might be such a factor. But that’s speculative. Another thing that one could claim is that transitioning somehow influences orientation. Alternatively, one might believe that there is misreporting in sexual orientation. Point is, there are a lot of possibilities here.

Quick heads up: Julia Serano believes in ETLE too

Autogynephilia is a sexual interest in being a woman. Erotic target location error is a theory which asserts that this sexual interest is connected to gynephilia (a sexual interest in women as partners); that there is ???something??? which usually prevents men’s gynephilia from finding the thought of being a woman to be erotic, but that this ???something??? is missing in autogynephiles.

Some people say that they find ETLE theory absurd, mock it, and call it debunked. And then they endorse people like Julia Serano, who claim to critique it. But here’s Julia Serano’s critique of it:

A third factor that may influence embodiment fantasies is sexual orientation, albeit not in the way that Blanchard envisioned. Specifically, if an individual is attracted to femaleness and femininity in a more general sense (e.g. they find such qualities erotic in their partners), then these same attributes might also be sexually salient with regard to their own embodiment, leading to more frequent or intense FEFs. (A similar correlation between attraction to maleness and masculinity, and MEFs, might also be expected.) Or to phrase this conversely: If an individual is not attracted to female or feminine attributes more generally, then they may be less likely to find FEFs arousing or compelling. This fairly simple explanation (which Blanchard never explored) is consistent with the correlations researchers have found between sexual orientation and embodiment fantasies, but without invoking direct causality.

Julia Serano, Autogynephilia: A scientific review, feminist analysis, and alternative ‘embodiment fantasies’ model

But this is literally just erotic target location errors restated! Serano’s argument is that maybe there’s a link between autogynephilia and gynephilia, where whichever mechanism creates gynephilia also for some reason sometimes creates autogynephilia, which is precisely the same as Blanchard’s postulation.

It may be worth quoting Blanchard and Freund to illustrate the similarities in the theories:

What kind of defect in a male’s capacity for sexual learning could produce anatomic autogynephilia, transvestism, and fetishism, singly and in various combinations? Common to all these phenomena is a kind of error in locating heterosexual targets in the environment. In fetishism, the individual orients toward a particular garment (e.g., panties, brassieres) rather than those parts of the female body the garment usually covers. In transvestism, the individual is aroused by the appearance of an attractively clad woman, but he locates this image on himself rather than another person. In anatomic autogynephilia, the individual is oriented toward the characteristic features of the feminine physique (e.g., breasts), but he attempts, in some way, to locate these features on his own body.

The above analysis suggests the failure of some developmental process that, in normal males, keeps heterosexual learning “on track,” perhaps by biasing erotic response toward external rather than internal stimuli, and inherent rather than variable features of the female appearance. This putative defect allows the development of various misdirected – but still recognizably heterosexual – behaviors, and makes it possible, if not probable, that more than one misplaced interest will appear in the same individual.

Ray Blanchard, Clinical observations and systematic studies of autogynephilia

So in other words, Blanchard’s ETLE theory is that for some reason the gynephilia is applied to one’s own embodiment, as Serano describes. Or in other words, both agree that some sort of gynephilic eroticism contributes to autogynephilia, and so both agree on ETLE.

Julia Serano is not the only one I’ve seen who has done this; e.g. I’ve seen someone else propose that ETLEs don’t exist and any correlation between autogynephilia and gynephilia is just because gynephilia makes it easier to sexualize having a female body… which of course is the core claim of ETLE theory, making it puzzling that someone might call that a contradiction of ETLE.

Examining the structure of male sexual interests

Sexuality keeps coming up in this Blanchardian sphere of gender research, and so it would be nice to have an overview of how it works. Fortunately, Pasha, the creator of the /r/AskAGP subreddit, recently did a HUGE survey on /r/SampleSize, where ~1000 people responded to 84 different sexual fantasy items2. Since the structure of male and female sexuality seems to differ, in this post I will focus on the responses from the 494 cisgender men who responded.

A good starting point for understanding a domain of variables is factor analysis1. Factor analysis tries to model the data using a lower number of “factors” which group together the variables that are highly correlated, thereby abstracting the data and revealing large-scale structures. I can then inspect the fantasies that it lumps together, and name the factors to summarize the results in a readable way. Here are the results for applying factor analysis using 1, 2, 3, 4, 5, and 8 factors:

Bass-ackwards factor analysis applied to the male sexuality data. Each level represents a factor analysis. At the first level, I extracted one factor; at the second level, I extracted two, and so on. The boxes at the bottom show some of the items that were assigned to each factor. The arrows between the levels show how the factors correlate between the different factor analyses.

At the final level, there were eight factors, which I tend to think of in two groups: four “broad” factors that I assume shape everything else, and four “narrow” factors that I assume are less important for sexuality in general (though they may be very important in specific contexts). I picked names for the broad factors using the following reasoning:

  • One very consistent factor was characterized by a very large number of generic partnered sex acts. This seems core to the definition of allosexuality (sexual attraction to other people), so therefore I labelled it Allosexuality.
  • The second most consistent factor was primarily characterized by androgynous men. This made it seem like it denoted attraction to feminine men. However, it was also heavily characterized by masculine men, and it was negatively characterized by women (i.e. those who scored higher in this factor were less attracted to women). The feminine men were also secondarily placed on a different factor relating to androgyny, so therefore I decided to label this factor Homosexuality.
  • A third very stable factor contained a variety of items relating to wild sex with many strangers. This seemed reminiscent of what social scientists call sociosexuality (essentially meaning promiscuity or “sluttiness”), so therefore I labelled the factor Sociosexuality.
  • Fourth, at every level there appeared to be a factor that contained a variety of peculiar sexual interests that did not seem to be defined or characterized by any common theme. My assumption is that this factor reflects the General Factor Of Paraphilia, so therefore I labelled it Paraphilia.

I picked names for the narrow factors using the following reasoning:

  • One factor involved oneself being the opposite sex, often combined with various sex acts. On this blog, it is well-known that this represents Autogynephilia, a sexual interest in being a woman.
  • A related factor involved attraction to masculine women. It also to an extent involved attraction to feminine women, but my suspicion is that this is due to the factor analysis getting confused by bisexuals. Furthermore, androgynous men seemed to have a secondary loading on this factor. Therefore I emphasized the Androgyny part more than for the Homosexuality factor, and named it Androgyny/Gynephilia.
  • A well-known factor that popped up involved bondage, discipline, dominance, submission, sadism and masochism. Therefore I named it BDSM.
  • Finally, another factor that came up involved things with zoophilic and pedophilic themes, as well as themes involving bodily waste. Furthermore, my experience with looking at other survey data on sexuality makes me have some suspicions about what would also have been included if the items had been there, and that makes me label the factor Disgust/Taboo.

These generally seem like some reasonably interpretable factors. Furthermore, while it was difficult to fit into a diagram, it appeared I could coherently continue the factor analysis further, to 15 factors. An image can be seen here, but to summarize it split as follows:

  • Allosexuality remained as it was before
  • Homosexuality remained as it was before
  • The Disgust/Taboo factor split into three parts; a relatively pure Pedophilia/Ageplay factor, a Bodily Waste factor, and a Nonhuman Anthropomorphic factor.
  • The BDSM factor split into four parts; two relatively pure factors involving Submission/Masochism and Dominance/Sadism, plus the Bodily Waste factor, plus something that appeared to be a Fetishism factor (involving latex and leather)
  • The Paraphilia factor continued with a difficult to interpret generic Paraphilia factor, but it also spun off three other factors, namely a Fetishism factor, a Roleplay factor, and a Transvestism factor.
  • The Sociosexuality factor continued with a general Sociosexuality factor, but seemed to spin off a Roleplay factor as well as a factor that appeared to involve bimbos or body modifications.
  • The Androgyny/Gynephilia factor continued into an Androgyny/Gynephilia factor in the final layer, but it also appeared to reduce interest in a factor seemingly related to Transvestism.
  • The Autogynephilia factor seemed to split into an anatomic Autogynephilia factor and a Transvestism factor.

The full factor analysis can be seen here.

Towards a new general factor of paraphilia (GFP) measure

I have an idea for how to empirically prove that autogynephilia causes gender issues, but in order for the idea to work, I need some variable that influences autogynephilia. This study of paraphilias gives a good candidate for it. To see why, let’s recap some principles.

Almost all paraphilias are positively correlated with each other. This indicates that they have some sort of common underlying causes. If we lump all of these causes together into a single variable, then this variable is usually labelled the General Factor of Paraphilia. Of course, this is not very useful unless we can actually measure the variable. But fortunately, there’s an easy way of measuring it: simply measure a broad variety of the narrower paraphilias, and average them together. Causes that are specific to individual paraphilias will then disappear in the averaging, while causes that are common to all paraphilias will add up.
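The averaging argument can be illustrated with a small simulation. This is a sketch under assumptions I am making up for the example: every item has the same loading on a single general factor, item-specific causes are independent noise, and the number of people, number of items, and loading are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n_people=2000, n_items=12, loading=0.4, seed=0):
    """Return (single item vs. general factor, item average vs. general factor)."""
    rng = random.Random(seed)
    general = [rng.gauss(0, 1) for _ in range(n_people)]
    specific_sd = (1 - loading ** 2) ** 0.5  # keeps each item at unit variance
    items = [
        [loading * g + specific_sd * rng.gauss(0, 1) for g in general]
        for _ in range(n_items)
    ]
    composite = [
        sum(item[p] for item in items) / n_items for p in range(n_people)
    ]
    return pearson(items[0], general), pearson(composite, general)


single_r, composite_r = simulate()
print(f"single item: r = {single_r:.2f}, average of items: r = {composite_r:.2f}")
```

In this setup the item-specific noise shrinks as more items are averaged, so the composite tracks the simulated general factor much more closely than any one item does.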

Because of the rich factor structure discovered in the previous section, it is important to sample paraphilias from a broad variety of factors, so that we don’t just end up measuring the narrower factors. Studies I’ve seen of the topic often fail to do this, and instead seem to sample mainly from the BDSM and Disgust/Taboo factors. In theory, this reduces the accuracy of their general factor measure.

To attempt to do better, I sampled paraphilias from a broad variety of factors. The correlation matrix can be seen here:

Correlation matrix between a broad range of paraphilic interests.

As described in previous posts, I then extracted the general factor of paraphilia, and looked at the correlations that remained after controlling for it. This yielded the following matrix:

Correlations after controlling for the general factor of paraphilia.
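As a rough sketch of what “correlations that remained after controlling for the general factor” can mean, the code below takes loadings on one factor and subtracts the correlations that factor implies. As a simple stand-in I use the first principal component, found by power iteration; this is not necessarily the extraction method used for the post, and the 5-item matrix is hypothetical.

```python
def first_component_loadings(R, iterations=500):
    """Loadings of each item on the first principal component of R."""
    n = len(R)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(R[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(R[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    return [eigenvalue ** 0.5 * x for x in v]


def residual_correlations(R, loadings):
    """R minus the correlations implied by a single factor."""
    n = len(R)
    return [
        [R[i][j] - loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical matrix: every pair correlates 0.25, except that
# items 0 and 1 share extra overlap (0.55).
R = [
    [1.00, 0.55, 0.25, 0.25, 0.25],
    [0.55, 1.00, 0.25, 0.25, 0.25],
    [0.25, 0.25, 1.00, 0.25, 0.25],
    [0.25, 0.25, 0.25, 1.00, 0.25],
    [0.25, 0.25, 0.25, 0.25, 1.00],
]
residuals = residual_correlations(R, first_component_loadings(R))
for row in residuals:
    print(" ".join(f"{x:6.2f}" for x in row))
```

Principal component loadings run higher than those of a proper factor model, so the off-diagonal residuals here skew negative. What matters is their relative size: the pair with extra shared overlap has the largest off-diagonal residual, which is the sort of item pair one would prune.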

To shorten the list and obtain a purer measure, I then removed a number of items due to them being too correlated with other items on the list:

  • Anal penetration was too correlated with bondage, and so I removed anal penetration.
  • Attraction to women wearing men’s clothes was too correlated with autogynephilia. If I was optimizing purely for a measure of GFP, I would remove autogynephilia, but since the idea is to use this in conjunction with autogynephilia measures, I instead removed attraction to women wearing men’s clothes to avoid getting at anything too specific to this.
  • Having a cigarette-smoking partner was not very correlated with the GFP, but was relatively correlated with humiliation masochism, so I removed it.
  • Nyotaimori was too correlated with doctor roleplay, and therefore I removed it.
  • Flashing was not very correlated with the GFP, but was a bit too correlated with other variables for my liking, especially since it is a courtship disorder and therefore might be controversial to ask about in a survey. Therefore I removed it.
  • Bearded women were not very correlated with the GFP, but they were correlated with attraction to nippleless partners and to statues, so to avoid introducing extra noise, I removed them from the list.
  • Similarly, balloons were mostly uncorrelated with the general factor of paraphilia, but were too correlated with latex and were therefore removed.
  • Getting peed on was correlated with humiliation masochism and therefore removed.
  • Making your partner adhere to a diet was mostly independent of the GFP, and was vaguely correlated to a variety of other things, and so was removed.
  • It might be worthwhile to investigate having sex with a religious figure. I removed it at this step because it was correlated with interest in sex with older partners, but as you will see, that item got removed at a later step in this test construction, and therefore this could be revisited.
  • Cat ears were removed because the item correlated with a variety of other items.

This yielded the following items:

New selection of GFP items.

To further evaluate the items, it seemed appropriate to analyze them together with Allosexuality and Sociosexuality items, to ensure that they interact well. Upon doing so, some problems popped up:

Correlations between paraphilia items, sociosexuality items, and allosexuality items.

The bondage item appeared to be strongly associated with sociosexuality and allosexuality. The item about sex with older partners appeared to be strongly associated with sociosexuality. And the item about sucking on your partner’s tongue was strongly associated with allosexuality. Therefore, to achieve a cleaner paraphilia measure, these items were removed. I fit a confirmatory factor model to a reduced set of items, and it seemed to achieve a not-too-terrible fit. Thus the final set of items is:

  • Imagining being a member of the opposite sex
  • Having your partner call you slurs or insults
  • Imagining having sex with a vampire
  • Having your partner wear latex
  • Having a sexual partner with no nipples (blank skin where the nipples would be)
  • Rubbing your genitals on a piece of furniture
  • Having surgery to modify your body to be more erotic to your partner
  • Touching a naked statue
  • Pretending that you are a patient and your partner is your doctor as sexual role play

This is my current best attempt at a brief general factor of paraphilia measure. For the psychometrically inclined, it has an alpha of 0.67, which is not so good and indicates that the measure could use improvement. However, to me it seems like a reasonable starting point to work from.
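For readers unfamiliar with alpha, here is a short sketch of how Cronbach's alpha is computed, plus what an alpha of 0.67 on 9 items would imply about the typical inter-item correlation. The inversion uses the standardized-alpha formula, which assumes roughly equal item variances; that is an assumption of this sketch, not something checked against the survey data. The function names and the toy data are made up for illustration.

```python
import statistics

def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` is a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [statistics.variance(col) for col in zip(*rows)]
    total_var = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def implied_mean_inter_item_r(alpha, k):
    """Invert the standardized-alpha formula alpha = k*r / (1 + (k - 1)*r)."""
    return alpha / (k - (k - 1) * alpha)

# Toy data: three items that agree perfectly, so alpha is at its maximum.
toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(f"alpha for perfectly consistent toy items: {cronbach_alpha(toy):.2f}")

# Average inter-item correlation implied by alpha = 0.67 with 9 items.
print(f"implied mean inter-item r: {implied_mean_inter_item_r(0.67, 9):.2f}")
```

Under the equal-variance assumption, the implied average correlation between items is a bit under 0.2, which fits with the items having been deliberately sampled from different narrow factors.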

Who are the paraphiles?

It might be nice to get some idea of how paraphilias relate to other variables. Let’s start with other sexual interests. It is commonly claimed by Blanchardians that different sexual interests compete, so that if one is more into one thing, then one becomes less into other things. I found no trace of this in the sexuality survey, with the general factor of paraphilia instead being highly correlated with allosexuality all across the spectrum:

Essentially it was rare for participants to be paraphilic without being allosexual. The only form of paraphilia that I found evidence for being negatively associated with allosexuality was the disgust/taboo cluster of paraphilias.

Most likely, the correlation here is underestimated due to sampling effects; since this was a survey with a huge number of sexual fantasies, there wasn’t much reason for asexual or low-libido people to participate, and so they may end up undersampled. On the other hand, reddit has much higher rates of paraphilias than the general population, and this may lead to a higher correlation, due to there being more variance to examine.

I also found paraphilias to be even more correlated with sociosexuality. It might be entertaining to think about whether sociosexuality should be considered to be a paraphilia; it seemed like there were some paraphilias that it ended up closer to than it did to allosexuality.

I also decided to look at some group membership. I’ve heard some anecdotes and seen some studies to suggest that autism might be associated with paraphilias. However, when looking into it, I didn’t find much effect:

Shifts in paraphilic and other sexual interests for autistic men. The left three variables are the general factor of paraphilia, allosexuality, and sociosexuality (measured in standard deviation units), while the variables on the right are the specific paraphilias used to estimate the GFP (measured in absolute units). Black bars represent standard errors in the estimate. The numbers in the title refer to the sample size for autistic vs non-autistic men.

If anything, the main thing characterizing autistic men is that they were much less allosexual than non-autistic men. My hunch is that this is the key; being less allosexual, the proportion of paraphilic to normophilic activities they engage in will be paraphilic-skewed.

Another group of interest would be polyamorous men:

Shift in paraphilic and other sexual interests for polyamorous men.

As can be seen, they are much more sociosexual, but also much more paraphilic, than monogamous men. This matches previous observations that I have seen about polyamorous people having a kink for their partner having sex with someone else.

I’ve seen some people suppose that homosexuality is a paraphilia. However, this doesn’t really seem to be so; or at least, gay men don’t seem all that particularly paraphilic:

Shift in paraphilic and other sexual interests for gay men.

Bisexual men, on the other hand, seem to be more paraphilic:

Shift in paraphilic and other sexual interests for bisexual men.

This matches a hunch I’ve had for a while that bisexuality and homosexuality are more orthogonally related than continuously related. That is, I suspect that bisexuality results from a great level of sexual flexibility, or something like that.

We were also interested in the relationship between paraphilias and intelligence. Anecdotally, there seems to be a correlation between the two, with many of the communities that are highly paraphilic being known to also be highly intelligent. We had two measures of intelligence: first, we asked people if they had ever taken an IQ test, and if so, what their score was; and secondly, we asked if they had taken the SAT, and if so, what their score was. The score was asked in broad buckets, with IQ being scored in buckets of 10 and SAT being scored in buckets of 100. Of the people who reported scores, most reported far above average, so YMMV if you believe that reddit is full of geniuses. But if you do believe the data, then I can say that there was a moderate correlation between the two cognitive scores, at r~0.36. To get an overall cognitive score, I averaged them together.

Scatterplot containing intelligence and paraphilias. To reduce the degree to which points overlap due to low measurement fidelity, I did some slight reweighting before taking the averages so they would be more noisy, but there is probably still overlap.

There was no correlation between intelligence and paraphilias, r~-0.02. So there goes that theory.

Attraction to androgyny

A final thing to investigate is the structure of attraction to androgyny. In surveys I often find I want to ask about attraction to androgynous people, but I don’t know what dimensions exactly to include. On my request, this survey included a bunch of androgynous archetypes, and so I can factor-analyze them:

  • a woman who has a full beard and a lot of body hair
  • an otherwise feminine woman who is mainly into penetrating you using a strapon
  • an assertive, muscular woman with masculine interests (a tomboy)
  • an ambitious career-focused woman who has a high position in a business job
  • an “Amazonian” woman; a woman who is taller and stronger than you are
  • a woman who exclusively wears masculine clothes, has short hair, is socially dominant, coarse, and has masculine interests
  • a nerdy woman who is awkward and not very interested in people
  • a woman who has small breasts and narrow hips
  • a man who has very effeminate, “campy” mannerisms and speech (but who still presents masculine)
  • a physically androgynous man who often wears women’s clothes (a femboy)
  • a very short, narrow-shouldered man with a soft face
  • a sweet/caring unambitious man who wants to be a househusband and start a family
  • an otherwise masculine man who is mainly into being anally penetrated by you
  • a sensitive/emotional artistic man, who is physically slender and tends to daydream
  • a physically masculine man who finds it hot to wear women’s clothes during sex
  • a pre-operative passing trans woman (MtF, a feminine-looking woman with a penis)
  • a pre-operative passing trans man (FtM, a masculine-looking man with a vagina)
  • a passing trans woman who has had surgery to get a vagina (MtF)
  • a passing trans man who has had surgery to get a penis (FtM)
  • a very androgynous person who you can’t tell whether is male or female

For most of the above, the question asked was how arousing the participants would find it to have sex with the archetype. However, for the final archetype the question was how arousing they would find it to make out with someone of that archetype.

I also included a number of nonandrogynous controls:

  • a physically fit man who likes to engage in sports
  • an ambitious career-focused man who has a high position in a business job
  • a nerdy man who is awkward and not very interested in people
  • a sweet/caring motherly woman, who wants to be a housewife and start a family
  • a female cheerleader
  • an artistic, feminine woman

Overall, I found I could squeeze four factors out of it: attraction to men, to women, to masculine women, and to trans/androgynous people.

Bass-ackwards analysis of the archetype items for men. The factor analysis can be seen here.

When looking at the data, I got the impression that there was some nonlinear structure that couldn’t be accounted for by the factor analysis. Perhaps it’s just bisexuals being more into androgyny, but it might be worth looking into in the future.


This is a rich dataset, and I’ve probably only scratched the surface. If there’s anything specific you want me to investigate, consider contacting me on discord via tailcalled#7006. I’m likely to also make further blog posts in the future on the basis of this dataset. This is a pretty big and aimless blogpost, so I have to find some way to end it, and I’m deciding to do so here.

1. Strictly speaking I used principal component analysis rather than factor analysis. PCA tends to yield nearly identical results to FA, but is computationally more readily available.

2. The items originate from a variety of sources. Some were included due to having been used with success in previous surveys. I suggested some because I wanted to study attraction to androgyny. Pasha included some to study “pairs” of self-related and other-related items (e.g. attraction to bimbos vs to being a bimbo). In order to get a broad sample, I also used GPT-3 to brainstorm items, with me picking the most plausibly relevant ones out of a big set.

Revisiting the instrumental variables strategy for testing AGP GD causation

Autogynephilia correlates with cross-gender ideation, gender dysphoria, and other gender issues. Usually Blanchardians attribute this to autogynephilia causing gender issues, but critics point out that correlation!=causation, and often argue that it is instead gender issues that cause autogynephilia, because someone who wants to be a woman would also want to engage in sexual activities as a woman and such.

A while ago, I had the idea that we could test the causal relationship between AGP and GD by looking at people who are more or less kinky. Specifically, the idea was that while some could imagine that wanting to be a woman would cause autogynephilia, it wouldn’t make much sense for it to cause kinkiness in general. Therefore, if we observe a correlation between kinkiness and gender issues, it would make most sense for this to be due to a kinkiness -> AGP -> GD effect, and therefore it would support an AGP -> GD causality. I found such an association, and therefore concluded that there was support for the AGP -> GD effect.

Shortly after I wrote the post, Michael Bailey sent me an email criticizing it by pointing out that applying instrumental variables in this way can be problematic, linking to a paper where he made the critique in detail. Which in retrospect is pretty obvious; I even emphasized these sorts of problems in my blog post, but perhaps I didn’t take them seriously enough, considering that I did still attempt to do this.

I think I’ve come up with a way to fix the method, and test the AGP -> GD effect in a much more solid way. This blog post intends to give an introduction to this concept; I still need more data before I can definitely test it, but I can use the previous data as an illustration.

Empirical causal inference in science 101

The main point of doing research is to uncover causal relationships. A common problem in science is that you’ve got two variables X and Y (in this case, AGP and GD), and you want to figure out the causal effect of X on Y. To solve this problem, a broad range of methods have been developed. Enumerating them all can be daunting, but luckily they mostly tend to follow a pretty consistent formula: To identify the effect of X on Y, you isolate some cause of X and look at how Y varies as this cause varies. So for instance, when you do a randomized controlled experiment, the cause of X that you isolate is your experiment, and then you look at how Y varies from your control group to your experimental group.
Most forms of quantitative causal inference between variables X and Y involve finding some cause Xc of X that doesn’t suffer from problems due to confounding or reverse causation. See the blog post for details.

The core assumption that this method makes is that the cause you isolate is not correlated with the outcome of interest, other than via its effect on X. Putting the case of autogynephilia and gender dysphoria into this framework, my strategy was to isolate general kinkiness as a cause of autogynephilia, and then look at how gender dysphoria varies between non-kinky and highly kinky people. But one could easily question whether the assumption holds here; for instance, you might suspect that people who are more sexually open-minded are both more kinky and more likely to want to be the opposite sex. Or really, lots of other things.

In particular, part of the problem is that “kinkiness” is a particularly difficult sort of variable to use for this approach. If I take the average interest across a wide range of sexual interests, then the variable I am measuring is “whatever things contribute to a wide range of sexual interests”. This is a pretty unbounded category of causes; while I have trouble thinking of any one single thing that would go into it (libido maybe?), it also seems unlikely that this is definitely going to be unconfounded. My plan after writing the blog post was to start investigating these sorts of hypotheses, searching for confounders and adjusting for them. But ultimately the problem is that you only need a very tiny violation of the assumptions to get wrong results, and therefore this is not a viable strategy.
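To illustrate how a tiny violation can matter, here is a minimal simulation sketch of the "isolate a cause of X" recipe in its simplest ratio form, cov(Z, Y) / cov(Z, X). Everything here is hypothetical: the variable names, the first-stage strength of 0.37, and the size of the leak are illustrative choices, and the true effect of X on Y is set to zero on purpose.

```python
import random
import statistics

def cov(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def iv_estimate(true_effect, first_stage, direct_leak, n=40000, seed=0):
    """Simulate Z -> X -> Y and return the ratio estimate cov(Z, Y) / cov(Z, X)."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)      # the isolated cause (e.g. general kinkiness)
        hidden = rng.gauss(0, 1)  # ordinary X-Y confounder that the method bypasses
        xi = first_stage * zi + hidden + rng.gauss(0, 1)
        # direct_leak is the assumption violation: Z reaches Y without going through X.
        yi = true_effect * xi + hidden + direct_leak * zi + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return cov(z, y) / cov(z, x)

clean = iv_estimate(true_effect=0.0, first_stage=0.37, direct_leak=0.0)
leaky = iv_estimate(true_effect=0.0, first_stage=0.37, direct_leak=0.05)

print(f"estimate with a valid instrument:        {clean:.2f}")
print(f"estimate with a 0.05 leak from Z into Y: {leaky:.2f}")
```

The bias is roughly the leak divided by the first-stage strength, so a weak link between the isolated cause and X magnifies even small violations. In this sketch a leak of 0.05 shifts the estimate by about 0.05 / 0.37, even though the true effect is zero.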

This is a general problem with figuring out the AGP <-> GD causality

I investigated the causal direction using general kinkiness as a root cause, but there are other attempts to figure out AGP <-> GD causality that fall into the same general category and encounter the same problems.

Consider for instance time as a cause of autogynephilia. Kids are, for complicated evolutionary reasons, not very sexual, with libido instead firing up at puberty. As such, Blanchardians might want to use the contrast between childhood gender issues and adulthood gender issues as a measure of the contribution of autogynephilia.1 This can be critiqued in a lot of ways, but perhaps the best critique is to point out that it’s far from obvious that this is unconfounded. Puberty is also a time where a lot of sexual differentiation happens, and where gender-related topics become relevant in new and different ways, so it’s very far from obvious that this is an unconfounded measure of the effect of AGP.

Another example involves relationship status. An AGP researcher I’ve talked to argued that you could use the differences in autogynephilia and gender issues between times where an autogynephile is single and times where the autogynephile is in a romantic relationship to estimate the effect of autogynephilia on gender issues.2 The idea is that some autogynephiles feel that they are more autogynephilic when they don’t have a girlfriend. Leaving aside the issue that I am kinda skeptical of the effect of relationship status on autogynephilia, it seems far from obvious to me that relationship status doesn’t influence gender issues through other means. It seems to definitely influence the pros and cons of transitioning, and it seems like someone who has more opportunity to transition would also have a greater interest in doing so. Which makes relationship status an invalid variable to use to estimate these things.

I think the problem pops up all the time in these debates: HRT, random variation in GD over time, shifts in GD when seeing or thinking about sexy women, etc. Almost all the back-and-forth arguing about the validity of AGP models comes down to the issue that we’re trying to parse out causality from a bunch of related proxy variables, without having a definite idea of how these variables function.

It is worth saying that the problem is not that we know some specific confounding variable that makes the tests invalid. Rather, the bigger problem is that we have no idea how the variables are related, so there could easily be tons of confounders and unintended mediators that we don’t understand. These sorts of methods shouldn’t be taken lightly, with all the arguments mindlessly thrown at the wall to see what sticks. Rather, we need to take a step back and identify some more well-justified method for studying this.

Image: meme with the text "ENDOGENEITY: Me adding more controls to my regression".

Recently, I decided that this whole class of methods was inherently flawed for investigating things, and looked into alternate methods of causal inference, most notably analogy-based reasoning. For instance, one such argument would be “we know autogynephilia is a sexual interest, and sexual interests cause desires, rather than being caused by the desires”. But these alternate methods have their own new and exciting difficulties to struggle with, so I haven’t been able to do anything definite with them. But as I mentioned in the beginning of the post, I’ve come up with a way to fix the standard approach for causal inference, so let’s get around to this.

Maybe we should just investigate how our causes work

So back to the matter at hand. We want to know the effect of autogynephilia on gender dysphoria. So to do this, we look at the causes of autogynephilia, and identify general paraphilic tendencies as a cause. But the problem is, we don’t know how general paraphilic tendencies work, so maybe they have some hidden correlation with gender dysphoria (e.g. via sexual open-mindedness) that makes our tests invalid.

The problem illustrated diagrammatically. Each node represents a variable, and the arrows represent causal effects while the lines represent unknown effects. GFP refers to the general kinkiness variable that we estimate by asking people about a bunch of unrelated paraphilias. ??? refers to hidden confounders that may make our analysis invalid.

In fact, if we knew the strength of the hidden correlation, we could just subtract it off in order to make our tests valid again. There are some asterisks here that should be taken into account, but I think it’s at least a promising path forward. But that raises the question, how do we figure out the hidden correlation between general paraphilia and gender issues?

The obvious way to figure out whether this is the case would be to correlate general paraphilic tendencies with gender issues. If there is some sort of connection between them, then that connection should show up as a correlation between the two. But of course the problem is, the connections between paraphilias and GD also include the kinkiness -> AGP -> GD connection, which is precisely the connection that we want to estimate. We would end up subtracting the correlation from itself, yielding zero no matter what.

So is there some way that we can figure out the kinkiness <-> GD correlation, minus the kinkiness -> AGP -> GD path? Here’s my idea: Just look at the correlation between kinkiness and GD in non-AGP men. If the men aren’t AGP, then the kinkiness -> AGP -> GD path cannot be in play. Next, subtract this off from the correlation between kinkiness and GD overall, and you get your causal estimate.

Rather than investigate the associations among all men, we can simply investigate the associations among non-AGP men. This doesn’t include the kink -> AGP -> GD path, allowing us to investigate potential confounders.
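Here is a minimal sketch of the subgroup comparison, assuming the version used in the results later in this post: the kinkiness-GD correlation among AGP respondents minus the same correlation among non-AGP respondents, with a bootstrap standard error. The data are simulated stand-ins, and the function names and generating model (no confounding, AGP censored at zero) are hypothetical.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def stratified_path(kink, agp, gd):
    """Return (difference, r among AGP, r among non-AGP) for the kink-GD correlation."""
    non = [(k, g) for k, a, g in zip(kink, agp, gd) if a == 0]
    yes = [(k, g) for k, a, g in zip(kink, agp, gd) if a > 0]
    r_non = pearson(*zip(*non))
    r_agp = pearson(*zip(*yes))
    return r_agp - r_non, r_agp, r_non

def bootstrap_se(kink, agp, gd, reps=100, seed=0):
    """Standard error of the difference, resampling respondents with replacement."""
    rng = random.Random(seed)
    n = len(kink)
    diffs = []
    for _ in range(reps):
        idx = rng.choices(range(n), k=n)
        sample = ([kink[i] for i in idx], [agp[i] for i in idx], [gd[i] for i in idx])
        diffs.append(stratified_path(*sample)[0])
    return statistics.stdev(diffs)

# Simulated stand-in data: kink -> AGP -> GD with no hidden confounding.
rng = random.Random(2)
n = 4000
kink = [rng.gauss(0, 1) for _ in range(n)]
agp = [max(0.0, 0.6 * k + 0.8 * rng.gauss(0, 1)) for k in kink]
gd = [a + rng.gauss(0, 1) for a in agp]

diff, r_agp, r_non = stratified_path(kink, agp, gd)
se = bootstrap_se(kink, agp, gd)
print(f"r among AGP: {r_agp:.2f}, r among non-AGP: {r_non:.2f}")
print(f"difference: {diff:.2f} (bootstrap SE {se:.2f})")
```

Because the simulated data contain only the kink -> AGP -> GD path, the non-AGP correlation should sit near zero and the difference should be positive. With real data a nonzero non-AGP correlation would be read as the confounded part.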


I figured this method out a while ago now, and I had actually intended to do a separate survey to collect new data to test it. But then I started getting distracted, and I figured, hey, I’ve got the previous porn survey that I originally tested this method in, I might as well try it again on that data. Later we will discuss some reasons why this survey isn’t ideal, but it seems like a reasonable starting point.

So a bit of background, the dataset I’m going to analyze comes from a survey I posted to /r/SampleSize, titled “[Casual] Can you look at some porn For Science? Survey #5 (18+) NSFW”. In the survey, I showed people various erotic images containing men and women doing various erotic things. In addition to this, I also asked a number of questions, including questions about sexual interests and gender issues. I got about 1000 male responses, making it quite a large sample size. Which is good, because this method is incredibly data-intensive.

To measure general paraphilia, I had some items measuring sexual interests by asking about arousal on a rating scale from “Not at all” to “Very”. I took the average response to how aroused the participants said they would get by the following themes (alpha=0.52):

  • Being tied up by your partner
  • Exposing my genitals to an unsuspecting stranger
  • Watching a video of yourself masturbating
  • Having an older sexual partner take on a dominant parent-like role in the relationship
  • Imagining having sex with an anthropomorphic animal (furry)
  • Caressing your partner’s feet

To measure autogynephilia, I took the average response to how aroused the participants said they would get by the following themes (alpha=0.81):

  • Imagining being the opposite sex
  • Wearing clothes typically associated with the opposite sex (crossdressing)
  • Picturing a beautiful woman and imagining being her
  • Wearing sexy panties and bras
  • Imagining being hyperfeminized, i.e. turned into a sexy woman with exaggeratedly large breasts and wide hips

Those who answered “Not at all” to all of the above were categorized as non-AGP (n=316), while the remainder were classified as AGP (n=828).

To measure gender dysphoria, I had some items that asked about how masculine/feminine the participants were, with a rating scale going from “Disagree Strongly” to “Agree Strongly”. Among those, I used the following two to assess gender issues (alpha=0.61):

  • As a child I wanted to be the opposite sex
  • I feel I would be better off if I was the opposite sex

Among non-AGPs, the correlation between GFP and GD was 0.02 (with a standard error of 0.06 according to bootstrap). This could be taken to indicate that there was no confounding between GFP and GD at all, though make sure to read the rest of the blog post to see an asterisk with this interpretation. Among AGPs, the correlation between GFP and GD was 0.15 (SE 0.03). Therefore, subtracting them yielded a correlation of 0.13 (SE 0.07).

This 0.13 number is pretty low, but it is the value for the GFP -> AGP -> GD path, not for the AGP -> GD step of it. To get the latter, I divide out by the GFP <-> AGP correlation among AGP men. This is a correlation of 0.37 (SE 0.03), yielding a value of 0.35 (SE 0.2) as the causal effect of autogynephilia on gender issues among autogynephiles.

This effect is technically not an effect for the whole sample, but instead only among the subset that are autogynephilic. I can assume it simply linearly extrapolates to the entire sample, in which case I get a total effect of 0.36 (SE 0.2). If I subtract this off from the original correlation of 0.45 (SE 0.03) between autogynephilia and gender issues, that leaves an effect of 0.1 (SE 0.2) that isn’t explained by the AGP -> GD effect. So this examination indicates that 80% of the correlation between autogynephilia and gender issues is causal AGP -> GD.
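The arithmetic in the last few paragraphs can be retraced as follows. This is a sketch using only the rounded numbers quoted above. The standard errors are combined with simple approximations (independent subgroups, and ignoring the uncertainty in the 0.37 first-stage correlation), which are assumptions of this sketch and may differ from the bootstrap actually used. The extrapolated total effect of 0.36 is taken as given from the text rather than recomputed.

```python
# Correlations between GFP and GD in each subgroup, as quoted in the text.
r_non, se_non = 0.02, 0.06  # non-AGP men
r_agp, se_agp = 0.15, 0.03  # AGP men

# Step 1: subtract to isolate the GFP -> AGP -> GD path.
path = r_agp - r_non
path_se = (se_non ** 2 + se_agp ** 2) ** 0.5  # assumes the subgroup estimates are independent

# Step 2: divide by the GFP <-> AGP correlation to get the AGP -> GD step.
first_stage = 0.37
effect = path / first_stage
effect_se = path_se / first_stage  # ignores uncertainty in first_stage

# Step 3: compare the extrapolated total effect with the raw AGP <-> GD correlation.
total_effect = 0.36  # extrapolated value quoted in the text
raw_correlation = 0.45
remaining = raw_correlation - total_effect
causal_share = total_effect / raw_correlation

print(f"path = {path:.2f} (SE {path_se:.2f})")
print(f"AGP -> GD effect among AGP men = {effect:.2f} (SE {effect_se:.2f})")
print(f"remaining correlation = {remaining:.2f}, causal share = {causal_share:.0%}")
```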


And here’s the bad news: the effect of 0.36 is not statistically significant. That’s not to say that it’s too “small” to be important or something like that. Rather, statistical significance is a technical term used to describe when the sample size is big enough that it would be hard for the result to have been achieved by chance, just from randomly picking people who happen to align with the theory. In order for a result to be statistically significant, it must be the case that if there were no effect, you’d only get results as extreme as that result 5% of the time. But that would require our effect to be greater than 0.4, which it is not.

The good news is that the remaining correlation of 0.1 also wasn’t statistically significant. It would have to exceed 0.38 to be significant, which it very much did not.
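As a rough check on the significance statements, here is a sketch of a two-sided z-test using the rounded estimates and standard errors quoted above. It assumes a normal approximation, and because only rounded SEs are available, the thresholds it prints will not exactly match the ones in the text.

```python
from statistics import NormalDist

def two_sided_p(estimate, se):
    """Two-sided p-value for the null of zero effect, under a normal approximation."""
    z = estimate / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

critical_z = NormalDist().inv_cdf(0.975)  # cutoff for a 5% two-sided test

results = {}
for label, estimate, se in [
    ("AGP -> GD effect", 0.36, 0.2),
    ("remaining correlation", 0.1, 0.2),
]:
    p = two_sided_p(estimate, se)
    threshold = critical_z * se  # smallest estimate that would reach p < 0.05
    results[label] = p
    print(f"{label}: p = {p:.2f}, would need to exceed {threshold:.2f}")
```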

What this lack of significance means is that this survey isn’t the final step in the story. We need to collect more, bigger data. Compared to just going with the direct correlation, this method needs very large sample sizes. I would estimate that this method requires about 15x as many participants as the more straightforward methods, though it depends very much on the details.

We also need better data. The paraphilia and gender issues measures used in this survey were very low-quality. I’ve been working on better measures, but I could still use improvements. The autogynephilia measure is also kind of ad-hoc, and could benefit from more coherence and thought.

It may also help to get more controls. If we can better account for other factors that influence gender dysphoria, then that can let us estimate the effects more precisely for autogynephilia. It may also be that we can somehow combine this with my analogy-based methods to improve things.

It should also be noted that this method can be used for other things than autogynephilia theory too. For instance, it could likely be used to test the “autoandrophobia” theory that is often brought up by critics of autogynephilia. This theory is rarely explicated, but I did once talk with a trans woman who gave me her idea of it. In that variant, people end up with certain random things that they are disgusted by, similar to how people end up with certain random things that they find erotic; and if one then ends up finding having male traits to be disgusting, then that would cause gender dysphoria. This theory could be tested by replacing the general factor of paraphilia with a general factor of disgust sensitivity, and replacing autogynephilia with autoandrophobia.

Finally, let’s discuss the potential problems and assumptions with this method. This is going to get technical, so be warned.

Conditioning is not a counterfactual

This first point is kind of abstract, so let’s instead discuss my favorite statistical paradox, Berkson’s paradox. I like the examples given in this twitter thread: Why are handsome men jerks? Why don’t standardized test scores predict university performance well? Why are movies based on good books usually bad? Why are smart students less athletic? Why do taller NBA players not perform better at basketball?

Stolen slide illustrating Berkson’s paradox. By selecting a subset of the population, you introduce a negative correlation between the variables you select on.

If we filter our sample on the basis of some set of variables, then that filtering introduces a ton of spurious correlations between all of the variables that are upstream of our filtering. The usual pattern will be negative correlations between the causes, but we might have other things going on, depending on the specific details.
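A minimal simulation sketch of the filtering effect, using two made-up traits that are independent in the full population. The trait names and the selection rule (keep people whose combined score exceeds 1) are hypothetical.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(4)
n = 20000
trait_a = [rng.gauss(0, 1) for _ in range(n)]
trait_b = [rng.gauss(0, 1) for _ in range(n)]  # generated independently of trait_a

r_all = pearson(trait_a, trait_b)

# Filter on a downstream variable: only keep people whose combined score is high.
selected = [(a, b) for a, b in zip(trait_a, trait_b) if a + b > 1]
r_selected = pearson(*zip(*selected))

print(f"full population:  r = {r_all:.2f}")
print(f"selected subset:  r = {r_selected:.2f} (n = {len(selected)})")
```

Even though the two traits are unrelated in the full population, the selected subset shows a clearly negative correlation, because anyone who got in with a low score on one trait must have had a high score on the other.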

So when we compute things for the non-AGP and AGP men separately, we may very well introduce some additional correlations that don’t correspond to anything real. How big of a problem is this? Lemme give you my threat model, to evaluate what happens.

Threat model: AGP merely reflects a kinky way to express gender feelings. The association between the GFP and GD is not due to GFP -> AGP -> GD, but instead due to some underlying common cause, e.g. sexual open-mindedness or something abstract like that.

The most common critique of the AGP->GD hypothesis is to claim that it makes more sense for there to be a GD->AGP effect. If we then filter for those who are not AGP, then that seems like it should lead to exactly the sorts of classical Berkson’s paradox effect that I’ve brought up here: You would only be included in the sample if you are not AGP, which you would be unlikely to be if you were both kinky and GD, so you’d have to either be neither kinky nor GD, be only kinky, or be only GD. Further, if you were only GD, then you would probably need to be less kinky than average to cancel it out, while if you were only kinky, then you would probably need to be less GD than average to cancel it out. So this could explain why we got a correlation of 0.02 between kinkiness and gender issues among non-AGPs; maybe the “true” correlation was higher, but it was masked by the filter effect.

So that seems like a problem. But this isn’t the only filtering we did. We also looked at the correlation between kinkiness and GD among AGP men, and subtracted off the correlations from each other. Thus, if the Berkson’s paradox effect is equally big for both of them, it should cancel out. Could that be the case? And if it isn’t the case, could we estimate the discrepancy and adjust for it?

Here’s one condition where it would be the case: All of the variables are normally distributed and linearly related, and when we filter for non-AGP men, we take the men who have below-average amounts of AGP, while when we filter for AGP men, we take the men who have above-average amounts of AGP. Because we’d then be filtering equally strongly when we took the below-average and above-average AGPs, it would exactly cancel out, and there would be nothing to be concerned about.

The problem with this condition is that it’s obviously wrong. For instance, the distribution of AGP looks like this:

That looks extremely non-normal to me.

But there are many ways that it could be rescued. Suppose, for instance, that you believe the participants see being a woman as having some degree of eroticism, which may be negative or positive, and suppose that a man ends up AGP if he sees being a woman as having a positive degree of eroticism. In that case, you’d expect to see some sort of distribution similar to the above, where there’s a large spike around 0, and a distribution above this. Further, if you believe that there are many factors that influence the latent eroticism (and you almost must, considering that we can’t find any factors that predict AGP), then it seems reasonable to suppose that this is normally distributed, as tends to happen in polyfactorial cases due to the central limit theorem. So in this model you would have AGP expressed as follows:

AGP = max(0, kinkiness + gender issues + ..?other factors?..)

An alternative would be a conjunctive model. The previous model assumes that if there is some factor that influences the latent eroticism of being a woman strongly enough, then that factor alone can cause AGP, by overpowering the other factors. But what if instead you think that factors need to interact to cause AGP? A simplistic example might be that you are AGP if you are kinky and open to being a woman; but other more nuanced models are possible. Here you would express AGP as a product:

AGP = kinkiness * gender issues * ..?other factors?..

(Here, all of the factors would need to be positive; otherwise you get bizarre inversions where if a factor gets negative then all of the other factors end up having the opposite effect.)

It turns out that these models are approximately isomorphic! Specifically, first notice that the maximum function and the exponential function have approximately the same shape for small input values:

Shapes of the maximum function and the exponential function.

Therefore, we can approximately replace the first model with the following:

AGP = exp(kinkiness + gender issues + ..?other factors?..) = exp(kinkiness) * exp(gender issues) * …

Applying the exponential function to the other factors is exactly what is necessary to turn them strictly positive, as is expected by the conjunctive model. Overall I’ve spent a lot of time thinking of different models for how things could interact, and most of them seem like they end up approximately isomorphic to this model (though I’m open to hearing counterexamples if you have any), so I think it’s probably okay to use.3
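The algebraic step behind that rewrite can be checked numerically. This is only a sketch with made-up factor values; the variable names mirror the formulas above and nothing here comes from the survey data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical latent factors, which may be negative or positive.
kinkiness = rng.normal(size=5)
gender_issues = rng.normal(size=5)
other = rng.normal(size=5)

# Additive model pushed through the exponential function...
additive_then_exp = np.exp(kinkiness + gender_issues + other)
# ...equals a product of exponentiated factors, as in the conjunctive model.
product_of_exps = np.exp(kinkiness) * np.exp(gender_issues) * np.exp(other)

print(np.allclose(additive_then_exp, product_of_exps))
# Each exponentiated factor is strictly positive, even where the raw factor is negative.
print(bool((np.exp(kinkiness) > 0).all()))
```

This only verifies the exact identity between the exponential form and the product form. It says nothing about how good the approximation of the maximum function by the exponential function is.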

So to recap, what this implies is that the Berkson’s paradox effect will be equally big if we filter equally hard on the AGP category and on the non-AGP category, which will happen if we have equally many in each of the categories.
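A toy simulation can illustrate why the split matters. The sketch below assumes a normally distributed latent liability with a threshold (as in the max(0, ...) model), independent causes, and arbitrary coefficients; the function name and the 15% figure are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400_000

# Toy assumption: independent, normally distributed causes of a latent liability.
kink = rng.normal(size=n)
gd = rng.normal(size=n)
latent = kink + gd + rng.normal(size=n)

def correlations_by_group(share_agp):
    """Kink/GD correlation among non-AGPs and AGPs when the top
    `share_agp` fraction of the latent liability counts as AGP."""
    cutoff = np.quantile(latent, 1.0 - share_agp)
    is_agp = latent > cutoff
    r_non_agp = np.corrcoef(kink[~is_agp], gd[~is_agp])[0, 1]
    r_agp = np.corrcoef(kink[is_agp], gd[is_agp])[0, 1]
    return r_non_agp, r_agp

for share in (0.50, 0.15):
    r_non, r_agp = correlations_by_group(share)
    print(f"{share:.0%} AGP: non-AGP r = {r_non:+.3f}, AGP r = {r_agp:+.3f}")
```

With an even split, the two filter-induced correlations are about the same size, so subtracting one from the other cancels them. With an uneven split they differ noticeably, so the subtraction leaves a spurious remainder even though this toy model contains no true kink/GD association at all.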

And that’s actually part of the problem with the porn survey. I had 316 men in the non-AGP category, and 828 men in the AGP category, so that means only 28% of the respondents were non-AGP. Meanwhile, in the general population, about 3%-15% of men are AGP and the rest are non-AGP. So in neither case would I end up with an even split. However, on reddit, the proportion of AGPs is actually often quite close to 50%, so it might be doable there. (I’m not sure what happened in the porn survey – I suspect it’s just that AGPs are horny.) Otherwise, it might also be interesting to look into whether there are any mathematical ways to adjust for the asymmetry.

Nonlinearities kill

Part of the assumption made in this method is that whichever confounders there may be between kinkiness and gender issues work the same way in AGPs and non-AGPs. If this is true, then I think the approach is in pretty good standing. However, what if they don’t? Suppose for instance that we have some sort of situation like this:

That is, suppose gender dysphoria is caused by some sort of neurological feminization (it’s not particularly important that this is so, but I had to pick some concrete variable), and suppose that gender issues arise from this. But suppose further that sexual openmindedness (or whatever, the particular variable isn’t very important) moderates this effect, such that the effect of ladybrains on gender issues is stronger for those who are sexually openminded (maybe the others repress, or are unwilling to admit their gender issues, or whatever).

In that case, AGPs would be more likely than non-AGPs to have ladybrains, and therefore the confounding between GFP and GD would be stronger for them. Which would lead to my method concluding that AGP causes GD, even though in this case it doesn’t.

It would probably be a good idea to evaluate how sensitive this method is to nonlinearities. In addition, ways of making it more robust should be evaluated. Further, in the context of nonlinearities, it should be noted that the method sort of relies on something nonlinear-like going on. I split on the basis of AGP vs non-AGP, with the logic being that the GFP can’t influence AGP among non-AGPs. But for there to be some context where the GFP can’t influence AGP, there must be a nonlinear relationship between the GFP and AGP.

Estimation shenanigans

When I computed the effects, I did all sorts of subtractions of correlations and such from each other. This isn’t strictly valid; the correct way to adjust for the confounding between GFP and GD depends on the nature of how the confounding works, leading to a spectrum of possible adjustments. Furthermore, if variances differ (for instance, there’s more variance in AGP among AGPs than among non-AGPs, as non-AGPs have 0 variance in AGP), then using correlations rather than regression coefficients is invalid.
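The point about variances can be illustrated with a toy example. In this sketch the true slope is the same in both groups and only the spread of the predictor differs; the slope of 0.5 and the standard deviations are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
slope = 0.5  # hypothetical true effect of x on y, identical in both groups

def corr_and_slope(x_sd):
    """Simulate y = slope * x + noise and return (correlation, regression slope)."""
    x = rng.normal(scale=x_sd, size=n)
    y = slope * x + rng.normal(size=n)
    r = np.corrcoef(x, y)[0, 1]
    b = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    return r, b

r_narrow, b_narrow = corr_and_slope(x_sd=0.5)  # little variance in x
r_wide, b_wide = corr_and_slope(x_sd=2.0)      # lots of variance in x

print(f"narrow x: r = {r_narrow:.2f}, slope = {b_narrow:.2f}")
print(f"wide x:   r = {r_wide:.2f}, slope = {b_wide:.2f}")
```

The regression slope comes out about the same in both groups, while the correlation is much larger in the group with more variance in x. Subtracting correlations across groups with different variances would therefore suggest a difference in effect where there is none.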

In fact, if I take this last point into account and reevaluate the coefficient from the data, then I get an effect size of 0.38 (SE 0.17), which just barely manages to be statistically significant. But this isn’t the only estimation shenanigan I did, and in order for the results to be believable, it would be good to go through and see if the estimation can be made more accurate. In cases where we don’t have sufficient information to make it more accurate, we should try varying the assumptions to see how sensitive it is to them.
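To make "just barely" concrete, here is a back-of-the-envelope check of the reported estimate. It assumes a normal approximation for the test statistic, which is an assumption of this sketch and not necessarily the test that was actually used.

```python
import math

estimate = 0.38   # effect size reported above
std_error = 0.17  # standard error reported above

z = estimate / std_error
# Two-sided p-value under a normal approximation (assumption of this sketch).
p_two_sided = math.erfc(abs(z) / math.sqrt(2))
# Approximate 95% confidence interval under the same approximation.
ci_low = estimate - 1.96 * std_error
ci_high = estimate + 1.96 * std_error

print(f"z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
print(f"approximate 95% CI: [{ci_low:.2f}, {ci_high:.2f}]")
```

Under this approximation z is a little above 2.2, so the p-value falls below 0.05 but not by much, and the lower end of the interval only just excludes zero.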

Overall, due to all of these complications, this should merely be seen as a proof of concept, and not necessarily as a finished, definite solution. But I think the trick I presented in this post, of comparing the effect in AGPs and non-AGPs, makes me more open to the possibility that this class of methods for causal inference may be workable for deciding the validity of AGP->GD causality.

1. From the perspective of Blanchardian theory, what would be most convenient would be if AGPs didn’t have any childhood gender issues at all, because this would seriously cast doubt on the possibility of GD -> AGP. However, pursuing this argument is not very viable, because when pressed, Blanchardians admit that often times, AGPs do have some gender issues in childhood.

Blanchardians argue that this may be analogous to how children sometimes end up with childhood crushes, with the childhood gender issues corresponding to a sort of romantic ideation. Which, sure, whatever, seems like a fair enough possibility. But it complicates the idea of using time as a cause of autogynephilia for causal inference, and Blanchardians should stop making this argument.

2. The idea behind this proposal is that autogynephilia and gynephilia “compete”; at times where someone is more sexually engaged with women, they don’t have enough “left-over attraction” to be attracted to being women. I have not seen much convincing theory or hard data supporting this; as far as I can tell, it’s solely based on some clinical anecdotes. I don’t really buy it, which makes me extra critical about using it to estimate these things.

3. One interesting thought that comes up here is the question of, if there’s a continuous liability of eroticizing being female, is it really only the positive part that affects things? For instance, you could imagine that the negative part represents finding AGP themes to be a direct “turnoff”. But the estimation method I came up with ends up assuming that there is no effect in the negative part of the spectrum, and attributing any effect that is found there to confounding. From a theory point of view, if there is such a thing as “negative AGP”, then that would obviously disprove Blanchardianism.

Autogynephilia and masochism: A tale of two assessment biases

Conventional wisdom says that autogynephilia and masochism form a particularly closely correlated set of sexual interests. Recently, I’ve been arguing that actually, conventional wisdom seems wrong, and might be an artifact of some assessment biases. However, I’ve now also come to find a bias in the opposite direction in the dataset I’ve usually used to argue against it, so this puts some nuance on things. I still believe that conventional wisdom is wrong, as I will argue here. Let’s take a look.

In what ways does autogynephilia look correlated with masochism?

Autogynephilia supposedly looks correlated with masochism. Let’s take a look at some specifics.

Probably the obvious authority to look to would be Ray Blanchard, who coined the term “autogynephilia”. One paper written by Chivers and Blanchard found that prostitutes who advertise for crossdressers are those who advertise themselves as dominant. In his book on GID, Blanchard also referenced a number of lines of evidence, including common findings that men engaging in erotic asphyxiation are crossdressing; other observers finding, especially in case studies, that there was an overlap between autogynephilia and masochism; and a study of members of various kink societies finding a great deal of overlap between groups.

A lot of autogynephilic erotic material also seems to take on a masochistic form. There are themes such as forced feminization, in which males are made to crossdress and behave femininely against their will. Another stereotypical form of erotic media for AGPs is transgender transformation, and communities for these are often also masochistic or at least submissive.

Autogynephilia in particular pops up in the case of trans women, and so it might be worth considering what trans women are like. Compared to other groups, trans women seem particularly masochistic:

Self-reported masochism in different groups of people. As can be seen, trans women are particularly masochistic, indicating a connection.

The results seem pretty consistent; how might they possibly all be wrong?

The many problems with these methods

If we were merely interested in whether highly-AGP communities also tend to be more masochistic than average, then the above would be pretty definitive. However, we are interested in something more subtle, and the previous associations are completely useless at determining this more subtle thing.

I would claim that what we are really interested in is whether autogynephilic males – i.e. males with a sexual interest in being women – are particularly likely to be masochistic compared to whichever other ways they might be paraphilic. This leads directly to the first problem with the previous findings: All paraphilias correlate; an AGP man isn’t just more likely to be masochistic, he is more likely to be everything.

Correlation between arousal to a variety of paraphilias in a survey run by the admin of /r/AskAGP. Note how AGP correlates with everything from swinging to pee fetish to furryism to incest. (Beware – do not take this data too seriously; see the rest of the post for more details.)

This is called the general factor of paraphilia (GFP), and it complicates any attempts at reading anything into correlations between AGP and masochism; obviously they’re going to correlate, since everything correlates, but that’s not exactly theoretically interesting.

The general factor of paraphilia is one potential bias, but it is not the only potential bias. There may, for instance, also be community-based effects. If you are examining trans women’s masochism to test the relationship between autogynephilia and masochism, you are assuming that masochism is not related to transsexuality except via autogynephilia. Is that assumption true? Do we know that masochism isn’t confounded with transsexuality in some other way? It seems like quite an unfounded assumption to me; before one starts using transsexuality to study it, one should first establish that this assumption is justified. (I’ve tried to see if masochism predicts gender issues after controlling for AGP in my surveys, and I’ve gotten mixed results so far. Needs more research.)

Or consider another option: founder effects. Suppose that some community of AGPs is founded by a masochist. In that case, they might share masochistic AGP material, and this would influence which sorts of other members end up joining. Or how about sociological effects? Suppose that males tend to be embarrassed about their AGP. Generally, that may make them avoid getting associated with it, avoid communities and such – except, if you’re a masochist, the embarrassment is in some ways a plus. So masochistic AGPs might become more likely to join AGP communities. These factors are of course speculative, but the point is, you’re making a lot of assumptions when examining AGP communities.

A direct test of the AGP/masochism relationship

The trouble with the general factor of paraphilia can be solved by looking into not just masochism, but also other paraphilias, and examining whether masochism correlates with AGP above and beyond what one would expect paraphilias to correlate generally. The other biases mentioned are essentially biases due to selection, proxies, stereotypes, and such; they can be straightforwardly solved by changing the recruitment methods. Namely, instead of recruiting an AGP sample and a non-AGP sample and comparing them for masochism, one recruits a single sample that contains both AGPs and non-AGPs and examines them.

This is how we collected the data described in the blog post Controlling for the general factor of paraphilia. We selected some items assessing not-inherently-masochistic autogynephilia (using some of the top fantasies mentioned in the qualitative survey), masochism, masochistic autogynephilia (mainly forced feminization), and a variety of paraphilias for controlling for the GFP. We then posted this on /r/SampleSize using the title “Male Sexuality Survey”, and got a dataset with a large number of responses. I’ve analyzed this dataset in a bunch of ways, and they all tended to yield no real connection between autogynephilia and masochism. The blog post linked before attempts to examine things at the level of individual items, but I’ve also gotten similar results with more abstract models.

Mysterious results

However… there is something fishy about the data. Consider this matrix from the analysis:

Correlations between sexual interest items after subtracting off the GFP. See Controlling for the general factor of paraphilia for details on how this was computed.

Notice how the correlation between autogynephilia and masochism is negative, -0.09. This is a pretty small effect, and so one probably shouldn’t make too much of it on its own, especially since the method gets kind of fiddly. (Some of my other analyses got a statistically insignificant positive effect, 0.05.) But it’s still odd. And before controlling for the general factor, the correlation between autogynephilia and masochism was just 0.1; this too is odd, as usually I get correlations between them of 0.2.

This isn’t the only odd part of the data. Consider for instance the distribution of answers to the autogynephilia item:

Distribution of responses by sexual orientation. Attraction to men/women was dichotomized based on rating the given gender with 2+ on a 0-4 scale, so some of the monosexual men were monoflexible rather than strictly monosexual.

This is an incredibly high affirmative response rate compared to the general population, which appears to be more like 3%-15% (see e.g. Lawrence’s review for some general population estimates). Most of this effect is due to reddit being unusually paraphilic; in my experience, reddit and other eccentric internet samples often end up with autogynephilia rates around 40%-50%. But the affirmative responses here are far higher than that, so it seems that something more is going on.

A final piece of fishiness comes in the survey comments. Many people responded that they found the questions weird, overly focused on masochism or on imagining being a woman. It is not very surprising that they felt that way, considering that about half the questions were focused on these themes or on attraction to trans women and crossdressers.

Berkson’s paradox

To understand the problems with this fishiness, let’s switch to an entirely different set of questions: Why are handsome men jerks? Why don’t standardized test scores predict university performance well? Why are movies based on good books usually bad? Why are smart students less athletic? Why do taller NBA players not perform better at basketball?

Stolen slide illustrating Berkson’s paradox. By selecting a subset of the population, you introduce a negative correlation between the variables you select on.

A large share of the questions in the survey were based on autogynephilia and/or masochism. Quite plausibly, men who were neither autogynephilic nor masochistic found the survey to be strangely obsessed with these topics, and therefore chose to leave. If this happened, we’d expect to get a negative correlation between autogynephilia and masochism, at least after controlling for the GFP. If this happened sufficiently strongly, we might even have masked a positive correlation between AGP and masochism.
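A toy simulation of this dropout story, with entirely made-up numbers: the two interests are independent in the population, each has a 30% base rate, and respondents with neither interest abandon the survey 70% of the time. None of these rates come from the actual survey.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000

# Toy assumption: AGP and masochism are independent in the full population.
agp = rng.random(n) < 0.3
maso = rng.random(n) < 0.3

# Hypothetical dropout rule: people with neither interest quit 70% of the time.
neither = ~agp & ~maso
stays = ~(neither & (rng.random(n) < 0.7))

r_population = np.corrcoef(agp.astype(float), maso.astype(float))[0, 1]
r_completers = np.corrcoef(agp[stays].astype(float), maso[stays].astype(float))[0, 1]

print(f"correlation in the population:   {r_population:+.3f}")
print(f"correlation among the completers: {r_completers:+.3f}")
```

Even though the traits are unrelated in this simulated population, the completers show a clearly negative correlation, purely because the "neither" cell has been thinned out.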

New data, new analysis

I informed the creator of the /r/AskAGP subreddit about this problem, and we set about designing a survey that had few AGP/masochistic items, and plenty of paraphilic and normophilic items, to get rid of these selection effects. We used the same method for collecting the data as before, posting it on /r/SampleSize. You can compare the correlation matrices here to see the overall survey content and results structure.

I proceeded as in my Controlling for the general factor of paraphilia post, and identified a varied set of paraphilias to use for estimating the GFP:

Correlation matrix for paraphilias in the second survey.

I can fit a general factor model to this. The general factor model estimates, for each paraphilia, how much it “generally tends to correlate” with other paraphilias. Using the general factor model, I can then decompose the correlation matrix into the part that is due to the general factor, and the differences in predictions from the general factor:

Decomposition of the correlation matrix.
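A minimal sketch of what such a decomposition looks like, using four hypothetical items with made-up loadings (not the survey's items or estimates). The general-factor part is the outer product of the loadings, and the residual is whatever is left of the observed correlations.

```python
import numpy as np

# Hypothetical loadings for four paraphilia items (illustrative values only).
loadings = np.array([0.7, 0.6, 0.5, 0.4])

# Part of each correlation implied by the general factor alone.
gfp_part = np.outer(loadings, loadings)

# Hypothetical observed correlations: the general factor plus one extra
# association between items 0 and 1 that the general factor does not explain.
corrs = gfp_part.copy()
np.fill_diagonal(corrs, 1.0)
corrs[0, 1] += 0.2
corrs[1, 0] += 0.2

# Residual correlations after subtracting the general factor.
residual = corrs - gfp_part
off_diagonal = residual - np.diag(np.diag(residual))
print(np.round(off_diagonal, 2))
```

In this constructed example the off-diagonal residuals are zero everywhere except between items 0 and 1, where the extra 0.2 shows up. With real data the loadings are unknown and have to be estimated, so the residuals will not be this clean.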

It is the latter residual correlation matrix that is interesting. It tells us the correlations between paraphilias beyond the general pattern of them all being correlated. To interpret it, let’s zoom in and look at the labels:

Results from subtracting off the general factor of paraphilia.

There was no correlation between the autogynephilia item (“Imagining being a woman and masturbating by rubbing your clitoris”) and the masochism item (“Having your partner call you slurs or insults”). Autogynephilia did seem to have a negative residual correlation with some items related to sexual dominance; this suggests that autogynephiles might be less sexually dominant after taking general paraphilias into account. One might naively take this to mean that there is a connection between autogynephilia and submissiveness/masochism, but this is not so; sexual dominance is not the opposite of sexual submission, but instead often positively correlated.

The autogynephilia item used here is a kind of eccentric item optimized to test a hypothesis I had. This hypothesis has since run into some empirical problems and now I have some theoretical concerns about it. Therefore, I thought I should test the robustness by replacing it with other more-standard autogynephilia items, namely “Imagining being a woman and having sex with another person” or “Wearing sexy panties and a bra”:

Robustness check; I tried experimenting with the full set of items, still controlling for the GFP in similar ways to before. (There was one difference to the way I controlled here in this robustness check; see the end of the post for technical details.)

As you can see, it remains robust to choice of item. Items intended to examine masochistic feminization and emasculation did correlate with masochism, but items intended to examine purer autogynephilia did not, after controlling for the general factor of paraphilia. In fact, I was a bit surprised here; I had thought that transvestism would correlate. It might be good to perform more research on this.

But overall, my conclusion is, while the initial investigation had its problems, it still appears to me that autogynephilia is not particularly correlated with masochism/submission. Its apparent negative residual correlation with dominance might perhaps explain perceptions that autogynephilia is correlated with masochism/submission, because autogynephilia is associated with non-dominant paraphilias.

Now for some technical details. The method I used to control for the general factor of paraphilia works by minimizing the size of the residual correlations; see the linked blog post for details. However, this becomes a problem when I use many items with similar content at the same time, such as in my final image that showed all the AGP/MEF items, because then instead of controlling for the GFP, it might end up controlling for the specific content that those items share. To address this problem, in the final image I avoided minimizing the correlations between the AGP/masochism/MEF items, and instead just minimized their correlations with the rest of the items that were selected.

More formally, here is the code for the loss function:

import numpy as np

# data is an n x p matrix containing n responses on p variables
# corrs is the p x p correlation matrix for the paraphilia data
corrs = np.corrcoef(data.T)

# we are concerned that some of the beginning items have excessive content overlap,
# and that their shared variance should therefore not be used to estimate the GFP.
# to avoid this, we skip some of the items. this variable contains the
# number of items to skip; in this case 3 AGP + 3 masochism + 3 MEF = 9 items
head = 9

def loss_fn(loadings):
    # loadings is assumed here to be a p x 1 column vector, so that the product
    # below is the p x p matrix containing the correlation due to the general factor
    gfp = loadings @ loadings.T
    # 'residual' matrix after controlling for the general factor
    residual = corrs - gfp
    # we want to minimize the off-diagonal elements of the residual matrix, i.e. this:
    err = (1 - np.eye(corrs.shape[0])) * np.abs(residual)
    # I use an L1 loss to favor sparsity, but I also add a bit of L2 to improve convergence
    err = err + 0.01 * err**2
    # we drop the first few rows to avoid the previously mentioned problem of content overlap;
    # the skipped items still count through their correlations with the remaining items
    return np.sum(err[head:, :])

I optimize this function to get the general factor loadings.
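The post does not say which optimizer was used, so the following is only a sketch of one way such a loss could be minimized: plain subgradient descent with a shrinking step size, run on a synthetic correlation matrix. The loadings, the extra 0.3 correlation, the step-size schedule, and the helper name `loss_and_grad` are all assumptions made for illustration, and the loadings are stored as a flat vector here (hence `np.outer`).

```python
import numpy as np

# Synthetic one-factor correlation matrix (hypothetical loadings, not survey data).
true_loadings = np.array([0.8, 0.7, 0.6, 0.5, 0.4, 0.3])
corrs = np.outer(true_loadings, true_loadings)
np.fill_diagonal(corrs, 1.0)
# Give the first two items extra shared content beyond the general factor.
corrs[0, 1] = corrs[1, 0] = corrs[0, 1] + 0.3
head = 2  # skip the mutual associations of the first two items

p = corrs.shape[0]
mask = 1.0 - np.eye(p)
mask[:head, :] = 0.0  # same effect as summing only err[head:, :]

def loss_and_grad(loadings):
    """L1 + 0.01 * L2 loss on the masked residuals, and a subgradient of it."""
    residual = corrs - np.outer(loadings, loadings)
    loss = np.sum(mask * (np.abs(residual) + 0.01 * residual**2))
    # Subgradient of the loss with respect to each residual entry.
    g = mask * (np.sign(residual) + 0.02 * residual)
    # Chain rule through residual = corrs - outer(loadings, loadings).
    grad = -(g + g.T) @ loadings
    return loss, grad

loadings = np.full(p, 0.5)
initial_loss, _ = loss_and_grad(loadings)
best_loss, best_loadings = initial_loss, loadings.copy()
for step in range(1, 5001):
    loss, grad = loss_and_grad(loadings)
    if loss < best_loss:
        best_loss, best_loadings = loss, loadings.copy()
    # Shrinking steps, since an L1 loss does not settle with a fixed step size.
    loadings = loadings - (0.01 / np.sqrt(step)) * grad

residual = corrs - np.outer(best_loadings, best_loadings)
print("initial loss:", round(float(initial_loss), 3))
print("best loss:   ", round(float(best_loss), 3))
print("residual between the two skipped items:", round(float(residual[0, 1]), 3))
```

Because the mutual correlation of the two skipped items is excluded from the loss, their loadings are estimated only from how they correlate with the other items, and their extra shared content is left in the residual instead of being absorbed into the general factor. A library optimizer would likely be a better choice in practice; the loop above is only meant to make the procedure concrete.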

Autogynephilia is vaguely defined

This is a point that I find myself repeating, so I thought it would be a good idea to write it up in a blog post. A kind of annoying problem that I run into when reasoning and communicating about autogynephilia is that, well, it’s not very well defined.

As everyone reading this likely knows, autogynephilia originates in the study of transsexuality. Trans women were observed to have various peculiar sexual behaviors and fantasies, with the most common observation being transvestic fetishism. However, Blanchard noticed that usually, it didn’t seem like the point of it was the clothes, but instead that it was more about being a woman, in that the fantasies often included imagining having female anatomy and such. Thus was born the idea of autogynephilia; a sexual interest in being a woman.

The problem I have is, autogynephilia is always studied under weird conditions. For instance, suppose you study some cultural group, such as men from a crossdresser club, or self-identified autogynephilic trans women. In that case, you’re not just studying autogynephilia; you’re also studying the cultural group, with all of the peculiarities that come with it. It seems to me that there is very little self-awareness in Blanchardian circles that this is done. I think this is a big contributor to the problems mentioned in this blog post, but it’s probably not the sole contributor.

The boundary of autogynephilia is vaguely set

Anyway, there are some phenomena that get lumped under autogynephilia, but what defines the extent of the term? It clearly must include transvestic fetishism and anatomic autogynephilia, since this is historically what it derives from and since these are still to this day considered to be core signs of autogynephilia.

But consider my observation that the most common fantasies among autogynephiles appear to be of engaging in fairly standard sexual scenarios as women; should this be considered autogynephilic? Presumably not necessarily, but presumably yes under some conditions, right? Maybe if the focus is on your own female body during those fantasies? Maybe if you have them as a male who is not intending to transition? I would say yes to both of these.

Or let’s go back to the question of anatomic autogynephilia. What exactly is the appeal here? Being more feminized, being a woman, being attractive according to your own preferences, something else? These are pretty similar, but they get subtly different in edge-cases. But these edge-cases aren’t necessarily super obscure; while “being more feminized” and “being a woman” might have a lot of overlap among cis men, it may have less overlap among some trans women, who live and look like cis women. For them, is autogynephilia about being hyperfeminized, or about their current lived body, or about something else? There’s a lack of well-defined sufficient and necessary conditions here.

The phenomenon of autogynephilia is vaguely characterized

I think autogynephilic fantasies in cis men are pretty well-defined. While I’m kind of critical about traditional descriptions of them (like the five-type anatomic/transvestic/interpersonal/behavioral/physiological distinction) that focus on more extreme cases in more obscure communities, if you combine those descriptions with the qualitative survey I did, you get a pretty broad but also practically grounded idea of what autogynephilia can involve there.

However, what about trans women? Trans women often report changes in the focus of their sexuality with transition, but it’s not very well documented what goes on there. Some trans women are open about autogynephilia, but it seems logical to conclude that those who are less in denial about it also have more extreme forms that are harder to deny. Plus, the autogynephilia they are open about may intersect with other sexual interests. Despite being to a large degree a theory about trans women’s sexuality, we don’t know how autogynephilia presents in trans women.

Ideally, we would have an in-depth, qualitative study of post-transition AGP trans women’s sexuality, without introducing selection bias towards those who are more openly AGP. This has not been done; it’s probably pretty hard to do perfectly, but it could probably be better approximated than has been done currently.

Or let’s go back to the cis men. Sure, their autogynephilic fantasies might be reasonably well-understood. But it’s often proposed that autogynephilia involves other elements, some of which may be more attachment-based or emotional. While these have been anecdotally documented by various people working with crossdressers, these are again going to be a sample selected for being more extreme in this dimension. These are things that need to be understood better.

The theory of autogynephilia is vaguely understood

There’s really a whole bunch of different ways that autogynephilia gets characterized across contexts. Some of these are subtly different, while others are very different. Let’s try to list some conceptions:

  • Inverted gynephilia: Gynephiles are attracted to women in a wide range of ways, from seeing them to having sex with them. For autogynephiles, this attraction is inverted, such that whatever they as a gynephile would find attractive for a woman, is something they as an autogynephile would find attractive as a woman.
  • Symbolic autogynephilia: There are various things that people associate with womanhood, such as feminine clothes, female bodies, exaggeratedly feminine or female-only things, etc. Autogynephiles are attracted to associating themselves with the things that they associate with womanhood.
  • Autosexual gynephilia: Autogynephiles are attracted to themselves as women. They have some image of themselves as women in their mind, or they actualize this image in their body or presentation, and they are sexually attracted to this in much the same way as anyone might be sexually attracted to any person.
  • Mimicry autogynephilia: When autogynephiles form images of or encounter attractive women, their target ends up being like them. It’s not really about themselves as women, because if they were female, they would not be focusing on themselves, but instead be focusing on other women that they would continue to desire to be like.
  • Genderbending autogynephilia: Autogynephiles are interested in transforming from men to women. This is part of a more general interest in genderbending and androgyny, as can be observed from e.g. their attraction to transsexuals.

To me, inverted gynephilia seems to match the fantasies in the qualitative survey pretty well. And it also seems like the simplest and most plausible model. But there are some aspects and presentations of autogynephilia that it doesn’t account well for. Another possibility is that autogynephilia isn’t really one thing, but is instead a mixture of all of the above.

Blanchardians don’t really seem to have converged on an option here. A lot of these could go under the label of “erotic target location error”, but probably inverted gynephilia is the most straightforward translation of this concept. When Lawrence insists that it is the “mere thought of being a woman” that is interesting to AGPs, that seems to more closely resemble symbolic autogynephilia.

I think pinning this down would actually be pretty important to making the theory more describable to trans women. While it’s easy enough to describe how it functions in cis men without resorting to theory, just by describing the observed phenomena, this might not straightforwardly extrapolate to trans women, as e.g. all of the previous theories give relatively similar results when applied to cis men, but wildly different results when applied to trans women.

To some degree, if we just studied the phenomena in trans women, we could ignore the theory and just present trans women with the resulting phenomena. But this feels a bit circular; ideally we would be able to deduce, from a theory of what autogynephilia is, how it presents in trans women.

How to fix this

To some degree, this is a question of definition. As in, probably setting the boundary will always be a bit arbitrary, and there’s no absolute right and wrong way to do it. However, setting useful boundaries will be easier if we have some more data, so that we have a good idea of what we are including and excluding. And data can also help guide theory, which can help guide definitions.

In various places, such as the book Men Trapped in Men’s Bodies, there’s already some documentation of how the most striking examples of autogynephilia in trans women might work. But this book was specifically created by asking trans women to submit their own stories and self-descriptions; this runs into the problem that trans women who have subtler presentations of autogynephilia might be more likely to deny to themselves that they are really autogynephilic, and so fail to get included in the book. But this means that when autogynephilia is understood in terms of what is included in the book, the trans women with subtler presentations of AGP may find it hard to relate. So since there’s already plenty of striking examples, the goal should more be to spend time collecting standard, more-representative examples of autogynephilia in trans women.

And in cis men too – I’ve got the qualitative survey, but it’s really kind of limited. Most of the respondents only presented one autogynephilic fantasy, but presumably they have more. Ideally we would get a more concrete idea of the general range of their sexuality, by exploring their interest in a broader variety of fantasies. This could tell us something about the degree to which different fantasies “go together”, and maybe get us closer to “necessary and sufficient” conditions for AGP.

Retracting the response to Contrapoints

A while ago, I published my Response to Contrapoints on Autogynephilia. Broadly speaking, I argued that Contrapoints did not address the Blanchardian presentation of autogynephilia, but instead knocked down a strawman argument. I broadly stand by this aspect of the critique; however, as a core part of my response, I made a deeply flawed argument that makes me unable to stand by the blog post. Let’s talk about it here.

The core disagreement

During transition, the primary feeling trans women seem to feel about it doesn’t usually seem to be horny. There might be exceptions, but it generally seems to me to be more something like “I hate being male, and being a woman is the most important thing to me”, or something like that. A core part of the disagreement is that trans women think this can’t be explained by something like autogynephilia, which is an aspect of sexuality.1

The standard Blanchardian response to this is the romance hypothesis: Paraphilias are an aspect of your sexuality, yes, but humans are built to form emotional attachments around their sexual interests. The romance hypothesis asserts that it would make sense for whatever systems of attachment that exist to be the source of the trans women’s strong non-horny feelings about their sex.

I agree with this alternative proposal, and I continue to think it was wrong of Contrapoints to just casually dismiss it. She treated it as absurd, and as “moving the goalposts”, when really it’s a perfectly logical part of the theory that has been there from early on. However, before I defended the romance hypothesis, I brought up an alternative proposal, and this alternative proposal is a big problem.

Could autogynephilia contribute to gender issues in other ways?

My proposal was that maybe neither the romance hypothesis nor horny were the main contributors, but that instead there could be other possibilities:

But that ignores the point: isn’t the romance hypothesis too weird? Maybe. But even if that is true, there are lots of other ways we could imagine that autogynephilia could contribute to gender dysphoria:

1. Many trans women seem to feel that men and manhood are just objectively terrible in some way. At 35:30, Contrapoints talks about “the evil magic of testosterone”, and at 19:49 she talks about how she thought of PiV sex as “getting the poison out”. Perhaps autogynephilia predisposes one to developing negative psychological complexes about men or manhood.

2. Autogynephilia leads to repeatedly getting psychological reinforcement from the thought of oneself as a woman. Thus, fans of conditioning-based theories might suggest that autogynephilic stimulation carves the desire to be female into the brain slowly over time.

3. Especially today, many with autogynephilia will have it suggested that they are “eggs” (slang for self-closeted MtFs) and should transition. Perhaps obsessing over this question (especially when committing more and more to a transgender identity) leads to gender dysphoria.

Now, this should not be taken as an endorsement of any of the above possibilities. I consider it to be something of an open question how exactly AGP leads to gender dysphoria, and it’s something I’m trying to study. However, the point is that the options aren’t just romance hypothesis, purely sexual, or no typology. This is the exact sort of false dichotomy (trichotomy?) that Contrapoints herself criticizes at 14:20.

I do qualify that none of the specific possibilities that I list are particularly likely, but I bring it up to imply that the sum over all of the conceivable and inconceivable possibilities is likely. Is that justified?

Sort of. There’s no absolute law anywhere which states that autogynephilia cannot have all sorts of random effects that contribute to gender issues. Maybe it’s true that there are all sorts of other contributors; but why believe it? Couldn’t they go either way, reducing the effect of autogynephilia rather than increasing it? Couldn’t they just as well be confounds, correlating with autogynephilia due to some shared factor, making the correlation between autogynephilia and gender issues an invalid estimate of the effect of autogynephilia on gender issues?2

Assuming that autogynephilia could cause gender issues through arbitrary mechanisms makes sense only if we strongly believe that autogynephilia causes gender issues without knowing what the mechanism through which it causes them is. In that case, the knowledge that autogynephilia causes gender issues is reason to suspect that any given conceivable mechanism contributes.

But how would we know that autogynephilia causes gender issues without knowing the mechanism in the first place? This is basically just rationalization, making up a story to fit it, exactly the sort of shifting the goalposts and ridiculous theories that Contrapoints accused the romance hypothesis of being. Sure, it’s conceivable that it’s right, but it’s also conceivable that the devil planted the evidence for autogynephilia in order to deceive us; that doesn’t make it likely.

How do we know that autogynephilia contributes to gender issues? Because generally sexuality gives you preferences, not the other way around. You don’t find food sexy when you are hungry, because that’s not how sexuality works; you do wish to be a woman if you are into being a woman. This is fully compatible with the romance hypothesis, but it isn’t compatible with just arbitrary mechanisms, such as autogynephiles coming to believe that manhood is bad, or autogynephilia leading to obsession over whether one is trans. Those are just ad-hoc theories made up to strengthen the AGP → GD case, not genuinely supported by the evidence.3

Further defense of the romance hypothesis

So Blanchardianism relies a lot on the romance hypothesis. But isn’t it too wacky? After all, it’s not like we hear people being all emotional about their kinks in general?

This is wrong, as far as I can tell:

Poll I did on /r/fetish about emotional elements to kinks. Notice that most did not report the kinks as purely being about horny. I have edited the ordering of the response options in the image to show in order from most sexual to most emotional, rather than the most-endorsed order that StrawPoll by default uses.

In this poll, I didn’t ask for details, so I don’t have any specific stories for them. This is something that should be looked into in more detail in the future. Anecdotally there seem to be lots of stories floating around about it; e.g. submissives who feel “comfortable”, “destressed” when controlled; e.g. women with pregnancy kinks who “feel like a goddess”.

Would people really live their life in a kink? Yes. Polyamorous people radically change the standard family structure, and there appears to be something sexual behind that too. Part of the problem is that people make it out to be this big deal, this bizarre perversion. Some straight people get married, some autogynephiles transition, what’s the big deal?

The overall phenomenon should be studied more, but “this has not been studied enough” is not the same as “this doesn’t exist”; if we had no reason to think it would exist, then it would be silly to bring up, but it makes perfect sense in the context of how sexuality usually works that paraphilic sexual interests would also lead to all sorts of further attachments.


In my discord servers, I discussed whether to write this retraction ahead of time, and people had some responses. I thought it would be relevant to address concerns along those lines here:

Q. I’m autogynephilic, and I genuinely do have a sort of antipathy towards male things that has grown over time. It seems like this would contribute to gender issues. Aren’t you rejecting this possibility too quickly?
A. It’s not that these can’t exist; it’s that if they do exist, they raise questions about the validity of the AGP model. How do you know that this antipathy towards male things arises from autogynephilia, rather than just having some unknown common cause? Confounding should be the default assumption. If they do have a common cause, then the correlation between autogynephilia and gender dysphoria overestimates the effect of autogynephilia on gender dysphoria, making the typology invalid.

Q. Isn’t this overly dramatic? You could just do an edit of your original article to remove the bad argument, there’s no need to retract it.
A. Really it’s only a partial retraction. I’m leaving up the original blog post, with an explanation of what exactly I do or do not stand by. However, I think this point is rather central to the disagreement, and a core part of my original argument, so I think this has left open a big flaw in the blog post that justifies serious action.

Q. You had a lot of qualifiers though, saying that it was just speculation. Is it really that big of a flaw when you pointed out the inherent uncertainty?
A. The qualifiers were on the individual hypotheses I listed. The implication was that there were a large number of speculative hypotheses, where each might be unlikely, but which in aggregate were likely. I fully relied on the “in aggregate likely” aspect of it, saying “However, the point is that the options aren’t just romance hypothesis, purely sexual, or no typology.” As far as I can tell, that’s just wrong; if it’s not purely horny, and the romance hypothesis is completely invalid, then the whole AGP argument is dead, with the skeptics having won. My claim was just wrong, despite my qualifiers.

Q. Isn’t this going to make it easier for people to dismiss your blog post, when they can just point out that it’s retracted?
A. If people act in bad faith, they will find some reason to dismiss it regardless. If people act in good faith, they will notice the partial retraction, as well as the explanation, and take that into account when evaluating the post. Pointing out the flaw in big letters probably reduces the rhetorical impact compared to just silently fixing the post, but it is absolutely inappropriate to optimize for rhetorical effectiveness over accurately informing the reader, especially if you’ve already made a huge error in reasoning. Let’s clean up problems on our own domain (there are many) before we start worrying about convincing skeptics.

1. A minor footnote: one should actually beware of “overexplaining” here. Most autogynephiles do not transition and do not feel gender dysphoria; so there must be ???something??? beyond autogynephilia that distinguishes trans women from autogynephilic cis men. This might be something entirely independent of autogynephilia; it might be some nuance of autogynephilia (degree of autogynephilia surely contributes; maybe kind of autogynephilia also does?); it’s most likely a combination of factors. If someone claims to explain it entirely through autogynephilia, they are a quack.

2. One of the possibilities I brought up was conditioning. I remain skeptical of conditioning-based theories, but at least conditioning is justified in the sense that it genuinely does provide a good story for why it would mediate the effect of autogynephilia on gender dysphoria. This might seem to justify it more, except what’s really going on is that conditioning is not wholly separate from the romance hypothesis; conditioning would be a mechanism through which the romance hypothesis might operate. So really we should only consider the hypotheses other than conditioning, and consider my mentioning conditioning on the list as another mistake.

3. So how did I come up with this? Was I just dishonest? I think my error was that there is some reason to think that autogynephilia contributes to gender issues without understanding the mechanism, namely the typology. You apparently observe a negative correlation between autogynephilia and femininity in trans women, but you do not observe this in the general population. This is the pattern of correlations we would expect under a collider, that is, if we had autogynephilia → transgender ← femininity. This is a perfectly valid argument that autogynephilia causes gender issues, and it works independently of any mechanism; the issue is just that the argument is not very strong, and it can easily turn out to be wrong if there is something more complicated going on. In fact, its lack of strength is directly connected to its mechanism-independence; without referring to the mechanism, any argument will just be an indirect proxy argument, and therefore weak.
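
To make the collider pattern in footnote 3 concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made up for illustration, not data: two standardized traits (`agp` and `fem`) that are independent in the simulated general population, and a toy selection rule where someone counts as "selected" when the sum of the two traits plus some noise exceeds an arbitrary `threshold`. The function names, the threshold, and the additive rule are all hypothetical.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=100_000, threshold=2.5, seed=0):
    """Toy collider model: agp -> selected <- fem (all numbers are assumptions)."""
    rng = random.Random(seed)
    # Two hypothetical traits, drawn independently, so they are
    # uncorrelated in the simulated general population.
    agp = [rng.gauss(0, 1) for _ in range(n)]
    fem = [rng.gauss(0, 1) for _ in range(n)]
    # Assumed selection rule: both traits push towards selection,
    # plus independent noise standing in for everything else.
    selected = [
        (a, f)
        for a, f in zip(agp, fem)
        if a + f + rng.gauss(0, 1) > threshold
    ]
    r_population = pearson(agp, fem)
    r_selected = pearson([a for a, _ in selected], [f for _, f in selected])
    return r_population, r_selected, len(selected)


if __name__ == "__main__":
    r_pop, r_sel, k = simulate()
    print(f"general population: r = {r_pop:+.3f}")
    print(f"selected group (n = {k}): r = {r_sel:+.3f}")
```

Under these assumptions the correlation is near zero in the full population and clearly negative within the selected group, which is the pattern the typology argument points to. The sketch only shows that a collider can produce that pattern; it says nothing about whether something more complicated could produce the same pattern, which is exactly why the argument is weak.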