Autism as a sparsity prior

[Epistemic status: Speculative. Testing these ideas quantitatively seems hard.]
[Writing status: I have no idea whether this is even halfway comprehensible. I’m so deeply stuck in decision theory and causal inference and psychiatry that I’m not sure what is obvious to others, and what is in need of further explanation.]

I’m autistic, but I’m also kind of confused about what autism even is. Apparently in some contexts I come off as being very autistic, while in other contexts I don’t come off as autistic at all. However, I’ve got an idea that helps me make sense of some things:

Autism is supposedly “characterized by difficulties with social interaction and communication, and by restricted and repetitive behavior”, and tends to involve rigidity and obsession. A common approach seems to place it as one end of a spectrum of mechanistic vs mentalistic thinking, with schizophrenia at the other end; this approach is summarized by Scott Alexander in his blog post on the Diametrical Model Of Autism And Schizophrenia.

I’m building on that model, but I have some thoughts or concerns about it. It relies on the idea of “mechanistic” thinking (or in related models, “systematizing”, “if-then-else” thinking). But what is that? And why would it “trade off” against mentalistic thinking? (I think I have a reasonably good understanding of what mentalistic thinking is, namely as an agency prior. More on that later.) This is where my idea comes in.

Mechanistic thinking as a sparsity prior…

… would probably have been a more accurate title to the blog post, but that wouldn’t make the connection to autism as clear. To understand what this means, we need to consider what a prior is more generally, and what a sparsity prior is specifically.

“Prior” is short-hand for “prior distribution”; it’s a mathematical object used in Bayesian statistics which describes the theory in which one interprets the data one encounters. It turns out, you can’t interpret data without theory, no matter what you do; even something as simple as extrapolating from the past into the future relies on the theory that the past will resemble the future. The prior formalizes what exactly the assumptions you make are.

A sparsity prior is, in a sense, a formalization of the law of parsimony. Sparsity priors assert that it is more likely for a system to work in a simple way than in a complex way. The overwhelming majority of ways a system can work have “everything interacting with everything” (well, rather, most things interacting with most things); sparsity priors assert that this is implausible because there are so many interactions, and so rule out the overwhelming majority of possible explanations.
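To put a rough number on "the overwhelming majority", here is a minimal counting sketch. It assumes a deliberately simplified picture that is not part of the argument above: a "way a system can work" is just a choice of which pairs of variables interact, and a pattern counts as sparse when it has at most `max_links` interactions. The function name and the threshold are illustrative choices.

```python
from math import comb


def fraction_sparse(n_vars, max_links):
    """Fraction of pairwise-interaction patterns among n_vars variables
    that contain at most max_links interactions."""
    n_pairs = comb(n_vars, 2)      # number of possible pairwise interactions
    total = 2 ** n_pairs           # each pair either interacts or it does not
    sparse = sum(comb(n_pairs, k) for k in range(max_links + 1))
    return sparse / total


# Allow roughly one interaction per variable and watch the sparse share shrink.
for n in (4, 6, 8, 10):
    print(n, fraction_sparse(n, max_links=n))
```

Under these assumptions the sparse patterns become a vanishing share of all patterns as the number of variables grows, so a prior that concentrates on them discards almost everything else.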

To see how this relates to mechanistic thinking, let’s consider some examples:

  • “If-then-else” thinking involves picking some key factor (the “if”) and making decisions on the basis of this. This makes sense only if there is some key factor that matters much more than anything else, i.e. if the system is sparse.
  • We tend to think of mechanistic systems as inanimate. What does that mean? One element of inanimacy is that they are not proactive; they don’t go out and modify the world in all sorts of ways. They just sit there. This is sparsity; they don’t have much influence on anything else.
  • Another element of inanimacy is that they are not reactive. They only have a limited, fixed set of ways to interact. Yes, a rock can be thrown, and it can fall to the ground, but it can’t really do much beyond that. That is, inanimate systems cannot be affected in subtle and complicated ways, only in simple and well-defined ones, which is a sort of sparsity.
  • Mechanistic thinking involves a sort of decontextualization; it tends to assume that things will work the same each time. This partly relies on sparsity, in that it assumes there aren’t a number of additional moderating variables that change the system’s behavior. (It also relies on symmetry, in the sense that it requires duplication of the system across different contexts, such as over time or over space. Symmetry could likely also be considered a prior, but it is not the same prior as a sparsity prior.)

Generally, the pattern is that a sparse prior is rigid. It assumes there is one simple explanation going on, and cannot deal well with exceptions. It fails badly in cases that are not sparse. One such case is when dealing with interpersonal interactions, and agency in general. Agents try to absorb as much information as possible from their surroundings (e.g. by looking around and seeing things), building a model of what is going on, and they try to massively modify the world to suit them (e.g. building cities, reproducing to have a massive population). These are very non-sparse dynamics, as they require massive amounts of causal exchange, both in and out, in order to regulate the world effectively. Ultimately, any prior only works as well as the environment corresponds to its assumptions, and agents don’t correspond great to a sparsity prior, making it malfunction badly on them.

Relationship to the diametrical model

I think that’s a reasonable starting point; it seems like a sparsity prior accounts for the rigidity involved in mechanistic thinking well. So that leads to the followup question of, if autistic people have a sparsity prior, do allistic people have a density prior? And the answer is no: Most possible models are dense. Therefore, assuming the world is dense does not meaningfully narrow down the possibilities enough to let you infer anything about how it works.

As I see it, for the diametrical model, the relevant alternative to a sparsity prior is an agency prior. Under an agency prior, you assume that the world contains individuals who have goals, beliefs, and who act according to these to influence the world.

To illustrate the contrast, suppose you come across a tree that is fallen over. You don’t know how it has fallen over; the main possibility you can think of is the wind, but it seems hard to imagine that the wind is strong enough to knock down a tree. Density priors, sparsity priors, and agency priors interpret this scenario in three different ways:

  • If you have a density prior, anything might explain it. Maybe an elaborate Rube Goldberg mechanism knocked it over. Maybe god did. Maybe it’s not knocked over but it’s just an optical illusion. Maybe it grew fallen down. How could you know? After all, anything is possible in this world.
  • If you have a sparsity prior, it’s pretty unlikely that it could be anything other than the wind. After all, that would require some new effect that you hadn’t considered to explain it; but that seems pretty unlikely. The world is simple and doesn’t have all sorts of complicated effects. It must have been a very strong wind, or a very weak tree, or some combination of the two.
  • If you have an agency prior, you still don’t know how it happened. But you do know one thing: Someone must have wanted it to happen. After all, trees aren’t usually fallen over; such a special scenario requires an explanation, and the obvious explanation is that someone did it because they wanted to.
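The three readings of the fallen tree can be mimicked with a toy Bayesian update. This is only a sketch: the three hypotheses, the likelihoods, and every prior weight below are made-up numbers chosen to illustrate the contrast, not estimates of anything. "Something exotic" stands in for the huge space of Rube Goldberg style explanations.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a finite set of hypotheses."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}


# How well each explanation predicts "a fallen tree" (made-up numbers).
likelihood = {"wind": 0.2, "someone felled it": 0.9, "something exotic": 0.5}

# Hypothetical prior weights for each style of thinking.
priors = {
    "density": {"wind": 0.01, "someone felled it": 0.01, "something exotic": 0.98},
    "sparsity": {"wind": 0.98, "someone felled it": 0.01, "something exotic": 0.01},
    "agency": {"wind": 0.05, "someone felled it": 0.90, "something exotic": 0.05},
}

for name, prior in priors.items():
    post = posterior(prior, likelihood)
    best = max(post, key=post.get)
    print(name, "->", best, round(post[best], 2))
```

The evidence is identical in all three runs; only the prior differs, and that alone decides which explanation comes out on top.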

A pure agency prior seems reminiscent of schizophrenia, especially explaining the conspiratorial aspects, as well as the assumption that there is a deeper meaning behind every little thing one encounters. One could imagine that normal people have a balance between agency and sparsity priors, while autistic people have a skew towards applying sparsity priors, and schizotypal people have a skew towards applying agency priors. Seeing autism as a sparsity prior is thus not an explanation of how autism differs from allism (that would be via lack of agency prior), but instead an explanation of autism on its own terms, without reference to social cognition.

Deriving autism from sparsity priors

I will now go through some symptoms I see online characterizing autism, and discuss how they relate to sparsity priors.

Autistic people are said to do worse at social problems. To an extent, this probably follows simply from being different; anyone who is sufficiently different is going to have trouble dealing with others. But if autism is a dysfunction in agency priors that leads to application of sparsity priors in cases where they are ineffective, then that too should lead to social difficulties.

Humans are agentic. One characteristic of agency is trying to do everything possible to optimize one’s goals; in communication, this would involve trying to optimize every part of one’s message, from phrasing to body language. If this is interpreted from an agentic perspective, then one will ask “given the whole picture, what is this person trying to tell us?”; on the other hand, if it is interpreted from a sparse perspective, then one will focus on the clearest specific things, likely ending up overly literal, and ignoring context and subtle signals. More generally, it will not even occur to someone with a sparsity prior that there are subtle things that they are missing; after all, that is part of the sparsity prior.

This also goes the other way; if you have a sparsity prior, then you will assume that there are not many things that are relevant for how to optimize your message. You may end up blunt, focusing purely on the message, and not taking into account the side effects that sharing the message may have. Meanwhile, with an agency prior, you would to a greater extent take into account the social implications.

Autistic people are characterized by obsessions and restricted interests. This might also make sense in terms of a sparsity prior; if you determine that there is some factor that is important, then it makes sense to learn everything you can about that factor, as it likely accounts for a big fraction of everything that might ever be important about anything (by the assumption of sparsity, that there are not that many factors that influence things). On the other hand, if someone else is interested in something, then an agency prior would tend to infer that this other thing must also be important and worth looking into; and thus lead to broader interests.

There are some autistic characteristics where I don’t quite understand what they refer to; for instance ritualistic behavior. Certainly it seems like sparsity should imply some forms of ritualistic behavior; due to the assumptions that there are only a few key ways that things work, a sparsity assumption would imply that one could end up focusing excessively on some parameters when making decisions, seemingly “ritualistically” ignoring alternative possibilities. I can sometimes recognize this from my own decisions, where in retrospect I have ignored many factors in favor of a single clearer factor. So perhaps sparsity can explain autistic ritualistic behavior too.

One needs to be careful here, with the point mentioned in the previous paragraph. Allistic people also have a sparsity prior, because you need a prior to act. However, I think it comes down to the social element; if there is even the slightest social encouragement to do something in a different way, then an agency prior will assume that there is some good reason for that, and adjust. This will still lead to ritualistic behavior in cases where everyone socially agrees on it, but since everyone is used to this, it is not noticeable. (Unless you start doing anthropology, at which point you realize that humans are very ritualistic in general.)

This ends up important to take into account when one then starts analyzing other factors. For instance, consider sensory overload. It would make sense that if you assume that only a few key factors are important, you would avoid “noisy” (not just audibly but also visually and through other senses) places, due to being unable to figure out what those factors are when there is so much noise. But this shouldn’t be limited to autism, it should apply to allistic people too. I don’t know whether there is a social explanation here, like there is for ritualistic behavior.

Another characteristic with similar problems is that autistic people tend to be clumsy. I think clumsiness can easily be tied into sparsity. Per Moravec’s paradox, basic things like moving your body are much more difficult than humans usually realize, due to needing to constantly optimize your movements and keep track of every little thing. It would make sense that sparsity would tend to lead to “robotic” and clumsy movements in such a case – but since everybody, not just autistic people, needs a sparsity prior to make sense of everything, it’s hard to see how this explains why autistic people specifically are clumsy. And here I also have trouble coming up with a social explanation.

One thing I’m still not sure how to derive is stimming; making repetitive movements. I only have a relatively limited form of stimming, consisting of tapping my feet sometimes. I don’t see any way this fits into a sparsity prior. Maybe there’s a subtle thing related to neuroscience or gathering information or something, but for now I will just consider this to be unaccounted for in the theory, indicating that there is a flaw.

I’m also not sure how to derive comorbidities like ADHD.

Dimensions of sparsity prior

Autistic people aren’t entirely lacking an agency prior, and allistic people aren’t lacking a sparsity prior. This raises the question, in what sense exactly can autistic people be said to skew towards a sparsity prior? More generally, why do the priors trade off against each other?

Let’s start with the second question. It might seem mysterious that mechanistic and mentalistic thinking have a tradeoff. From the point of view of priors, the answer is that they don’t really have very much tradeoff. There are some limits to it, in the sense that you only have so much probability mass to give out, so eventually you do run into a tradeoff. However, it seems like under ordinary circumstances, one would quickly figure out which model is appropriate, and apply that.

The apparent tradeoff probably arises from the fact that you need some sort of model. “The density prior” isn’t useful, as it lacks predictions; so there is some relatively limited set of models, of which sparsity and agency are two of the most obvious ones. The things you don’t model mechanistically must be modelled in some other way, and this main other way would be mentally. Hence an apparent tradeoff.

When it comes to the question of in what sense autistic people skew towards sparsity, I think the core of it is just that the agency prior among autistic people works worse, so they use it less. I.e., autism is literally the same thing as cognitive difficulties with social things (low “emotional intelligence”, except plausibly emotional intelligence tests aren’t good at measuring the agency prior). If you struggle with modelling things in agentic ways, you will end up modelling them in mechanistic ways instead, regardless of how appropriate that is. But it’s worth noting that quality of prior isn’t the only “free parameter” to consider:

A sparsity prior generally has a free parameter describing the degree of sparsity it assumes. You might think of it as being akin to a dimensionality measure. If you have a bunch of dominoes standing in a line, they can affect their neighbors, such that one of them falling over can influence the entire line, transitively. Meanwhile, if the dominoes are more spread out, then you won’t get these sorts of chain reactions, but will instead see the effects dissipate. So on one end of the spectrum, a limiting case of a sparsity prior is the assumption that everything might affect everything, while on the other end of the spectrum, a limiting case is that everything is isolated and can’t affect anything else.

Finally, one free parameter is the “hyperparameter” that tells you how often a sparsity prior is appropriate to use, compared to any other prior you might have. This is essentially about whether your immediate instinct is to think of things in a mechanistic way, or in an agentic way (or in some other way, assuming there are other priors of interest). This should in principle only have a limited effect on things; one should quickly be able to recognize when a situation calls for something else, other than a sparsity prior. Possibly, this might explain why engineers can seem more autistic; their “default mode of operation” ends up being autism-like, but they can adjust if they get a bit of time.
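The claim that this default should only have a limited effect can be illustrated with a small odds calculation. It is a sketch under strong simplifying assumptions: there are only two candidate models (mechanistic and agentic), observations are independent, and each one is `likelihood_ratio` times more probable under the agentic model. The function and parameter names are hypothetical.

```python
def posterior_agency(prior_agency, likelihood_ratio, n_obs):
    """Posterior probability of the agentic model after n_obs independent
    observations, each likelihood_ratio times more probable under it."""
    prior_odds = prior_agency / (1.0 - prior_agency)
    odds = prior_odds * likelihood_ratio ** n_obs
    return odds / (1.0 + odds)


# Someone whose default is strongly mechanistic: only 10% initial weight on agency.
for n in range(5):
    print(n, round(posterior_agency(0.1, 3.0, n), 2))
```

With these illustrative numbers, a 10% starting weight on the agentic model reaches 50% after two observations and 90% after four, which is the sense in which a skewed default washes out if one gets a bit of time.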

I don’t know if all of these exist in the brain. (Heck, I don’t know if any one specific of them exist in the brain; all of this is speculation.) But they seem a lot more concrete to me than “if-then-else thinking”, “systematizing” or “mechanistic thinking”.

Relationship between autism and masculinity

One theory of autism that has come up is Simon Baron-Cohen’s “Extreme Male Brain” theory, which asserts that men think more “systemizingly” (basically mechanistically) and women think more empathetically, and that autism is due to a skew towards the male side of things. As support for that, it’s generally noticed that more males are diagnosed as autistic, that autistic people have a more male systemizing-empathizing profile, and so on.

In the past, I have had trouble with this theory. One element is that SBC’s measures don’t reaaallly seem to overlap all that much with autism. And autistic men don’t seem to be all that masculine. Really, this sort of thing has also made me skeptical about the diametrical model in the past, and about the validity of “autism” as a category in general. But now I’m writing a huge blog post on it, so clearly I’m going to have some new thoughts on this:

I think men are interested in things and women are interested in people, while autistic people are bad at people and schizotypal people are bad at things. But experience trades off against priors; getting more experience can to some degree develop skills that compensate for lack of ability. So if you’re autistic, but you are interested in people, then you are much more likely to learn to compensate, and that leads to female autists being less likely to be diagnosed. Unless they happen to have masculine interests, in which case they don’t develop their people skills. This seems to match what some people say about female autists “masking” their autism to make it less visible.

This model does introduce an ambiguity; do we define autism by the prior (possibly leading to equally many male and female autists?), or by its consequences? Probably the answer here should be that we shouldn’t get too hasty in applying speculative theoretical models, and that we should wait with using the prior as a definition until we have actually validated that this is an accurate theory of autism. Which it might very much not be.


For a while, autism hasn’t really made sense to me. This model makes it make a lot more sense to me, though I have no clue whether it’s right; I would encourage anyone to critique it. The model essentially just boils down to “autistic people are bad at social stuff” though, which is obvious enough; I guess the nonobvious claim of the model is about what is left in reasoning after removing the social stuff. And I don’t really think sparsity priors are the only thing left over; rather, I think they (or something like them) can be seen as a more formal way of understanding what “mechanistic thinking” is. But mechanistic vs mentalistic are not the only forms of thinking, and so there’s lots of things left over.

One big bottleneck for further development I see is a need to create something that measures it. This would probably be a form of intelligence test. I think current “emotional intelligence” tests don’t really focus much on ability to reason about agency vs sparseness per se, but instead on surface-level “content” vaguely associated with these things. Possibly problems akin to the “tree that has fallen over” situation that I described before might be relevant.

It might also be entertaining to ask whether other things can be mapped into this sort of “prior” viewpoint. For instance, there’s the concept of some person’s or place’s “vibe”; this seems non-sparse, but also not necessarily agentic. Certainly in some cases it might be agentic, but more generally it just seems similar to principal components regression. Roughly speaking, one can understand the assumption behind this as being a combination of sparsity with latent variables; that is, rather than assuming that everything is observable, one assumes that there are some important unobserved variables that influence many things; these are then estimated using the “vibe”.

I also can’t help but notice that I’m very interested in mechanistic-like theories of agency. Artificial intelligence, decision theory, psychology, and so on. I wonder if learning such theories to a sufficiently advanced degree can function as a sort of self-treatment of autism. 🤷

Controlling for the general factor of paraphilia

Almost all sexual interests are positively correlated. Going even further, almost all paraphilias are positively correlated, and typically more closely with each other than with normophilic sexual interests. To give an example, from a survey run by the creator of /r/AskAGP:

Correlation matrix from a select set of items from the survey. Each cell shows the degree of association between two sexual interests. For a visual intuition of what the quantities in the cells mean, consider using this correlation visualizer.

When trying to understand the structure of sexual interests, this presents a problem, because an important method of understanding it proceeds by looking at the pattern of correlations. But when everything correlates with everything, it is difficult to interpret this pattern.

The pervasive correlations could be interpreted as representing a “general factor of paraphilia”; that is, some underlying factor that contributes to all abnormal sexual interests. Since paraphilias and ordinary sexual interests also to a degree correlate with each other, one might interpret it as to some degree reflecting something that paraphilias and ordinary sexual interests have in common, such as libido. However, since paraphilias are more strongly correlated with each other than with ordinary sexual interests, it likely also represents something else other than libido. One possibility for this something else might be that there is some common error or set of errors in the development of one’s sexuality that can contribute to all abnormal sexual interests. Another possibility is that it represents “method factors”; i.e. maybe to a degree it is due to people who are open to admitting to one abnormal sexual interest also being more open to admitting to other abnormal sexual interests.

But regardless of the reason for the presence of this general factor of paraphilia, it would be nice to have some way to control for it. One way of controlling for it is to assume that each paraphilia is influenced to some specific degree by the general factor; this influence is called the factor loading of the paraphilia. We can then fit a statistical model that estimates the factor loadings of the paraphilias, and use this model to distinguish between the variance and correlations due to the general factor, versus the variance unrelated to the general factor:

Decomposition of the correlation matrix into aspects that come from the general factor, and aspects independent of the general factor. The mathematical details for how I computed this decomposition are available at the end of the post.

In order to understand things better, we can zoom further into the residual matrix. This matrix represents the correlation structure after taking the general factor into account:

Residual matrix. The diagonal shows the fraction of variance in each sexual interest that is independent of the general factor. The off-diagonal shows the correlations of sexual interests above and beyond that due to the general factor.
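For readers who want to see the decomposition mechanically, here is a minimal sketch of how a general-factor part and a residual part could be computed once loadings are known. The three-item correlation matrix and the loadings are invented for illustration and are not taken from the survey; `decompose` is a hypothetical helper name.

```python
def decompose(corr, loadings):
    """Split a correlation matrix into a general-factor part and a residual.

    general[i][j] = loadings[i] * loadings[j]
    residual      = corr - general
    """
    n = len(loadings)
    general = [[loadings[i] * loadings[j] for j in range(n)] for i in range(n)]
    residual = [[corr[i][j] - general[i][j] for j in range(n)] for i in range(n)]
    return general, residual


# Invented example: items 0 and 1 share 0.10 of correlation beyond the general factor.
loadings = [0.6, 0.5, 0.4]
corr = [
    [1.00, 0.40, 0.24],
    [0.40, 1.00, 0.20],
    [0.24, 0.20, 1.00],
]

general, residual = decompose(corr, loadings)
for row in residual:
    print([round(x, 2) for x in row])
```

In this toy case the residual diagonal holds the variance not explained by the general factor (one minus the squared loading), and the only off-diagonal entry that survives is the extra 0.10 between the first two items.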

This kind of matrix should give a clearer idea of the true structure of the paraphilic interests. For instance, we can see that being a furry is correlated with attraction to animals, as well as with AGP; this is predicted by the theory of erotic target location errors, which states that AGP and furryism both represent an “inversion” of attraction to respectively women and anthropomorphic animals onto oneself, such that one is interested in being what one would otherwise be attracted to.

On the other hand, we do not see any particular correlation between autogynephilia and masochism. We do see one between forced feminization and masochism, but this is trivial due to content overlap. (Unfortunately, this dataset didn’t include anything asking about transvestic fetishism independent of masochism.) The lack of correlation between autogynephilia and masochism seems to contradict the theory of masochistic emasculation fetish, which asserts that autogynephilia isn’t really a sexual interest in being a woman per se, but instead a fundamentally masochistic interest. Since anecdotal correlations between autogynephilia and masochism are the main thing cited as evidence for MEF, I think this result disproves MEF, at least unless it turns out to have been a fluke somehow. (Incidentally, the creator of /r/AskAGP collected this dataset specifically with the purpose of trying to prove that AGP was correlated with masochism. Turns out it’s not that simple.)

Another interesting thing is that this method makes it more convenient to collect items that measure the general factor of paraphilia well. The issue with just picking any random set of items is that if the items have too much residual correlation, they might not measure the general factor of paraphilia, but might instead measure something more narrow. For instance, if one includes both “Having your partner say insults and/or slurs to you” and “Being humiliated for having a small penis”, one might end up just measuring humiliation masochism. Thus, ideally one would pick items that have no correlations beyond the general factor of paraphilia at all. Inspecting the diagram, the following look like a promising set:

  • Imagining being a woman and caressing your own (female) body
  • Having your partner say insults and/or slurs to you
  • Exposing your genitals to an unsuspecting stranger
  • Sniffing your partner’s underwear
  • Mate-swapping; having sex with someone else’s partner while they have sex with your partner
  • Tying someone up
  • Having sex with someone much older than yourself

Since these items aren’t pure measures of the general factor, they are not going to perfectly measure it, even when aggregated. It might be nice to have an idea of how well they measure it. One such measure is the internal reliability, which uses the internal correlations between the items to estimate how much of the variance in the sum of the items is due to the general factor (that is, the squared correlation between the general factor and the sum). The internal reliability of this set of items is 0.64, which is considered to be on the low side; ideally one would find a larger set of varied paraphilia items to create a better measure of the general factor of paraphilia.

I’ll end this post with a brief technical description of the math involved. In order to fit the general factor, I searched for a vector λ containing the factor loadings, as well as a matrix Ω containing the residuals. Given the correlation matrix Σ, I then searched for solutions to λ and Ω such that Σ = λλᵀ + Ω.

This is an underspecified problem, as one can find a solution for any λ simply by taking Ω = Σ − λλᵀ. To make the result well-defined, I picked λ so as to minimize the off-diagonal elements of Ω. The intuition behind this is that correlations between unrelated paraphilias are presumably due to the general factor, so we do not want Ω to contain any of these correlations. In order to prevent Ω from containing this, we simply minimize the off-diagonal elements. More specifically, I chose to minimize the sum of the absolute values of Ω; this should aim to set the median value of Ω to zero. As long as most of the paraphilias collected are unrelated to each other, this should accurately get at the general factor of paraphilia.
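One way such a fit could be carried out is sketched below. The post does not say which optimizer was used, so everything here is an assumption: the sketch minimizes the sum of absolute off-diagonal residuals by coordinate descent, using the fact that with all other loadings held fixed, the best value for one loading is a weighted median. The function names, the starting value, and the number of sweeps are illustrative, and coordinate descent on a non-smooth objective like this can stall short of the global optimum on real data.

```python
def weighted_median(values, weights):
    """A value v that minimizes sum(w * |x - v|) over the given points."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    return pairs[-1][0]  # only reached through floating-point rounding


def fit_general_factor(corr, n_sweeps=100, init=0.5):
    """Estimate loadings by minimizing sum over i != j of |corr[i][j] - lam[i]*lam[j]|."""
    n = len(corr)
    lam = [init] * n
    for _ in range(n_sweeps):
        for i in range(n):
            # The terms involving lam[i] are sum_j |lam[j]| * |corr[i][j]/lam[j] - lam[i]|,
            # which is minimized by a weighted median of the ratios.
            ratios, weights = [], []
            for j in range(n):
                if j != i and abs(lam[j]) > 1e-12:
                    ratios.append(corr[i][j] / lam[j])
                    weights.append(abs(lam[j]))
            if ratios:
                lam[i] = weighted_median(ratios, weights)
    return lam


def off_diagonal_l1(corr, lam):
    """Sum of absolute off-diagonal residuals for the given loadings."""
    n = len(lam)
    return sum(abs(corr[i][j] - lam[i] * lam[j])
               for i in range(n) for j in range(n) if i != j)


# Synthetic check: a matrix generated by exactly one factor.
true_lam = [0.6, 0.5, 0.4, 0.7]
corr = [[1.0 if i == j else true_lam[i] * true_lam[j] for j in range(4)]
        for i in range(4)]
est = fit_general_factor(corr)
print([round(x, 3) for x in est], off_diagonal_l1(corr, est))
```

On this synthetic matrix the sketch recovers the loadings it was built from. The median-like update is also what makes the approach tolerant of a few item pairs with large residual correlations, in line with the intuition given above.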

In order to estimate the internal reliability, I used the formula (λ₀+λ₁+…)² / ((λ₀+λ₁+…)² + (1−λ₀²) + (1−λ₁²) + …). Intuitively speaking, the formula consists of two parts, G = (λ₀+λ₁+…)², and S = (1−λ₀²) + (1−λ₁²) + …, such that the total formula is G/(G+S). G and S each represent a fraction of the variance in the “sum score” of the paraphilic interests. Specifically, G represents the variance due to the general factor, while S represents the specific variance for each paraphilia (which, when trying to measure the general factor, we think of as being measurement error). The G variance grows quadratically with the number of items, while the S variance grows linearly with the number of items, so this means that as one increases the number of items, the sum score will to a greater degree represent the general factor compared to the specific variance; that is, more items means less measurement error.
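The G/(G+S) formula is short enough to write out directly. The sketch below is a plain transcription of it; the function name is an illustrative choice, and the example input assumes, for lack of the individual loadings, that all seven items in the restricted set load at their reported mean of 0.45.

```python
def reliability(loadings):
    """Internal reliability of a sum score under a one-factor model: G / (G + S)."""
    g = sum(loadings) ** 2                        # variance due to the general factor
    s = sum(1.0 - lam ** 2 for lam in loadings)   # item-specific variance
    return g / (g + s)


# Assumption: seven items, all at the mean loading of 0.45.
print(round(reliability([0.45] * 7), 2))
```

Under that equal-loading assumption the result rounds to 0.64, matching the reliability quoted for the restricted set; with the actual, unequal loadings the value would differ slightly.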

The mean factor loading for the restricted set was 0.45. If additional sexual interests continue having factor loadings like this, the reliability should become adequate with about 10 items, good with about 16 items, and near-perfect with about 40 items. One project I would like to see completed would be to collect more items to better measure the general factor of paraphilia.
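These projections can be checked with the same G/(G+S) formula, under the simplifying assumption that every item has the same loading. The sketch below evaluates it at the item counts mentioned above; the function name and the default loading of 0.45 are illustrative.

```python
def projected_reliability(n_items, loading=0.45):
    """G / (G + S) when all n_items share the same factor loading."""
    g = (n_items * loading) ** 2
    s = n_items * (1.0 - loading ** 2)
    return g / (g + s)


for n in (7, 10, 16, 40):
    print(n, round(projected_reliability(n), 2))
```

With equal loadings of 0.45 this gives roughly 0.64, 0.72, 0.80 and 0.91 for 7, 10, 16 and 40 items, which lines up with reading "adequate", "good" and "near-perfect" as the common rule-of-thumb cutoffs of about 0.7, 0.8 and 0.9 (an interpretation, since the post does not state its cutoffs).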

Causality is essential: Reply to MTSW on Autogynephilia

About four years ago, Mark Taylor Saotome Westlake published “Reply to Ozymandias on Autogynephilia”, responding to Ozy’s “On Autogynephilia” blog post. In it, he argues that it is solely the correlation structure between surface-level observations that is relevant in science and creating typologies. Needless to say given my recent posts, I disagree.

Sexual arousal to fantasies of being a woman is usually thought of as reflecting a sexual interest in being a woman; whose cause is unrelated to a desire to be a woman, but which has the consequence of generating a desire to be a woman, through means analogous to how other sexual interests work. (See Is autogynephilia real? The phenomenon, the construct, the theory for more details.)

Ozy questions this conceptualization. The specifics of this questioning are unclear, as they do not lay out much detail in it, or much evidence for it. If I had to attempt to parse it, it would seem that Ozy is distinguishing “sexual interests” into “fetishes” and “attractions”, such that e.g. ordinary heterosexuality would be an “attraction”, but “true autogynephilia” would be a “fetish”, and where fetishes only lead to desires for actualization within narrow sexual contexts. In addition to “true autogynephilia” they then claim that arousal to fantasies about being feminized can be a manifestation of gender issues through a variety of mechanisms. I obviously doubt these ideas, but this blog post is not for criticizing them.

Rather, it’s for criticizing MTSW’s response. He argues:

In what way are those conceptually different things? You’re describing a.m.a.b. people engaging in what at least superficially seems like the same behavior, jacking off to the same porn and having the same fantasies. For the ones who might consider transitioning, you say that the erotic behavior “may be a manifestation of gender dysphoria” although it’s “unclear […] how exactly the link […] happens.” For the others, it’s not a manifestation of anything in particular. It’s certainly possible that autogynephilic arousal in pre-trans women and non-dysphoric men are two completely different things that happen to involve common elements (much like how MtF transsexuality itself is two completely different things that happen to involve common elements!). But what’s the specific evidence?

The answer to how they are different things is straightforward enough. Ozy’s model seems to go something like this:

My guess as to Ozy’s model.

As you can see, true autogynephilia and sexualized gender dissatisfaction exist as different nodes in the causal graph, and therefore in Ozy’s model they are different things. Easy peasy.

Now, certainly I agree with MTSW that to support this specific model, one needs some evidence, and Ozy doesn’t seem to provide that. However, without more specific evidence, the alternate hypothesis isn’t that sexual feminization fantasies reflect the single unitary concept that I described in the beginning. Instead, the baseline approach could just as well be to consider them invalid, i.e. to hold that it is unknown what exactly they reflect[1]. In fact, the main reason I don’t consider them invalid is not because there is a lack of evidence of their invalidity, but instead because I believe we have other evidence which indicates that they are a valid indicator of something akin to autogynephilia proper. Specifically, we have an understanding of how sexuality in general works and what function it serves which suggests that they would indicate a sexual interest in being a woman.

But this is absolutely not a generally accepted understanding! It seems to me that it is quite common for people to come up with elaborate stories of sexuality as coping mechanisms, reflections of hidden desires, taboos, curiosity, etc. I have not seen any particularly convincing arguments for why they should be so. But at the same time I can’t recall seeing any particularly convincing arguments for my preferred alternative hypothesis, that sexuality simply reflects sexual desires. Certainly, MTSW’s blog post doesn’t include them. Rather, these are arguments I have had to construct on my own.

Ozy also dismisses any apparent typology as merely correlational, which MTSW takes issue with:

“May or may not be correlated”?! That’s all you have to say?! Summarizing correlations is the entire point of making a taxonomy. Yes, psychology is complicated and people are individuals; no one is going to fit any clinical-profile stereotype exactly. But if we have studies that find correlations (not with correlation coefficients equal to one, but correlations nonetheless) between sexual orientation, age of transition, childhood femininity, and history of erotic cross-dressing – if, sheerly intuitively and anecdotally with no pretense of rigor, it seems plausible that the Laverne Cox/Janet Mock/Sylvia Rivera cluster of people is a distinct thing from the Julia Serano/Deirdre McCloskey/Caitlyn Jenner cluster of people – is it really that bad for someone to speculate, “Hey, maybe these are actually two and only two different psychological conditions with different etiologies”?

Like, maybe it’s not true. Maybe there’s some other, more detailed and expansive model that makes better predictions. But what is it, specifically? What’s your alternative story?

This response seems to engage in some motte/bailey’ing. It starts out arguing “summarizing correlations is the entire point of making a taxonomy”, but ends up at “maybe these are actually two and only two different psychological conditions with different etiologies”.

Certainly there are some contexts where correlations are useful – e.g. making predictions. But correlations have a lot of problems. There’s no guarantee that they are stable across time, or across contexts. There’s no guarantee that they will persist after applying pressure to them. There’s no guarantee that they tell us anything about reality. And indeed this is probably why it took less than a paragraph for MTSW to switch from talking about correlations to talking about etiologies (a causal concept).

There is an absolutely massive number of potential causal models that can fit to any given dataset. It’s probably fair enough to expect people to give an example of some factor, or factors, which could also account for observations. But at the same time, unless you give a strong reason to believe the model, you can’t expect people to buy it.

Personally, I find the actual causality important because I am researching how gender issues work, and this is deeply dependent on understanding the causality. If autogynephilia causes gender issues, then to understand gender issues, I can look for moderators or mediators of the effect of autogynephilia on gender issues, and I can meaningfully control for autogynephilia when looking at other potential causes of gender issues. Plus merely identifying autogynephilia as a cause is real progress in understanding. On the other hand, if autogynephilia is caused by gender issues, then identifying autogynephilia seems like only a curiosity, unless it turns out to be useful for some subtle reason. (Identifying repressors? Dubious.)

Others might have other priorities. One priority MTSW has is the prediction that AGPTSs are not going to be as female-typical as HSTSs. On the surface level, this might seem to rely on correlations more than on causality. However, as mentioned before, the idea that this is going to be a stable phenomenon does rely on causality.

Even if one has some context where causation isn’t relevant, claims of causation represent burdensome details. Certainly if one is interested in information compression, one can argue that trans women behave as if a certain causal story holds – though one should be careful about making sure that they actually do this; it can be nonobvious what a theory actually predicts, and summarizing the actual phenomena may be more accurate than trying to come up with a corresponding theory.

Finally, MTSW argues:

But here’s the thing: you can’t mislead the general public without thereby also misleading the next generation of trans-spectrum people. So when a mildly gender-dysphoric boy spends ten years assuming that his gender problems can’t possibly be in the same taxon as actual trans women, because the autogynephilia tag seems to fit him perfectly and everyone seems to think that the “Blanchard-Bailey theory of autogynephilia” is “clearly untrue”, he might feel a little bit betrayed when it turns out that it’s not clearly untrue and that the transgender community at large has been systematically lying to him, or, worse, is so systematically delusional that they might as well have been lying. In fact, he might be so upset as to be motivated to start an entire pseudonymous blog dedicated to dismantling your shitty epistemology!

Certainly, one reason that the BBL typology might be useful is that some males who are aroused by the thought of being a woman should transition, and this typology gives them an explanation of why.

But so does Ozy’s proposed theory of autogynephilia sometimes being a manifestation of gender dysphoria, and sometimes being “true autogynephilia”! And this is quite a popular theory that the trans community often gives as advice to those who are questioning.

In theory, perhaps, it might be that BBL can help with gender questioning more effectively. Certainly it seems like it should be able to break down the “am I trans or is it just a fetish?” question. But Ozy’s framework also sort of addresses this, by identifying the question with whether one wants to live as a woman in everyday life outside of sex. Is that a good solution? Probably not; the truth is most likely a better solution. But the choice isn’t between “autogynephilia has nothing to do with transsexuality” and “BBL is true”; there’s a whole universe of alternative options out there.

  1. As a footnote, in order for autogynephilic fantasies to be a valid measure of whether one is an autogynephile, one needs to have a definition of what autogynephilia is. While Blanchard aimed to produce such a definition (“propensity to be sexually aroused by the thought of oneself as a woman”, “love of oneself as a woman”), this definition was too vague to actually be useful. As such, autogynephilia is just reduced to a subjective judgement synthesized from fantasies and arousal patterns. This vagueness does in fact make it hard to test objectively whether someone is autogynephilic or not, which runs into the issue Ozy brought up about self-report-based theories approaching ill-definedness once one starts bringing up lying. Ultimately lying is a thing, but Blanchardians have been ignoring the construct validation of autogynephilia for too long. And we will probably continue to ignore it unless I get it done.

Reflections on a failed attempt at testing the causal relationship between autogynephilia and gender issues

I’m really deep off on a tangent, so I’m not sure this post will be understandable to anyone but me. 😅

I am as always trying to understand the causes of affective gender identity, and for this in males there is one thing that will always remain striking: autogynephilia, i.e. a sexual interest in being female, is highly correlated with wanting to be a woman, to the point of being one of the factors that is most strongly associated with gender issues.

And this is, as always, associated with the big question: What’s the causal relationship here? The correlation is pretty irrelevant if autogynephilia is simply a symptom of wanting to be a woman, as the goal is to understand the causes rather than effects of gender issues. But how do we know?

As a Blanchardian, I believe autogynephilia causes wanting to be a woman. An argument Blanchardians often make is that we can observe this from autogynephilia preceding gender dysphoria, but I think this is a terrible argument, for two reasons:

  1. While autogynephilia might usually precede full-blown gender dysphoria, it appears simply wrong that overt autogynephilic arousal always precedes all of the cross-gender ideation it is associated with. Blanchardians often explain this via proto-sexual ideations, analogous to childhood crushes; and I accept that explanation, but this implies that there are dynamics in play before the temporal events that the argument relies on, and therefore the argument is fundamentally flawed.
  2. More generally, you can’t know that such earlier dynamics are absent; observing two correlated events separated in time is not enough to tell you that the first causes the second, as there might instead be confounding.

So if this argument is flawed, why do I believe autogynephilia causes gender issues? Ultimately I think it has to come down to: We know what sexuality is, what it does. The entire point of your sexuality is to motivate you; it’s not particularly plausible that it is caused by or confounded with gender issues, compared to it causing gender issues.

As in, suppose you forget everything we know about sexuality and wanting things. Instead, we just observe a correlation between Thing A associated with women, and Thing B associated with women. Even if Thing A comes before Thing B, is there much reason to suppose that Thing A causes Thing B? No! Of course not! Correlation != causation. But conversely, if we forget everything we know about women, and just observe a correlation between a sexual interest in Thing C and a desire for Thing C, is it then reasonable to think that the sexual interest causes the desire? Absolutely! Because that’s what sexuality does.

… except, how do I know that? For this, I’d bring up several arguments. It makes obvious sense evolutionarily speaking. It’s how we usually think of sexual interests like attraction to men or attraction to women. It’s not like we find it easy to control our sexuality – indeed, autogynephiles who dislike being autogynephilic and who don’t want to be women would seem to contradict the notion that it’s merely a question of sexuality reflecting desires – but this is fully compatible with a sexual model, as they resemble ego-dystonic gay men.

But sexuality being this deeply malleable thing that reflects our deepest desires does seem to reflect how critics of Blanchardianism see things. In fact, it seems rather central to their critique. Sometimes it can even look pretty plausible to me – wouldn’t we expect a gender dysphoric male to imagine being female in ordinary sexual fantasies? And maybe autogynephilia is somehow different from other sexual interests? Or maybe the understanding of other sexual interests is mistaken?

I believe I have come up with an approach for investigating these sorts of questions, and I ended up getting impatient and testing this approach, but it turned out not to work. I don’t think the failure is fundamental to the approach, but instead due to the low quality of measurement I did, so I think it would be worth talking about the method.

Psychological systems

How can I even talk about “what sexuality does”? I’m taking things that happen for some sexual interests, such as gynephilia, and generalizing them to entirely different sexual interests, such as autogynephilia. How can that make any sort of sense?

It makes sense only if we suppose that they both represent variations in some common underlying system – in this case, sexual preferences. That is, I suppose there must be some innate system that I call “sexuality”, which contains certain preferences and motivates people to act on them. If I didn’t believe that – if I believed that sexuality was just a mishmash of different things lumped together under the same label – then it would make no sense to believe that I could generalize from gynephilia (or other sexual interests) to autogynephilia.

So let’s consider this sexual system more specifically. I believe that it contains some latent sexual preference. I further believe that upon encountering instances or fantasies of this sexual preference, one tends to experience genital arousal (though this is moderated by contextual factors like state libido) in proportion to its fit to the preference. I furthermore believe that this preference somehow leads to a motivation, a sort of desire, to achieve one’s preference.

I think this is simultaneously rather basic, caricatural, and obvious. But it gives us a basic foundation: for the purpose of measurement, we might equate the sexual preference with the arousal pattern. We then model the propensity for sexual arousal to something as causing a general desire for this thing, while allowing that the general desire might be affected by other factors too.

In order to capture that there is a single system underlying all of this, it might perhaps be reasonable to propose that there is a single coupling constant k, such that:

desire for x = k * arousal to x + other factors

It turns out that this hypothesis makes some very strong predictions, much stronger than you usually see in psychological research. We can get into this mathematically.

(I guess feel free to skip the next bit if you’re not strong on math… 😅)

Specifically, suppose that arousal to x has a variance of A. Then desire for x ends up with a variance of k²A from the arousal, as there will be differences in wanting x depending on how arousing one finds it. This will also imply a covariance of kA between arousal and desire. However, in addition to this, desire for x may have some additional variance B’ due to other factors not related to arousal. (E.g. homophobia may reduce interest in same-sex partners, extraversion might increase desire for partners, femininity might increase desire to be a woman, and so on.) So the total variance of desire for x, which we will label B, can be written as B = k²A + B’. Using the variances and the covariance, we can then compute the correlation: r = kA/√(AB) = k√(A/B).
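As a sanity check on the algebra, here is a minimal simulation sketch of the linear model. It assumes (these are my illustration choices, not something measured) that arousal and the other factors are independent and normally distributed; the function name and the parameter values are made up.

```python
import math
import random

def simulate_correlation(k, var_arousal, var_other, n=100_000, seed=0):
    """Simulate desire = k * arousal + other; return (observed r, predicted r)."""
    rng = random.Random(seed)
    arousal = [rng.gauss(0, math.sqrt(var_arousal)) for _ in range(n)]
    # "other" is drawn independently of arousal (an assumption of the model).
    desire = [k * a + rng.gauss(0, math.sqrt(var_other)) for a in arousal]

    mean_a = sum(arousal) / n
    mean_d = sum(desire) / n
    cov = sum((a - mean_a) * (d - mean_d) for a, d in zip(arousal, desire)) / n
    var_a = sum((a - mean_a) ** 2 for a in arousal) / n
    var_d = sum((d - mean_d) ** 2 for d in desire) / n
    observed = cov / math.sqrt(var_a * var_d)

    total_b = k ** 2 * var_arousal + var_other          # B = k^2 A + B'
    predicted = k * math.sqrt(var_arousal / total_b)    # r = k sqrt(A / B)
    return observed, predicted

observed, predicted = simulate_correlation(k=0.5, var_arousal=2.0, var_other=1.0)
print(f"observed r = {observed:.3f}, predicted r = {predicted:.3f}")
```

With a large simulated sample the two numbers should agree closely; real survey data would of course violate the normality and independence assumptions to some degree.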

Why are point predictions important?

The expression that I derived for the correlation above predicts the exact strength of the correlation between a sexual interest and its corresponding desire. (Sort of – it makes some unrealistic assumptions that would need to be addressed to get anywhere close to exact results.) This is unusual in psychology; typically, psychological research involves grabbing some folk commonsense guess at how psychology works, making directional (i.e. positive/negative) predictions about some correlations, and then testing those directional predictions.

There are a lot of problems with this. First, directional predictions are generally just going to be right half the time; there are only two directions the association can go, so you can only do weak hypothesis tests with them. Furthermore, because there’s generally content overlap between the things one makes predictions on, the probability that there is a directional effect is going to be even greater. Thus, predicting the strength of the correlation is a much stronger hypothesis test.

But even more importantly: the equation I derived for the predictions treats the two directions of causality differently. The correlation increases with the variance in the cause, and decreases with the variance in the consequence; thus, assuming that one has multiple parallel systems, point predictions allow testing the direction of causality.
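To make the asymmetry concrete, here is a small sketch comparing the two causal directions under the same simple linear assumption. The reverse formula is obtained by swapping the roles of the two variables; the item names, variances and coupling constants are invented for illustration.

```python
import math

def predicted_r_forward(k, var_arousal, var_desire):
    """Arousal causes desire: r = k * sqrt(A / B)."""
    return k * math.sqrt(var_arousal / var_desire)

def predicted_r_reverse(k_rev, var_arousal, var_desire):
    """Desire causes arousal: r = k_rev * sqrt(B / A)."""
    return k_rev * math.sqrt(var_desire / var_arousal)

# Hypothetical (A, B) pairs: variance of arousal, variance of desire.
items = {"interest_1": (0.5, 2.0), "interest_2": (2.0, 0.5)}

predictions = {
    name: (predicted_r_forward(0.4, a, b), predicted_r_reverse(0.4, a, b))
    for name, (a, b) in items.items()
}
for name, (fwd, rev) in predictions.items():
    print(f"{name}: forward r = {fwd:.2f}, reverse r = {rev:.2f}")
```

The two directions rank the items in opposite order, which is why having several interests with different variance ratios gives the test its leverage.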

Now, the equation I derived this for assumes a rather simple linear effect. Maybe (almost certainly…) sexuality is more complicated than this, and so the equation won’t hold in practice. But if one studies sexuality more carefully, maybe one can find some model that does hold more precisely. At least, if one can’t, then that casts doubt on the assumption that sexuality forms a consistent system that one can generalize across. (Almost by definition.)

Autophantasic sexuality

So basically, an approach could work as follows: take some other sexual interest with different variances than autogynephilia, use this sexual interest to fit the parameter k, then apply k to the case of autogynephilia and check whether it gives the right point prediction. In practice, this is a bit optimistic; we’d probably want to run many sexual interests in parallel to better check the robustness and validity of the theory asserting that sexual interests are a real thing, and we probably wouldn’t expect an exact fit just because the assumptions made are a bit optimistic. But this is the general approach.
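The fit-then-transfer step might be sketched as follows. Under the linear model the covariance is kA, so k can be estimated from a reference interest as covariance divided by arousal variance. All numbers below are placeholders rather than survey results, and the function names are hypothetical.

```python
import math

def fit_k(cov_arousal_desire, var_arousal):
    """Estimate the coupling constant from one interest: cov = k * A, so k = cov / A."""
    return cov_arousal_desire / var_arousal

def predict_r(k, var_arousal, var_desire):
    """Point prediction for another interest: r = k * sqrt(A / B)."""
    return k * math.sqrt(var_arousal / var_desire)

# Reference interest (placeholder summary statistics).
k_hat = fit_k(cov_arousal_desire=0.6, var_arousal=1.5)

# Target interest (placeholder summary statistics).
r_predicted = predict_r(k_hat, var_arousal=0.8, var_desire=1.6)
r_observed = 0.35  # placeholder, stands in for the measured correlation

print(f"k = {k_hat:.2f}, predicted r = {r_predicted:.3f}, gap = {r_observed - r_predicted:.3f}")
```

How large a gap counts as a failed prediction is a separate question that would need sampling error and measurement error to be taken into account.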

In practice, it gets a bit more complicated than that. What corresponds to what? For instance, when asking about sexual attraction to women, should one consider arousal to the nude female body, arousal to having sex with a woman, or something else? And when one asks for desire for women, should that be desire for a girlfriend (and if so, what defines “girlfriend”? just everything the local culture includes?), desire for sex with a woman, or something else? How does this generalize to submissive bondage, should desire for bondage refer to desire to be tied up, or desire to have a master who will tie oneself up, or what?

These are the sorts of questions that research into the construct of sexual interests should start tackling, in order to increase understanding. But for now, my solution is simple: restrict the investigations to the narrower case of “autophantasic sexuality”; i.e. sexual interests in being something specific. This “being” something can still be broad; it covers e.g. furries, ageplay, and much more. So:

Definition: If X is any trait that one can have or any category one can belong to, then we consider the propensity to get sexually aroused by the thought of being X to be an autophantasic sexuality, which we label autoXphilia. There is some coefficient k independent of X such that autoXphilia causes a desire to be X.

Autophantasic sexuality resembles the concept of an “erotic target location error” (a sexual interest that has been inverted onto oneself), but unlike the case of ETLE, it does not require attraction to the original target.

It’s sort of premature to test it, as one would need a good measure of wanting to be X as well as a good measure of sexual arousal to being X. But my simulations suggested that this wouldn’t be overly sensitive to violations of assumptions/bad data quality/etc., so I went ahead anyway. To test this model, I collected a broad range of Xs:

  • Woman
  • Child
  • Furry (described as “an anthropomorphic animal (i.e. keeping a roughly human form, but having fur and animal features)” to participants)
  • Amputee
  • Extremely fat
  • Asian
  • Black
  • Androgynous
  • Exaggeratedly muscular
  • More attractive
  • Extremely rich
  • Nudist (“living nude in your everyday life, in a society of other nudists”)
  • War hero
  • Famous left-wing activist
  • Manic (“extremely excited, hyperactive, prone to grandiosity, distractable, impulsive…”)
  • Android (“fully-functional human-shaped robot”)
  • Extremely skinny

There are some comments that I should make on this list. My assumption would be that some of these sexual interests would not exist; for instance I have never heard of auto[left-wing activist]philia. However, it is intentional that I have included these. If my assumption is right that the sexual interests do not exist, then the variance in the sexual interest would be ~0, which would mean the correlation between the sexual interest and the desire to be a member of the category would be ~0. (In practice I fit using the covariance rather than the correlation to avoid worries about dividing by zero.) On the other hand, if my assumptions are wrong, then including counterexamples like these helps prove that the assumptions are wrong.
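One way a covariance-based fit could look is a least-squares fit of cov = k·A pooled across all items. This is an assumed fitting procedure for illustration, not necessarily the exact one used here; the numbers are placeholders. An item with zero arousal variance contributes nothing to the estimate and causes no division by zero.

```python
def fit_k_from_covariances(var_arousal, cov_arousal_desire):
    """Least-squares k minimising sum_i (cov_i - k * A_i)^2 over all items."""
    numerator = sum(a * c for a, c in zip(var_arousal, cov_arousal_desire))
    denominator = sum(a * a for a in var_arousal)
    return numerator / denominator

# Placeholder per-item summary statistics; the last item has ~no variance
# in the sexual interest, as one might expect for an interest nobody has.
variances = [1.2, 0.8, 0.0]
covariances = [0.50, 0.30, 0.0]

k_hat = fit_k_from_covariances(variances, covariances)
residuals = [c - k_hat * a for a, c in zip(variances, covariances)]
print(f"k = {k_hat:.3f}, residuals = {[round(r, 3) for r in residuals]}")
```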

Another important thing is to consider interests with different properties. For instance, my assumption going into this was that a primary factor in wanting to be an amputee would be sexual; after all, why else would one want it? So my assumption was that for being an amputee, there would be high variance in the sexual interest, and lowish variance in the general desire. Meanwhile, for something like being a left-wing activist, I assumed the opposite pattern; how appealing this would be would be heavily dependent on how left-wing one would be, for instance.

These are a lot of assumptions, which brings a third important thing. I don’t know which of the assumptions are true or not before I collect the data (and sometimes not even after it); that’s why they’re assumptions rather than proven facts. Thus, it is important to study a large number of potential interests at the same time, in order to achieve robustness against violations of assumptions.

Finally, what was my plan for measurement? Simple: for each of the possible things one could be, ask people how appealing they would find it on a 5-point scale from very unappealing to very appealing, as well as how sexually arousing they would find it to imagine being this on a 5-point scale from not at all to very. This is a really terrible measurement; it is discrete and contains a ceiling as well as a floor. Thus, another important check of robustness is to include targets that bump against the limits of the measurement; for instance I would assume ~everyone wants to be rich, so it seems conceivable that being rich would bump against the ceiling of the measurement, which violates the assumptions made in the calculation. In order to find out how serious that sort of violation is, I have included extremes like this.

Initial results

I did the first round of this survey on Prolific a few weeks ago. The results were not very convincing. To explain, let me introduce a strange kind of diagram:

Unusual diagram I came up with for this sort of hypothesis testing/causal inference.

The equation r = k√(A/B) predicts a relationship between the variance ratio of autophilia/desire to be, and the correlation between the two variables. As such, to get an overview of the results, one can do a scatterplot, showing how each of the autophantasic sexualities fit in. If the theory is right, they should all be on a line. I wanted to plot this, but A/B is unbounded, so to make it better behaved, I instead plotted A/(A+B), which is essentially similar, except that it ranges from 0 to 1 and thus more neatly fits into a plot. The consequence of this is that the interests would be expected to lie on nonlinear curves, rather than straight lines.

I’ve fitted two constants, one for autophilia causing desire, and the other for desire causing autophilia, and plotted the curves associated with these constants; the orange curve represents autophilia causing desire, while the green curve represents desire causing autophilia. As can easily be seen, they both make very distinct predictions, but they are also both very very wrong.

More specifically, it appears that the interests lie on a sort of upside-down U shape; it starts at the bottom-left with attractive/rich, continues up to the top-middle at fat/muscular, and then goes down to the bottom-right with left-wing activist/child. Meanwhile, the autophantasic sexuality model would predict that you just see a decrease, which fits badly with attractive/rich, while the reverse causality would predict that you just see an increase, which fits badly with left-wing activist/child.

I see this plot as being a good illustration of my point that I needed a broad set of interests to test my assumptions. Some assumptions were wrong; e.g. the positions of amputee and left-wing activist weren’t much different, contrary to what I predicted. And for seeing the shape of the curve, it helped that I had a broad set of items; if I had e.g. excluded being rich from the test, it would be unclear if “attractive” was just an outlier.

One thing I have previously found by experimenting on Prolific is that the results are not as consistent as I would like. Just because people say one thing one time does not mean they will say the same thing another time. So to see what fraction is due to the persistent aspect of people’s responses, I did another survey with the same participants one week later to extract only the component of their responses that was consistent across surveys. That yielded the following plot:
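One simple way to isolate a consistent component from two waves is to use cross-wave covariances: if each response is a stable part plus noise that is independent across waves and across questions, the noise drops out of any covariance taken between different waves. The sketch below illustrates that idea on simulated data; it is an assumed approach for illustration, and the names and parameters are made up.

```python
import random

def cov(xs, ys):
    """Population covariance of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def stable_moments(a1, a2, d1, d2):
    """Variances and covariance of the stable parts of arousal (a) and desire (d)."""
    var_a = cov(a1, a2)                           # same item, different waves
    var_d = cov(d1, d2)
    cov_ad = 0.5 * (cov(a1, d2) + cov(a2, d1))    # different items, different waves
    return var_a, var_d, cov_ad

# Simulated respondents: stable parts plus independent per-wave noise.
rng = random.Random(1)
n = 50_000
stable_a = [rng.gauss(0, 1) for _ in range(n)]
stable_d = [0.5 * a + rng.gauss(0, 1) for a in stable_a]
a1 = [a + rng.gauss(0, 1) for a in stable_a]
a2 = [a + rng.gauss(0, 1) for a in stable_a]
d1 = [d + rng.gauss(0, 1) for d in stable_d]
d2 = [d + rng.gauss(0, 1) for d in stable_d]

var_a, var_d, cov_ad = stable_moments(a1, a2, d1, d2)
print(f"stable var(arousal) = {var_a:.2f}, var(desire) = {var_d:.2f}, cov = {cov_ad:.2f}")
```

In this simulation the true stable values are 1.0, 1.25 and 0.5. If the noise is correlated within a wave (e.g. mood on the day), only the cross-wave terms stay clean, which is why the sketch avoids same-wave covariances.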

Correlations and variances after accounting for the inconsistent fraction of participants’ responses.

The main thing this changed was the effect sizes involved. We still have the general upside-down U shape, and the rough placements of different items are still the same. Clearly the data doesn’t support the simplified approach I’ve taken.

Ordinal data, interval data, ceiling effects

I believe the biggest problem is in measurement. Everyone wants to be richer and more attractive. As a result, the two options, “appealing” and “very appealing”, are not very good for distinguishing people; whether one thinks being rich would be super great but not worth the effort, or whether one has it as one’s biggest goal in life to be rich, is not something that this measurement distinguishes; they would both go under “very appealing”.

One can talk about these problems more formally. The trouble is that I am working with “ordinal data”; I have an ordered set of categories that people can respond that they belong in, but distances between the categories, or widths of the categories, are not really well-defined. Thus, it is mathematically incoherent for me to talk about the “variance” in people’s responses.

In fact, how do I compute this variance? The standard way: I assign each response option an integer from -2 to 2, and then compute it using these numbers. That is not mathematically valid. (People do it all the time in psychological research, but usually their methods are not as sensitive to the variances of the variables as my method is.)

What I need is “interval data”; data where concepts like variance and relative differences are mathematically meaningful. Otherwise, I will run into “ceiling effects”, where highly varied responses might be “squished” together into a small variance in the quantitative data. You can think of temperature as being an example of interval data; differences in temperature are quantitatively meaningful, such that it makes sense to talk about a difference of 10 degrees Celsius. (Meanwhile you can’t talk about differences on a scale from “very unappealing” to “very appealing”; it’s not clear that the distinction between “unappealing” and “neutral” is the same as the distinction between “appealing” and “very appealing”.)
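The squishing can be illustrated with a toy simulation: take a latent preference with the same spread in two populations, cut it into five response options coded -2 to 2, and compare the resulting variances. The cut points and the latent means are arbitrary assumptions chosen for illustration.

```python
import random
import statistics

def to_likert(latent):
    """Map a continuous latent value to integer codes -2..2 (cut points are assumed)."""
    cuts = [-1.5, -0.5, 0.5, 1.5]
    return sum(latent > c for c in cuts) - 2

def likert_variance(latent_mean, n=50_000, seed=0):
    """Variance of the coded responses when the latent trait is N(latent_mean, 1)."""
    rng = random.Random(seed)
    codes = [to_likert(rng.gauss(latent_mean, 1.0)) for _ in range(n)]
    return statistics.pvariance(codes)

var_centered = likert_variance(0.0)   # responses spread over the whole scale
var_ceiling = likert_variance(2.5)    # almost everyone answers at the top
print(f"centered: {var_centered:.2f}, near ceiling: {var_ceiling:.2f}")
```

Both populations have identical latent variance, yet the one near the ceiling shows a much smaller variance in the coded responses, which is exactly the quantity the method depends on.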

So can one do that? Measure preferences in interval data? What does that even mean, when preferences are mainly defined by an ordering (inherently ordinal!) of what one wants?

One possible answer to this comes with rational decision theory; it turns out that someone who deals rationally with uncertainty can be proven to have preferences defined in intervals rather than just ordinally, because they must be able to take weighted averages when dealing with uncertainty.

Thus, if one makes the assumption that humans are rational, one can extract their preferences by hearing what tradeoffs they would choose if there were possible random chance involved. Roughly speaking, if one has some worst scenario D (e.g. death), and some best scenario B (e.g. being rich, popular, attractive, etc.), then people’s preference for any scenario Q can be defined as the probability p such that they are indifferent between having Q, and having p probability of B and (1-p) probability of D. This does run into the problem that humans are not fully rational, and one would probably need to find some way of dealing with this.
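The indifference probability could in principle be located by repeatedly offering gambles and narrowing the range. The sketch below uses bisection against a simulated, perfectly rational respondent; both the `prefers_gamble` callback and the hidden utility value are hypothetical stand-ins for an actual questionnaire.

```python
def elicit_indifference_probability(prefers_gamble, tol=1e-3):
    """Bisect on p until the respondent is indifferent between Q and the gamble.

    prefers_gamble(p) should return True if the respondent prefers
    "p chance of B, (1 - p) chance of D" over having Q for certain.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        p = (lo + hi) / 2
        if prefers_gamble(p):
            hi = p   # gamble already preferred, so indifference lies lower
        else:
            lo = p   # sure thing preferred, so indifference lies higher
    return (lo + hi) / 2

# Simulated rational respondent with u(D) = 0, u(B) = 1 and u(Q) = 0.7.
hidden_utility_of_q = 0.7

def simulated_respondent(p):
    # Expected utility of the gamble is p * 1 + (1 - p) * 0 = p.
    return p > hidden_utility_of_q

estimate = elicit_indifference_probability(simulated_respondent)
print(f"estimated utility of Q: {estimate:.3f}")
```

Real respondents would answer noisily and inconsistently, so an actual procedure would need repeated questions or a statistical model rather than plain bisection.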

A bigger problem is that while this allows intrapersonal preference comparisons, it does not allow comparing one person’s preferences with another’s, which is necessary for computing correlations and variances across people. Specifically, if people vary in how much they value the best scenario relative to how much they value their sexuality, then comparing relative to their best scenario would be a problem. It seems like it should be possible to handle this, though; rather than standardizing people’s preference measurements in terms of the best and worst imaginable scenarios, one could standardize them in terms of points that are relevant for this research, like the importance of having a sexual relationship.
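Since utilities of this kind are only defined up to a positive affine transformation, re-standardizing amounts to rescaling each person's values so that two chosen anchor scenarios land on 0 and 1. A minimal sketch, with hypothetical anchor names and made-up utilities:

```python
import math

def restandardize(utilities, low_anchor, high_anchor):
    """Affinely rescale one person's utilities so the anchors map to 0 and 1."""
    low = utilities[low_anchor]
    high = utilities[high_anchor]
    return {name: (u - low) / (high - low) for name, u in utilities.items()}

# Utilities measured on the worst-to-best (0 to 1) scale for two
# hypothetical people; anchors and values are purely illustrative.
person_1 = {"no_relationship": 0.4, "relationship": 0.6, "being_x": 0.5}
person_2 = {"no_relationship": 0.2, "relationship": 0.8, "being_x": 0.65}

rescaled_1 = restandardize(person_1, "no_relationship", "relationship")
rescaled_2 = restandardize(person_2, "no_relationship", "relationship")
print(rescaled_1["being_x"], rescaled_2["being_x"])
```

Whether any pair of anchors really means the same thing to different people is itself an assumption, so this moves the interpersonal-comparison problem rather than removing it.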

The main problem I see here is that measuring preferences properly seems pretty expensive. The badly measured study I did on Prolific was already quite a bit more expensive than any research I’ve done before, and proper measurement would presumably make it much much more expensive.

But I think it would be worth it! If these sorts of methods can genuinely settle controversies like autogynephilia (as well as, IMO, many other psychological questions), then the expense may be worth it, at least compared to other methods that mainly symbolically test things.

Is there a need for a psychophysics of autogynephilia?

The main measurement problem I worry about is in measuring preferences. But a secondary one I worry about is in measuring and defining autophantasic sexuality. More specifically, I got quite high endorsement rates to some sexual interests I did not expect to get high endorsement rates for, such as being rich.

Maybe this is just common. Or maybe it’s uncommon, but common on Prolific (after all, autogynephilia is also fairly common on Prolific). But due to some other questions I have been interested in (such as defining autogynephilia in such a way that it is validly measurable in women), there’s a pathway of investigation that seems worthwhile, namely: What does it mean to be aroused “by” something?

I find it hard to communicate the issue. It’s not a question about causality per se; I’m fine with taking causality as a concept for granted. I think we can meaningfully consider different sexual scenarios and define how arousing each of them is. The issue comes to when we want to abstract the arousal to be defined by features of these scenarios, rather than simply considering specific scenarios.

I think there are a number of paradoxes that can be raised. For instance, if someone is attracted to people regardless of their sex, we say that they are bisexual and aroused by both men and women. However, if someone is attracted to people regardless of their hair color, we don’t say that they are panhairsexual and aroused by all hair colors. Is that a purely linguistic paradox? I don’t know, it doesn’t feel like it, but I’ve thought of various solutions and they didn’t seem convincing.

One can also pick a paradox more specific to AGP; if a man is aroused by the thought of engaging in sexual activities as a woman, then we say that he is AGP, even though we don’t say that he is AAP if he is aroused by the thought of engaging in sexual activities as a man. One might argue that this categorization is overly broad, and that being a woman must be arousing independently of the other things for it to be true AGP; however, this does not seem viable to me, as in surveys generally men who report being aroused by the combination of being a woman with other things also report being aroused by the thought of being a woman on its own; any viable theory of AGP needs to account for this connection.

I don’t think I have a full account for how to interpret this. However, I don’t think we need to assert one. Rather, the appropriate method is to study it empirically. Ideally we should be able to examine people’s arousal to a variety of concrete scenarios, and abstract them into a smaller number of parameters, such that those parameters well match their actual interest in the scenarios. This should ideally be done very precisely, with point predictions, as this allows more specific tests of differing theories. This sort of research is called psychophysics, and my understanding is that in psychophysics, the issues related to ordinal scales that I mentioned before aren’t as problematic.
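To make "abstracting scenarios into a smaller number of parameters" a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the ratings are made up, the two binary features (being a woman in the scenario, and some other feature) are hypothetical, and the additive model is just one simple candidate, not a claim about how arousal actually combines.

```python
from statistics import mean

# Made-up arousal ratings (0-10) for four scenarios, keyed by two binary
# features: (is_a_woman_in_scenario, other_feature_present).
ratings = {
    (0, 0): 1.0,
    (1, 0): 5.0,
    (0, 1): 3.0,
    (1, 1): 7.2,
}

def fit_additive(ratings):
    """Fit rating = grand_mean + sum(effect_i * (feature_i - 0.5)).

    For a balanced full factorial design like the one above, the
    least-squares effect of a feature is simply the mean rating with the
    feature present minus the mean rating with it absent.
    """
    grand_mean = mean(ratings.values())
    n_features = len(next(iter(ratings)))
    effects = []
    for i in range(n_features):
        present = mean(r for cell, r in ratings.items() if cell[i] == 1)
        absent = mean(r for cell, r in ratings.items() if cell[i] == 0)
        effects.append(present - absent)
    return grand_mean, effects

def predict(grand_mean, effects, cell):
    """Point prediction for one scenario from the fitted parameters."""
    return grand_mean + sum(e * (x - 0.5) for e, x in zip(effects, cell))

grand_mean, effects = fit_additive(ratings)
residuals = {
    cell: rating - predict(grand_mean, effects, cell)
    for cell, rating in ratings.items()
}
print("effects:", effects)
print("residuals:", residuals)
```

In this toy setup, a large effect for the first feature would correspond to being a woman being arousing independently of the other feature, while large residuals would suggest that the features interact and that the additive abstraction is the wrong shape.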

If the exact relationships for AGP were understood, one could also use this to better delineate and measure other forms of autophantasic sexuality. One could meaningfully restrict oneself to only the forms that match AGP in shape, and one could perhaps also get a better understanding of what it is that one is measuring.


Overall, I think autogynephilia research could benefit from a greater focus on foundations. In fact, this is probably not just limited to autogynephilia research; all psychological research needs a better understanding of the low-level foundations. The big problem is that it is expensive and slow, and the measurement is difficult. However, it seems like the only way to make solid progress.

There are alternative ways to make progress. Behavioral genetics manages to do a lot with twin studies, sometimes one can use instrumental variables, and so on, but these methods are generally extremely specialized and fiddly, and usually disregarded as a result. Meanwhile, zooming in to the narrower, more specific case allows solid progress to be made, as long as one actually starts from a valid concept.

That’s not to say that there isn’t space for more usual kinds of research. However, they should really be seen as exploratory research that does basic sanity-checks of hypotheses, rather than the confirmatory research that a lot of studies present themselves as being.

Response to Fleming on ETLE

Rod Fleming recently did a video criticizing me as well as others, so I thought I would respond to it. Like with my critique of Contrapoints, I jotted down some notes while watching it, which I’ve written up as a cleaner critique below. Let’s start with a summary of the biggest points:

There’s an important point where we agree: Autogynephilia is not defined by behavior or other surface-level phenomena. I argued for that recently in my blog post Is autogynephilia real? The phenomenon, the construct, the theory. (Perhaps ninja’ing Fleming? He might’ve been working on his video when I posted my blog post.) I recommend reading the blog post for more info, but roughly speaking, I defined autogynephilia as a sexual interest which accounts for stereotypically-autogynephilic phenomena and which is due to an inversion of gynephilia.

Another point where we agree: This sort of strict definition represents a challenge for research like Moser’s which examines AGP in cis women, or for my research which examines AGP in gay men. What I think Fleming underestimates is the degree to which it represents a challenge to Blanchard’s research, while overestimating the degree to which “inversion of gynephilia” has been documented to be a cause of autogynephilia.

It should also be noted that I think Fleming misunderstands why I study AGP in cis women. I’m not trying to prove that autogynephilia “is normal”, whatever that means. Rather, I see some potential things one could learn from it; e.g. it might teach us something about how to measure autogynephilia in post-transition trans women, about how to define autogynephilia, and it might help me write a response to Serano. Furthermore, anyone who is paying attention to the evidence on this topic can easily see that Blanchardians are in deep denial about this phenomenon, so if one wants to learn something that goes beyond Blanchardians’ thoughts, this would be a great place to dig in.

How well-proven is ETLE?

Rod Fleming relies a lot on the concept of an erotic target location error, stating that this is a defining feature of AGP. And I agree… sort of.

Let’s go back to the initial invention of the concept of autogynephilia. People were trying to understand the nature of transsexuality, and they had accumulated a variety of phenomena. Some patients seemed, informally speaking, as if they were “women in men’s bodies”. Some patients seemed like homosexuals who had developed gender issues. Some had a long history of transvestic fetishism, and these were often also attracted to women. In addition to transvestic fetishism, there were a variety of other genderish sexual phenomena that had been observed. How is one to make sense of this?

Over time, a whole bunch of theories had been developed. Some saw the transvestic fetishism as being obviously a consequence of homosexuality, others saw it as a way of dealing with insufficient masculinity, and still others thought it to be an inversion of attraction to women. This concept of an inversion seems to pre-date Blanchard; for instance, Havelock Ellis coined the terms Eonism or sexo-aesthetic inversion with a similar-sounding reasoning.

Blanchard is gay, and I bet that probably made it easier for him to realize that autogynephilia wasn’t a form of homosexuality. Blanchard (and Freund?) instead came up with the idea of an erotic target location error, that an inverted attraction to women explains sexual phenomena like transvestic fetishism and motivates gender dysphoria and transition. This was part of his typology which organized the mess of transsexual phenomena into two types, autogynephilic and homosexual.

But how well-proven is this ETLE idea? Fleming makes it sound very well-proven, talking about long and complicated measures used to diagnose them, but I don’t think that’s accurate. It might be worth looking at the studies which examine it:

  1. Blanchard (1991), Clinical observations and systematic studies of autogynephilia
  2. Blanchard (1993), The she-male phenomenon and the concept of partial autogynephilia
  3. Freund and Blanchard (1993), Erotic Target Location Errors in Male Gender Dysphorics, Paedophiles, and Fetishists
  4. First (2005), Desire for amputation of a limb: paraphilia, psychosis, or a new type of identity disorder
  5. Lawrence (2006), Clinical and Theoretical Parallels Between Desire for Limb Amputation and Gender Identity Disorder
  6. Lawrence (2009), Erotic Target Location Errors: An Underappreciated Paraphilic Dimension
  7. Kolla and Zucker (2009), Desire for Non-Mutilative Disability in a Nonhomosexual, Male-to-Female Transsexual
  8. Lawrence (2009), Anatomic Autoandrophilia in an Adult Male
  9. Hsu and Bailey (2016), Autopedophilia: Erotic-Target Identity Inversion in Men Sexually Attracted to Children
  10. Hsu, Rosenthal, Miller and Bailey (2017), Sexual Arousal Patterns of Autogynephilic Male Crossdressers
  11. Hsu and Bailey (2019), The “Furry” Phenomenon: Characterizing Sexual Orientation, Sexual Motivation, and Erotic Target Identity Inversions in Male Furries
  12. Fuss, Jais and Grey (2019), Self-Reported Childhood Maltreatment and Erotic Target Identity Inversions Among Men with Paraphilic Infantilism
  13. Brown and Barker (2019), Erotic Target Identity Inversions Among Men and Women in an Internet Sample

In addition to these, there are the studies that find relationships between autogynephilia and sexual orientation among transsexuals, such as Blanchard (1989) and Blanchard (1985).

Now, given these studies, what would we need to find for Rod Fleming to be right?

First of all, we would need to find evidence that erotic target location errors are a thing that occur, i.e. that an allosexual interest can be inverted such that it causes a sexual interest in being the target. Proving the causality in this is somewhat difficult, but this at least gives some correlation patterns we would expect to see.

Next, Fleming claims that this is the only possible cause of autogynephilia. This is not necessarily an empirical question, as Fleming claims this by definition; but if so, given endorsements of typologies of transsexuality, we would expect such typologies to rule out the possibility that alternative “pseudoautogynephilias”, which might be caused by factors other than ETLE, can cause gender issues; or we would at least expect to see some sort of differential diagnosis, considering Fleming’s claims of long and complicated tests.

So, are these supported by the studies? Sort of/not really.

It might be tempting to count the transsexual studies which find that non-HSTSs have more autogynephilia (or at least, autogynephilia-like phenomena) than HSTSs as evidence for a correlation between the allosexual and autosexual target. However, that would be obviously wrong, due to Berkson’s paradox; while these studies to an extent test for a negative correlation between sexual orientation and autogynephilia, they also implicitly test for autogynephilia and homosexuality both causing gender issues, and they test for autogynephilia and homosexuality being associated with gender issues under distinct conditions. To understand in greater detail why they test for all three, I recommend reading Age of onset as the origin of discrete types of gender dysphoria; but roughly speaking, Blanchardianism relies on all three effects being in play, and so tests of only transsexuals will not be able to identify that any given effect is in play.
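The Berkson’s paradox point can be illustrated with a small simulation. This is only a sketch under made-up assumptions: homosexuality and autogynephilia are drawn as independent standard-normal traits, both add to a hypothetical "gender issues" score, and only people above an arbitrary threshold end up in the transsexual sample. None of the numbers come from real data.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n=50_000, threshold=2.0, seed=0):
    rng = random.Random(seed)
    population, clinical = [], []
    for _ in range(n):
        # Assumption: the two traits are independent in the population.
        homosexuality = rng.gauss(0, 1)
        agp = rng.gauss(0, 1)
        # Assumption: both traits add to gender issues, plus unrelated noise.
        gender_issues = homosexuality + agp + rng.gauss(0, 1)
        population.append((homosexuality, agp))
        if gender_issues > threshold:  # only these show up in the sample
            clinical.append((homosexuality, agp))
    return pearson(*zip(*population)), pearson(*zip(*clinical))

population_r, clinical_r = simulate()
print(f"general population: r = {population_r:+.2f}")
print(f"transsexual sample: r = {clinical_r:+.2f}")
```

In this toy setup the two traits are essentially uncorrelated in the population but clearly negatively correlated among those selected, even though neither trait affects the other. A negative correlation in a transsexual-only sample therefore cannot, by itself, tell us about the auto/allo relationship in general.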

But I do buy there being a correlation between the two. I don’t think the auto/allo correlation has been well-documented for autogynephilia in the literature, so I’m not sure how Fleming makes that implication (possibly he does so by misinterpreting the transsexual studies?), but I find autogynephilia and gynephilia to be correlated, and the various studies of other conditions (like apotemnophilia, autopedophilia, and furries) that I listed tend to find correlations too, so it appears to be a general rule.

The main problem I have with this is that the rule appears to be too general; whenever you have two sexual interests with content overlap, it appears you find high correlations between them. For instance, you find correlations between masochism and sadism, or between exhibitionism and voyeurism. Are these predicted by ETLE? If not, could whatever causes this provide an alternative explanation to ETLE for correlations between internal and external erotic targets? Hard to say, because ETLE is understudied.

Either way, do studies do differential diagnosis to identify true ETLEs? … Not really. I mean, certainly Freund and Blanchard (1993) do claim that there is a distinction, where apparent autopedophilia could instead be motivated by masochism. But they don’t empirically distinguish their consequences, e.g. they don’t show that masochistic pseudoautopedophiles are unable to end up with age identity disorder as a result. Similarly, Brown and Barker (2019) make this distinction, but they don’t use the distinction for anything.

Fleming’s claims of “long and complicated” scales for diagnosing it also aren’t very accurate. Blanchard has created multiple scales for assessing autogynephilia, such as the Core Autogynephilia Scale, or the Cross-Gender Fetishism Scale, but they simply consist of lists of behaviors that one could endorse; they hardly support his point that diagnosis requires more care than just considering autogynephilia as a behavior. Furthermore, the only nonlinearity in the scales involves shortening the core AGP scale if the participant doesn’t report any fantasies of having been a woman. So ultimately I don’t think Fleming is accurately describing the diagnosis here.

And you know, maybe that is a problem! This is arguably pretty much what I complained about in my phenomenon/construct/theory post. Maybe we should use more careful methods for diagnosing autogynephilia, but in that case the entire Blanchardian theory falls together with my studies on autogynephilic gay men, which is a point I don’t think Fleming has appropriately considered.

Autogynephilia in gay men

I claim some gay men are autogynephilic. Philosophically, this puts me in a pickle, because this is in contradiction with defining autogynephilia as being the condition that comes from an inversion of gynephilia.

One way to interpret this is as a purely linguistic, definitional disagreement; I think the concept of autogynephilia should be broadened to contain non-ETLE phenomena that look similar, and I think we should understand that sometimes gay men exhibit this.

Alternatively, I’d instead suggest another interpretation: The original definition of autogynephilia makes certain assumptions, and these assumptions have turned out to be incorrect. Thus we need a new model to replace it. Specifically, in my phenomenon/construct/theory post, I characterized the conventional definition of autogynephilia as stating that autogynephilia:

  1. is the single common cause underlying stereotypically autogynephilic phenomena,
  2. represents a sexual interest, analogous to others like heterosexuality or fetishism,
  3. can be caused by an inversion of gynephilia,
  4. cannot be caused by anything without gynephilia.

What I noticed was that there were some gay men who engaged in stereotypically autogynephilic phenomena. This is quite simply not compatible with the definition laid out by 1-4. There are a number of ways one could fix this:

a. Reject (4); claim that autogynephilia can be caused by something else too.
b. Modify (1) to state that certain other sexual interests can cause stereotypically autogynephilic phenomena without being truly autogynephilic.
c. Modify (1) even more, such that things that are not sexual interests can cause stereotypically autogynephilic phenomena.

I’m not sure what Fleming says to this; he didn’t really go into detail about the apparent autogynephilic homosexuals. Certainly I’ve seen a lot of people go for option (c), arguing that this originates from gender issues, internalized homophobia, or all sorts of other things. I’m not really going to address (c) much, except to say that those who go for it should also do some philosophical legwork to address how they know that these alternative causes can’t also be the causes of autogynephilia in most gynephilic transsexuals. For instance, if one proposes that gender issues can cause autogynephilia-like sexual interests in gay men, why shouldn’t they be able to cause autogynephilia-like sexual interests in straight men too? If they can, how do you save Blanchardianism from the resulting issues of causal direction?

Anyway, assuming we’re going to reject (c) for the same reason we reject similar theories in gynephilic individuals, that leaves (a) and (b). I advocate both (a) and (b); specifically, I advocate introducing a distinction between broadsense autogynephilia, which we get by (a), and narrowsense autogynephilia, which we get by (b). Broadsense autogynephilia represents any sort of sexual interest in being a woman; narrowsense autogynephilia represents an interest in being a woman due to an inversion of attraction to women.

[Figure: Relationship between narrowsense AGP and broadsense AGP. For more, see Multi-type AGP hypothesis (r/Blanchardianism).]

Furthermore, I propose that most of what we traditionally associate with autogynephilia – such as its effects on gender feelings – is associated with broadsense AGP. Well, except when it’s associated with AGPTS instead or something. But basically, this is my justification for considering broadsense AGP to be the more important variable, because that is the one we would generally be paying attention to in downstream theories (though upstream theories would presumably pay more attention to the narrowsense AGP vs other types distinction).

Autogynephilia in cis women

Part of Fleming’s motivation for his video seems to be to critique studying AGP in cis women, so I think I should address this too.

Fleming seems to reject AGP in cis women primarily for two reasons. First, the whole ETLE discussion we’ve just had, and secondly, by appealing to it being “normal” for women and abnormal for men.

I see “normal” as a bit of a problematic word, as it lumps together meanings like “common”, “morally OK”, “harmless” and “evolutionarily selected” into a single word, despite them being distinct. Consider:

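Trait | Common | Morally OK | Harmless | Evolutionarily selected
Strong AGP in men | no | yes | depends [1] | no/against
Wanting to rape | sort of? [4] | no | no | likely for [5]
Female independence | yes (for now [6]) | actively good
Severe disabilities | no | yes | no | against
Very high intelligence | no | yes | yes, actively good [7]
See footnotes for more details.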

There are some relationships between these different aspects; if something is harmful and morally blameworthy then we might consider it not OK. And if something is not OK, we might try to reduce its prevalence. Things that are selected against will obviously reduce their own prevalence, and furthermore our preferences were created by selection, so all else equal we would expect to consider things that are selected against to be harmful.

But… none of these are really relevant for the question of whether cis women might (realistically only sometimes/rarely) be AGP. Consider homosexuality in men as an example; it’s not “normal”, as it is selected against, but it wouldn’t be ridiculous to propose that exclusive androphilia is a thing that one can coherently talk about existing in women. So when I see a cis woman who writes something like:

Sometimes when I look in the mirror after shower, or when I have a good day, or I am just aroused – then I like to look at my naked body, my waist, breasts, just like this. And on top of that I especially love my hair, they are beautiful, brown, golden, auburn. And my beautiful blue eyes. Then I feel like a goddess. And it is arousing.

… then it’s not unreasonable to wonder if she might be autogynephilic, even if she doesn’t have to deal with the same downsides as men do.

That’s not to say that autogynephilia in women affects the calculus of normality for autogynephilia in men much. But that’s not why I’m looking into autogynephilia in women. (Admittedly this, or variations like normality for autogynephilia in trans women, is something that others are interested in studying AGP in cis women for. But Fleming needs to pay attention to what I am actually saying when he critiques me.)

Rather I see some alternative benefits:

  • The measurement of autogynephilia in men is pretty straight-forward. You can just ask them whether they are aroused by the thought of being women, or ask them whether they imagine being women in sexual fantasies, or lots of other options, and you get a relatively OK measure. (Not perfect, of course, but fine.) But it’s not obvious that these would work on fully-transitioned AGP trans women, due to lots of factors, one of which is that they might be confused about the question. But plausibly autogynephilia in fully-transitioned trans women would function similarly to autogynephilia in cis women, so understanding how it functions among them can help improve our understanding of it in trans women (who are harder to study for many reasons).
  • More generally, there are some ambiguities in how to define autogynephilia. Is any sexual fantasy in which one imagines being a woman autogynephilic? Presumably not, considering cis women. But then, where do we draw the line? Perhaps cis women could help us define this more precisely.
  • Many critics of Blanchardianism invoke autogynephilia in cis women as an argument. It can be hard/awkward to respond to this without having a good understanding of autogynephilia in cis women, so I need to research it.
  • Many Blanchardians are obviously biased against the idea. Fleming is biased, as seen in his video. Blanchard and Dreger are biased. Bailey is biased. Whenever I see bias against some idea, I start getting a tingling that maybe I should look more into that idea to learn more.

Just to clarify, I haven’t settled on a very specific opinion about AGP in cis women. This is what I wrote the last time I got asked about it, but my thoughts on it drift around a bit over time.

Other stuff

There are a couple of other minor points:

  • Fleming argues that autogynephilia is narcissistic, which seems unfounded. Fleming argues that autogynephiles have especially beautiful wives, which seems unfounded.
  • Fleming argues that autohomoeroticism is a social contagion, appealing to Blanchard. But Blanchard distinguishes between AHE and ROGD.
  • Blanchard himself acknowledges that there is at least one homosexual autogynephile, so Fleming’s argument that it is definitionally impossible doesn’t seem to be accepted by Blanchard.

Is autogynephilia real? The phenomenon, the construct, the theory

Autogynephilia is a sexual interest in being a woman. Some people argue that autogynephilia isn’t real, often citing Serano’s The Case Against Autogynephilia to support it. I’ve previously counter-argued that autogynephilia is real, but recently I’ve taken a deep dive into psychometric theory that makes me want to revisit the topic in a more nuanced way.

Critics of autogynephilia – or at least, the well-informed critics of autogynephilia – rarely argue that no males find sexual fantasies in which they are women arousing, that transvestic fetishism isn’t a thing, or things like that. These are sufficiently obvious that it would be crazy to deny them. Rather, they argue that these are not sufficient for autogynephilia to be real, but instead that the surrounding theory about autogynephilia is inaccurate, and this makes autogynephilia not real. For instance, quoting Serano:

As others have noted, conflation between the descriptive and theoretical definitions of autogynephilia has lead to a great deal of confusion in the literature on the subject (Wyndzen, 2005). For example, when an author describes an individual as an autogynephilic transsexual, are they simply stating the fact that the individual has experienced “autogynephilic” fantasies in the past? Or are they suggesting that the individual suffers from a paraphilia and became gender dysphoric as a result of such fantasies? To avoid this problem, throughout this article, I will use the term cross-gender arousal to describe sexual arousal that occurs in response to cross-dressing or imagining oneself being or becoming a member of the sex other than the one they were assigned at birth, and I will use the term autogynephilia exclusively to denote the paraphilic model that Blanchard and others have forwarded.

While nobody seriously doubts the existence of cross-gender arousal, there has been considerable debate about autogynephilia. The aspects of the theory that have garnered the most contention are its claims that (a) transsexual women come in two (and only two) subtypes – androphilic and autogynephilic and (b) the assumption of causation – that a “misdirected heterosexual impulse” causes cross-gender arousal, which then subsequently causes gender dysphoria and a desire to transition. While numerous critiques of the theory exist, proponents of autogynephilia have attempted to play down the significance of these critiques on the basis that they were not published in the peer-reviewed literature (Bailey & Triea, 2007; Lawrence, 2007). Here, drawing on these previous critiques, I argue that autogynephilia theory is clearly incorrect. I also discuss how the typology and terminology associated with the theory needlessly sexualizes MtF spectrum people and exacerbates the societal discrimination this group already faces.

On the one hand, I think saying “autogynephilia isn’t real” to attack ideas that are extremely peripheral to the concept of autogynephilia, such as whether there are exactly two types of transsexuals, is extremely misleading and confusing. This isn’t something I’m accusing Serano of doing in this quote, but this is something people who cite her often engage in, and they should stop that.

On the other hand, I can’t entirely reject the point that the validity of a concept inherently depends on the theory surrounding it. Probably the best exposition I’ve read on this is Scott Alexander’s review on Kuhn (probably Kuhn’s writings themselves, as well as later work based on them, are all great places to also read about this). Basically, when doing science, you typically work within a paradigm which dictates the concepts that are relevant to consider, the ways in which things generally work, and the methods of solving questions. There’s no guarantee that concepts that make sense in one paradigm make sense in other paradigms, and generally if someone is working from a different school of thought, they might interpret the same object-level observations completely and utterly differently.

So is that the end of the discussion? Either one accepts the entire Blanchardian paradigm, in which case autogynephilia is real, or one is totally justified in including the claim that autogynephilia isn’t real in one’s overall take on things? No incremental progress, just everything or nothing?

I think I have an alternative: Blanchardians usually cite autogynephilic phenomena to defend the concept of autogynephilia, while skeptics usually critique the broader theory of autogynephilia, but in between the two extreme ends, we can identify the construct of autogynephilia as covering a sexual interest that explains the phenomenon while being distinct from the theory. Let me explain:

The phenomenon

In my earlier post, I gave lots of examples of the phenomenon of autogynephilia. For instance, I referenced sexual AGP communities (NSFW, here, here, here), trans women reporting autogynephilic sexuality (here, here), autogynephilic men being gender dysphoric (here) and so on.

The phenomenon of autogynephilia is generally identified as being males who engage in sexual fantasies or behaviors based on imagining themselves as women or associating themselves with feminine things. Blanchard observed five kinds of autogynephilia:

  • Anatomic autogynephilia, sexual fantasies and play with having female anatomy.
  • Interpersonal autogynephilia, sexual fantasies and play involving being admired as a woman and having sex with men as a woman.
  • Transvestic autogynephilia, dressing as a woman in sexual contexts.
  • Behavioral autogynephilia, sexual fantasies and play involving behaving as a woman.
  • Physiological autogynephilia, sexual fantasies and play involving having female physiological functions, such as pregnancy, menstruation, or sitting while peeing.

Realistically, I don’t think these five kinds are a particularly accurate view into how autogynephilia-the-phenomenon works; out of the five top fantasies found in my qualitative survey, only one (“heterosexual sex”) is present in the standard measure of these five types of AGP. So most likely these five kinds of autogynephilia are kinda misleading when it comes to what the phenomenon of autogynephilia really is. But whatever, this isn’t the biggest problem, though it is to some degree a problem, and I will return to that later.

Anyway, the key pattern here is that this is all observational. Notice, for instance, that while at the beginning of the post I defined autogynephilia as a “sexual interest in being a woman”, when describing autogynephilia-the-phenomenon I merely say that it consists of “fantasies and play”, to avoid deeper theoretical conclusions. We observe certain sexual fantasies, we observe certain correlations, we observe lots of things. But “observing” things cannot be a real theory, as it doesn’t tell you how things work; it might be useful for prediction perhaps (“autogynephilia is a symptom of gender dysphoria”), but correlation is not causation. Many different causal theories can be proposed to understand this phenomenon of autogynephilia, and some of the popular causal theories end up treating it as pretty much irrelevant.

The construct

So if autogynephilia-the-phenomenon isn’t sufficient, then what is? The construct of autogynephilia is what makes it start seeming much more relevant, and it’s one of those “hidden assumptions” that might, at least for Blanchardians, be so obvious as to rarely get stated. But I think it’s worth making it explicit. It states that there is a trait, “autogynephilia”, which functions as follows:

  1. It can be understood as a common pathway which is the cause of autogynephilia-the-phenomenon.
  2. It represents a sexual interest, analogous to other sexual interests like gynephilia or fetishism.
  3. Implicitly, in order to represent a sexual interest, the concept of “sexual interest” itself must be coherent and cover the things we wish to place under it.

This is actually really vague. The reason that it is vague lies in point 3; it lacks the definition for what a “sexual interest” is. Here, it is common for Blanchardians to define it as an arousal pattern (see e.g. here for argument), and I agree (with some qualifications, e.g. presumably the arousal is moderated by libido, but the orientation is the effect that exists before the moderation by libido), but for this to make sense, it presupposes an entire theory of sexuality that needs to be explicated. I’d say it goes something like this:

The core of people’s sexuality consists of an “orientation”, which to each possible interest assigns some sort of sexual value. This orientation is reflected in people’s arousal, as they become aroused to the interest in proportion to the sexual value and their libido. Furthermore, also moderated by libido, their orientation creates a sexual motivation to seek out their interest, which affects desires/behaviors/compulsions, though these can also be affected by other factors, such as social norms and pragmatics. The motivation also affects sexual fantasies, but sexual fantasies are likely less influenced by social norms and pragmatics.

This is far from a perfect theory of sexuality. There are likely numerous sexual phenomena that it fails to address; the ways it currently addresses the phenomena are vague and not quantitatively precise; important parts of the theory are dubious (rather than orientation → motivation → fantasy, could one not imagine orientation → fantasy → motivation, where the fantasy functions as a sort of “conditioning”?); and all of the theory is unproven. I will return to that later, but for now, just consider it a hypothetical example of what a solution to (3) might look like.
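To show what making this hypothetical theory quantitatively precise could look like, here is a toy sketch. The functional forms are my assumptions purely for illustration: arousal and motivation are taken to be libido times sexual value, fantasy is taken to track motivation directly, and social norms are modelled as a simple subtraction from behavior. The interest names and numbers are placeholders.

```python
from dataclasses import dataclass, field

@dataclass
class Person:
    libido: float
    orientation: dict  # interest -> sexual value
    # Hypothetical: how strongly norms/pragmatics discourage acting on an interest.
    norm_penalty: dict = field(default_factory=dict)

    def arousal(self, interest):
        # Arousal in proportion to the sexual value and the libido.
        return self.libido * self.orientation.get(interest, 0.0)

    def motivation(self, interest):
        # Motivation is also moderated by libido (same form, by assumption).
        return self.libido * self.orientation.get(interest, 0.0)

    def fantasy(self, interest):
        # Assumed to follow motivation, largely untouched by social norms.
        return self.motivation(interest)

    def behavior(self, interest):
        # Assumed: norms and pragmatics subtract from motivation, floored at zero.
        penalty = self.norm_penalty.get(interest, 0.0)
        return max(0.0, self.motivation(interest) - penalty)

person = Person(
    libido=0.5,
    orientation={"interest_a": 0.8, "interest_b": 0.2},
    norm_penalty={"interest_a": 0.3},
)
for interest in person.orientation:
    print(interest, person.arousal(interest),
          person.fantasy(interest), person.behavior(interest))
```

Even a toy like this forces choices that the verbal theory leaves open, such as whether norms subtract from or scale motivation, which is part of why I call the theory vague.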

This leaves (1) and (2). The trouble with them is that they are both tightly coupled to causality, which makes them difficult to evaluate. I have some thoughts on how to show the causality, which I will return to later, but for now, one good starting point might be to evaluate their surface-level plausibility. For instance, Hsu found possibility (1) to be plausible, as when he collected items that covered different types of autogynephilia, he found them all to be correlated, as would be predicted if autogynephilia represents a common cause, and he found his items to be good at distinguishing between ordinary men and men who are involved in autogynephilic groups (in the sense of autogynephilia-the-phenomenon).
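The prediction that a common cause makes all the items correlate can be sketched with a small simulation. This is not Hsu’s data or method; the five item types, the loading of 0.7, and the sample size are assumptions chosen for illustration.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_items(n_people=5_000, loadings=(0.7, 0.7, 0.7, 0.7, 0.7), seed=0):
    """Each item = loading * latent trait + independent item-specific noise."""
    rng = random.Random(seed)
    items = [[] for _ in loadings]
    for _ in range(n_people):
        latent = rng.gauss(0, 1)  # the hypothesized common cause
        for scores, loading in zip(items, loadings):
            noise_sd = (1 - loading ** 2) ** 0.5  # keeps item variance at 1
            scores.append(loading * latent + noise_sd * rng.gauss(0, 1))
    return items

items = simulate_items()
correlations = [
    pearson(items[i], items[j])
    for i in range(len(items))
    for j in range(i + 1, len(items))
]
print(f"pairwise correlations range from {min(correlations):.2f} "
      f"to {max(correlations):.2f}")
```

Under these assumptions every pair of items comes out positively correlated. Note that the implication only runs one way: a common cause predicts the correlations, but observing the correlations does not by itself establish a common cause.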

But even if one demonstrates that (1) is not obviously wrong and pretends that “not obviously wrong” = strong support, that still leaves (2). To properly validate autogynephilia-the-construct, one would have to show that it represents a sexual interest, which involves first figuring out what “sexual interest” means (as in point (3)), and then show that this plausibly applies to autogynephilia (as in point (2)).

I’m not aware of any time this has been done. Anne Lawrence has sort of played around with it, for instance in Becoming What We Love, or in sections of Men Trapped in Men’s Bodies, but my impression is that her goal here was more to illustrate aspects of autogynephilia, rather than to evaluate whether autogynephilia constitutes a sexual interest.

There are parts of the critiques of autogynephilia that soooooort of go into this, e.g. Julia Serano argues that unlike other sexual interests, autogynephilia tends to go away over time, is present in cis women, tends to lead to emotional attachments and is preceded by ideation in childhood. Certainly if autogynephilia differs from most other sexual interests in all these ways, we should pause and reconsider whether it can really be lumped under the same concept, as (2) asserts. However, I think Serano’s claims are wrong and that they either represent things that she made up without a basis, ideas she got from others who made them up without a basis, or ideas that are based on low-quality observations.

(Serano furthermore distinguishes sexual interests into paraphilias and sexual orientations, which further complicates her argument. Outside of etiology, I doubt this distinction is particularly meaningful (and I’m not even sure it’s meaningful when it comes to etiology), so I don’t think this is super relevant to bring up. However, if you want to know which things exactly she associates with paraphilias vs sexual orientations, you should read her paper.)

Anyway, one could argue that autogynephilia-the-construct as usually presented includes two additional assertions:

  4. Autogynephilia is influenced by gynephilia, being caused by an “inversion” of one’s gynephilia, a process called “erotic target identity inversion” (ETII), and is only one out of an entire family of erotic target identity inversions.
  5. There are no other causes of autogynephilia than ETII.

There’s a big critique that can be made of (5), but there’s also something that can be said about (4). So let’s start with (5) and continue to (4) afterwards.

There are three statements that appear to be in conflict. The first is (5), that autogynephilia is always an inversion of gynephilia. The second is (1), that autogynephilia-the-construct is the cause of autogynephilia-the-phenomenon. The third is the observation that some exclusively androphilic and asexual individuals exhibit autogynephilia-the-phenomenon.

The classical solution to these conflicts is developmental competition and meta-attraction (some autogynephiles have in a sense inverted their heterosexuality; they are attracted to being with men, but only as a woman). This is fine as far as it goes, but meta-attraction cannot account for all autogynephiles’ interest in men, and at this point I’m pretty convinced that at least some exclusively androphilic males exhibit autogynephilia-the-phenomenon. (I am not sure about the situation for asexuals/anallosexuals.)

There are two possible solutions to this. One is to drop (5), and say that there exist other forms of autogynephilia, perhaps one associated with androphilia. The other is to drop (1), and say that there are things other than true autogynephilia which can cause phenomena similar to autogynephilia-the-phenomenon. As androphilic AGPs appear to have a similar degree of gender issues to gynephilic AGPs, and as their sexual fantasies look relatively similar (but not entirely identical), I favor dropping (5). But one could also drop (1) instead, and replace it by the assertion that true autogynephilia is only one of the possible causes of autogynephilia-the-phenomenon. These options lead to two genuinely distinct constructs. Dropping (1) leads to what I call “narrowsense autogynephilia”, as it (hopefully! assuming the construct of autogynephilia/ETII is valid) refers to the narrowest, classical notion of autogynephilia, representing an inversion of gynephilic attraction. Meanwhile, dropping (5) leads to what I call “broadsense autogynephilia”, as it refers to autogynephilic phenomena in the broadest sense that is coherent. I expect it to be possible to relate both kinds of autogynephilia in a single model, e.g.:

r/Blanchardianism - Multi-type AGP hypothesis
Diagram relating different kinds of autogynephilia. “AHE” and “meta” here refer to sex as a woman with women and men respectively.

Now, this next part might be considered a bit pedantic, so feel free to skip to the next section as it’s not super important for the overall point of the blog post. But basically, one thing to consider is that (4) is underspecified. This comes into play when one considers other sexual interests that are also proposed to be due to this inversion, such as furries, apotemnophiles, and so on. Namely, these other interests raise the question, is the process that moderates the gynephilia → autogynephilia effect, the same as the process that moderates the other effects? Something akin to this?

Under this hypothesis, there is a variable, “erotic target identity inversions”, which moderates the conversion of any sexual interest (gynephilia, attraction to women; acrotomophilia, attraction to amputees) into an autosexual form.

Or is it more a parallel phenomenon, where each inversion plays out independently across different individuals?

Under this hypothesis, any sexual interest contributes to its autosexual inversion, but the contribution is independent across the interests.

These are genuinely distinct hypotheses, and they have distinct interpretations and predictions. I think Blanchardians tend to favor the first one; for instance, in their autopedophilia study, Hsu and Bailey found that among pedophiles, autopedophilia correlated strongly with autogynephilia, which is the sort of thing you’d expect under the former but not the latter hypothesis. However, at least to a degree, this observation could also be explained by the presence of a “general factor of paraphilia”, which appears to exist too. In my attempt to test it, I did not find autogynephiles to have a greater correspondence between the traits they would like to have and the traits that they are attracted to than others, which appears to be in contradiction with the former hypothesis and in support of the latter one.

The theory

And then there’s the theory; while the construct represents the causal claims relevant to defining the construct of autogynephilia, the theory represents all other causal claims related to autogynephilia. The theory implicitly relies on the construct, as one cannot talk about the relationship between autogynephilia and other things if autogynephilia isn’t real, but the construct doesn’t rely on the theory in the same way.

It can be hard to define where the construct ends and the theory starts; for instance, if sexual interests are proposed to be motivating, then that makes autogynephilia’s effect on gender issues part of the construct rather than the theory. There are some definite distinctions; e.g. the claim that trans women are always either autogynephilic or androphilic and never both would be theory and not construct.

(Uhh… except that the main empirical validation of ETII as a concept comes from observing that androphilic trans women are less likely to be autogynephilic. But this is less about ambiguity and more about the validation of the theory being a bit of a garbage fire.)

But for some things, like AGP being motivating, to some degree it might be more a difference of perspective. For instance, part of the construct definition of AGP is that it works like other sexual interests. Thus, if one keeps a focus on these, e.g. by starting with an idea of how sexuality works, and then testing whether this applies to AGP, that could be considered construct validation. However, if one has something one has observed in AGPs, and one isn’t making any claim of it applying to other sexual interests, then this represents AGP theory. More generally, autogynephilia-the-theory takes autogynephilia-the-construct for granted, and seeks to understand how this relates to other things of interest, whereas autogynephilia-the-construct primarily studies the internal workings of autogynephilia.

Does clinical experience constitute construct validation?

Autogynephilia-the-construct has been “validated” in the sense that there are a number of case studies, clinical experiences, and so on that line up with the idea. I’m not really satisfied with that, though.

As far as I can tell, this sort of clinical experience isn’t particularly reliable. It’s easy to come up with several examples of this. The assumption, now known to be false, that autogynephiles cannot be androphilic is a clear example. Despite the lack of a sound theoretical basis, it has also been asserted that autogynephilic transsexuals have no degree of femininity, which has never been demonstrated and appears contradicted by studies (e.g. this). And of course, as I mentioned earlier, the forms of autogynephilia that have been identified by Blanchard do not well represent the forms that autogynephilia actually takes on in typical fantasies.

One can also consider systematic biases. These sorts of data will tend to skew towards particularly severe, ego-dystonic or otherwise unusual (e.g. in this case, transsexual) cases. These cases are unlikely to give an accurate view of how autogynephilia presents in general. (I constantly have to remind people that the general pattern when studying autogynephiles in a less filtered way is “autogynephiles appear remarkably normal”.) It will also tend to lead to a bias towards more-persistent cases, which if we understand persistence to be part of the construct of sexual interests creates a potential bias.

Clinical experience certainly has a lot of advantages too, though. It allows going into much deeper detail than would be viable in quick standardized surveys, it is generally longitudinal, and it allows some degree of nonsystematic experimentation which may give hints to causality. (Though the previously mentioned biases, as well as factors like regression to the mean, make this kind of experimentation very dangerous.)

Overall I appreciate the value of clinical experience as a source of anecdotes that can be used to guide theory formation, but I can’t see it as a viable reason to consider theories to be confirmed.

The validation crisis in psychology

I believe debates about whether autogynephilia is real should center on whether autogynephilia-the-construct is real. To understand why, suppose autogynephilia-the-construct isn’t valid. This would mean that apparently autogynephilic phenomena are not motivated by an underlying sexual interest. That would certainly be very strange and hard to imagine, but that is precisely why it is so important; the world must function in a very different way than I imagine if autogynephilia is not a valid construct, and so that implies that I must rethink my beliefs a lot in such a case.

It would probably be helpful to consider various alternative proposals to autogynephilia-the-construct:

  • On page 173 of his book, The Man Who Would Be Queen, Michael Bailey quotes a transvestic fetishist who asserts that the arousal to autogynephilia-the-phenomenon is simply normal gynephilic sexuality, while the motivation to engage in it has nothing to do with sexuality, but is instead simply to “feel feminine”.
  • Serano argues that “autogynephilia” ends up covering multiple distinct phenomena, including reactions to dysphoria and societal sexualization of women, or that they are into it for the sake of novelty. These proposals are very different from how I understand sexual interests to work, and so it would be fair to say that under these sorts of hypotheses, autogynephilia would not be a valid construct (or at least, need not be a valid construct).

(It is worth noting that certain critiques of autogynephilia theory rely on autogynephilia-the-construct being valid. For instance, Nuttbrock and Veale have argued that it arises through the process of “Exotic Becomes Erotic” hypothesized by Bem; but this process is intended to explain how sexual interests develop, and so it is only valid to apply if one understands autogynephilia to be a sexual interest.)

If autogynephilia is not a sexual interest, but instead takes the form of something like the above, then certainly a lot of the Blanchardian theories surrounding autogynephilia look quite silly and often nonsensical. Given what I know, I don’t think this is particularly likely, and instead think that autogynephilia is rather obviously a valid construct.


Here’s the problem: Autogynephilia as a construct has not been validated. Oh sure, symbolic steps towards validation have been performed, such as demonstrating that measures of autogynephilia can discriminate between certain groups. But nobody has taken a powerful construct of sexual interests, and systematically checked that autogynephilia fits under this. As far as I know, there hasn’t even been developed a strong theory of sexual interests that it can be validated against.

Unvalidated, weak theories are hardly unique to Blanchardianism; it’d probably be harder to think of constructs and theories that are validated than ones that are not. Certainly the critics of autogynephilia do not fare any better; concepts like “gender identity”/“gender dysphoria”, “exotic becomes erotic”, “sexualization” and so on are all underspecified and have only symbolic gestures towards validation. But this isn’t limited to gender topics. Things like the “Big Five” personality traits would probably not fare much better here. It’s not even a problem with psychology research; concepts used by people in everyday life, such as impulsivity, are not validated, and I believe that they are often invalid (this has rarely been demonstrated, but for e.g. impulsivity it appears to be the case). Essentially, this is the state of psychological research:

One “solution” would be to simply disregard psychology as a domain. I don’t think that’s viable, because I want to understand how humans work, and I also don’t think anyone is actually going to do this; instead they are just going to rely on theories that are culturally popular, which is hardly an improvement.

Another solution would be to just keep doing what we are already doing. Combine weak theories, anecdotes, and stray thoughts into vague “predictions”, confirm these predictions and keep on piling up ever bigger questionable theories. Use these vague studies to argue with other people who use other vague studies to contradict you. Keep claiming that your unvalidated constructs are better than their unvalidated constructs. Not particularly appealing, I’d say. It’s probably fine to do this as exploratory research until a clearly defined framework has been established, but one should actually make progress towards having a solid framework over time.

Now, I’m absolutely guilty of this too. My only excuse is that I was just acting like everyone else, but that excuse is hardly enough to justify continuing to act like this, so what am I going to do? I’m going to put much more attention to fundamentals related to validity. New goal: validate that autogynephilia is a sexual interest. And more generally, outside of exploratory research, I’m going to pay much more attention to construct validity.

I’ve talked with other researchers (who are more tightly linked to the scientific establishment) about validity, and they seem to see the concerns too. However, they are more tied up in getting funding and publishing papers, so they tend to address more-immediate questions, rather than focusing on the deeper theory. Which, to be fair, can still be useful, but it seems to me that much more value can be achieved by putting things on solid ground. Since I’m not really addressing the importance of funding or the value/harm of publishing invalid research, this proooobably won’t convince them to change up their method of work. But since I am primarily concerned with understanding how these things work, I’m definitely going to change my method.

The way forward

I used to think proper construct validation was nearly impossible, and so disregarded the question. After all, requiring impossible things would demand that I remain entirely agnostic on how psychology works, which is hardly a workable position. Painfully slow and vague progress in invalid psychology is superior to this.

However, more recently, I’ve been realizing that a large part of it is simply due to the theories being too vague to be meaningful. Obviously you can’t test your theory if you don’t make clear predictions. (Or well, often you can disprove the theory because even vague predictions are precise enough to sometimes be proven wrong.)

In particular, for the case of autogynephilia, the claim is that it is a sexual interest. OK, how do we test that? First, specify what sexual interests are. We’re presumably going to want it to cover factors like arousal, behavior, desire, sexual fantasy, compulsions, proto-sexual childhood ideation, and porn use. Probably more. So build a theory of sexual interests that covers this, by studying sexual interests in general. And then verify that this applies to autogynephilia.

It’s particularly easy when it comes to sexual interests because they are proposed to be a system of several parallel phenomena that all function analogously. I’ve previously attempted to explain how that helps, though that explanation wasn’t very clear, but roughly speaking it goes as follows: It’s easy to come up with many incompatible causal theories that fit any one of them, but it’s unlikely that anything other than reality will fit all of them. Thus, studying sexual interests in general will give us a much better understanding of the causality involved than studying them in isolation will. (Assuming “sexual interest” is a valid construct, but then, if it isn’t then neither is autogynephilia.)

Anyone can “predict” that sexual arousal to the fantasy of being a woman is going to be correlated with wanting to be a woman. This is compatible with autogynephilia-the-construct, but it’s also compatible with there being content overlap (being a woman) between the two items, or with reverse causality, or with monomethod bias, or with all the various other theories that have been proposed. (Confounding can also lead to correlation, but without more specific explanations like content overlap, it doesn’t predict correlation.) That things are positively correlated with each other is called a positive manifold, and this is predicted by lots of theories people come up with. (Is this a bias in human theory development?) It’s much harder to predict the quantitative strength of this correlation, and few theories would be able to consistently predict this across many variables.

This relies intrinsically on having a strong theory that makes not just qualitative pseudopredictions about the signs of correlations, but instead makes quantitative statements about the exact strengths of causal effects, from which correlations can be derived. Of course, such a theory must to a degree be fit to data to know the coefficients involved; but by being a more universal theory, we can test it on different interests than it is fit to.

(This helps with inferring causality from correlation. However, ideally this should be supplemented by other causal information. Unfortunately, it is super rare that this becomes viable, as causal inference is hard.)

All of this mainly focuses on the validity of structural relationships in the theory. However, this isn’t the only form of validity that is relevant; also relevant is external validity. So far my plan is to assess these different concepts using self-reported surveys, because this is super cheap. However, that kind of data is of very low quality, which may turn out to be an obstacle both for theory inference, and for validity of the inferred theories. I don’t have a great solution for this yet, though I am on the lookout for ones.

In the hope that it can help convince people of the relevance of performing this sort of research, it’s also worth emphasizing what this can give us: directly constructing and validating a model of sexual interests gives a straightforward research program, which if it succeeds will provide powerful arguments on a variety of questions. Pretty much any theory that can be formulated across sexual interests (“porn exposure causes paraphilias”, anyone?) can likely be examined using these methods. And that’s a lot of theories. Perhaps particularly relevant for autogynephilia is the debate about direction of causality; since presumably sexual interests motivate behavior, it’s a reasonable argument for autogynephilia causing gender issues if autogynephilia is a valid construct.


For evaluating whether autogynephilia is real, we should consider whether there is a sexual interest in being a woman that accounts for apparently-autogynephilic phenomena. So far, it has not been demonstrated that a sexual interest in being a woman exists, and so the claim that autogynephilia is real is hardly on solid empirical ground.

This is part of a broad tendency of psychological theories (whether formal and scientific, or informal) to deal with ill-defined and unvalidated concepts. Thus, those that critique autogynephilia for being unproven are not wrong due to their critique being wrong, but instead wrong to raise isolated demands for rigor on just autogynephilia and not everything else too (including on many of the very concepts used to critique autogynephilia).

I personally still believe autogynephilia is real, but I also have some work before me to actually demonstrate that.

Universal laws are causal inference

Edit: It has come to my attention that I did a terrible job of explaining this. I think it’s very important, but the explanation needs to be improved.

OK, the title might be a bit of an exaggeration, but it’s an effective way of summarizing an amazing piece of insight I’ve been thinking about lately.

Suppose you have a causal system. For simplicity, we’ll say that it contains two variables, A and B. Being a causal system means that one of the variables might affect the other. But how can we tell, from pure observation, which variable is the cause, and which is the effect? That should be impossible, right?

For instance, if A and B each have a variance of 1, and their correlation is 0.5, then that can either be due to the rule:

A ~ N(0, 1)
B ~ 0.5 * A + N(0, 0.75)

(which is to say, where B is determined by a combination of A and random noise; noise is denoted by N(µ, σ²), which refers to the normal distribution with mean µ and variance σ²)

Or it can be due to the rule:

B ~ N(0, 1)
A ~ 0.5 * B + N(0, 0.75)

(where A is determined by a combination of B and random noise…)

These rules give the same observational data, yet are literally opposites. Which poses a problem for causal inference. There are methods of doing causal inference anyway, such as experiments, instrumental variables, and theory, but these are all far too expensive or difficult in many cases. Is there an easier way?
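The observational equivalence of the two rules can be checked with a quick simulation. This is only a sketch: the function name `simulate`, the sample size, and the seeds are illustrative choices, not anything from the original analysis. Note that NumPy's `normal` takes a standard deviation, whereas the N(µ, σ²) notation above uses a variance, hence the square root.

```python
import numpy as np

def simulate(direction, n=200_000, seed=0):
    """Draw n samples of (A, B) from one of the two rules above."""
    rng = np.random.default_rng(seed)
    cause = rng.normal(0.0, 1.0, n)              # cause ~ N(0, 1)
    noise = rng.normal(0.0, np.sqrt(0.75), n)    # noise with variance 0.75
    effect = 0.5 * cause + noise
    if direction == "A->B":
        return cause, effect                     # A is the cause
    return effect, cause                         # B is the cause; still return (A, B)

for direction in ("A->B", "B->A"):
    a, b = simulate(direction)
    # Both directions should give variances near 1 and covariance near 0.5.
    print(direction)
    print(np.round(np.cov(a, b), 2))
```

Up to sampling noise, both directions print the same covariance matrix, which is why a single observational dataset cannot tell them apart.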

Detour: Some coefficients are unstable across contexts

One of the main cases where you will see this discussed is in genetics. Within genetics, one has what is known as the heritability coefficient, h², which is generally understood as a quantity that describes how much genetic influence there is on a trait. It is defined to be the fraction of variance caused by genetics.

But by talking about “fraction of variance”, nongenetic factors that increase variance will decrease the heritability. For instance, if you have a number of plants, and you place some of the plants in good conditions and some of the plants in bad conditions, then the growth of the plants will be less heritable than if they were all placed in the same conditions, as there is now extra variance due to the environmental condition. If, on the other hand, the heritability had been unstandardized, if one had talked about just the variance in growth, rather than the fraction of variance in growth, then the condition might not reduce the heritability.

(… or it might. If there are gene-environment interactions or other nonlinearities, as there likely is, then it could very well also affect the heritability. But we will ignore nonlinearities here.)
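A small worked version of the plant example, under purely additive assumptions. The symbols V_G and V_E (genetic and environmental variance) and all the numbers are made up for illustration; they are not from any dataset.

```latex
\begin{align*}
h^2 &= \frac{V_G}{V_G + V_E} \\
\text{same conditions: } V_G = 1,\ V_E = 1 \quad &\Rightarrow\quad h^2 = \frac{1}{1 + 1} = 0.5 \\
\text{mixed conditions: } V_G = 1,\ V_E = 1 + 2 = 3 \quad &\Rightarrow\quad h^2 = \frac{1}{1 + 3} = 0.25
\end{align*}
```

The unstandardized genetic variance V_G is 1 in both cases; only the fraction changes when the extra environmental variance is added.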

So one way that standardizing makes coefficients unstable is that it allows downstream conditions to affect the coefficient. To tie this into our previous examples with A and B, even if A causes B at a consistent coefficient of 0.5, the correlation between them is going to vary depending on the noise of B. In the previous example, the causal coefficient matched the correlational coefficient, but if B’s noise variance had been 0, the correlation coefficient would be 1, while if B’s noise variance had been 1, the correlation coefficient would be about 0.45.
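The arithmetic behind these numbers can be written out as follows. The symbols β (the causal coefficient), σ_A² (the variance of A) and σ_e² (the noise variance of B) are notation introduced here for illustration, assuming the linear rule B = βA + noise from before.

```latex
\begin{align*}
\rho_{AB} &= \frac{\operatorname{Cov}(A, B)}{\sqrt{\operatorname{Var}(A)\operatorname{Var}(B)}}
           = \frac{\beta \sigma_A^2}{\sqrt{\sigma_A^2 \left(\beta^2 \sigma_A^2 + \sigma_e^2\right)}} \\
\beta = 0.5,\ \sigma_A^2 = 1,\ \sigma_e^2 = 0.75: \quad \rho_{AB} &= \frac{0.5}{\sqrt{0.25 + 0.75}} = 0.5 \\
\sigma_e^2 = 0: \quad \rho_{AB} &= \frac{0.5}{\sqrt{0.25}} = 1 \\
\sigma_e^2 = 1: \quad \rho_{AB} &= \frac{0.5}{\sqrt{1.25}} \approx 0.447
\end{align*}
```

The causal coefficient β is 0.5 in all three cases; only the standardized quantity moves.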

Another way that standardizing makes coefficients unstable is that it introduces a dependence on the upstream conditions. For instance, genetic variance can be lowered in cases of inbreeding, population bottlenecks, avoiding assortative mating, and more. Or in terms of the A/B example previously, if A has lower variance, then the correlation will be lower.

The core asymmetry

Notice an important thing in the above: if A causes B, then variance in A will increase the correlation, while variance in B will decrease the correlation. That’s an asymmetry between A and B! Exactly what is needed for causal inference.
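A minimal numeric sketch of this asymmetry, assuming the linear A → B rule with coefficient 0.5. The function name and the particular variances are illustrative assumptions.

```python
import math

def corr_a_causes_b(coef, var_a, noise_var_b):
    """Correlation of A and B when B = coef * A + e_B (A is the cause)."""
    var_b = coef**2 * var_a + noise_var_b
    cov_ab = coef * var_a
    return cov_ab / math.sqrt(var_a * var_b)

base = corr_a_causes_b(0.5, var_a=1.0, noise_var_b=0.75)    # the original example
more_a = corr_a_causes_b(0.5, var_a=2.0, noise_var_b=0.75)  # extra variance in the cause
more_b = corr_a_causes_b(0.5, var_a=1.0, noise_var_b=1.5)   # extra noise in the effect

print(f"baseline:        {base:.3f}")
print(f"more var in A:   {more_a:.3f}")   # higher than baseline
print(f"more noise in B: {more_b:.3f}")   # lower than baseline
```

Doubling the variance moves the correlation in opposite directions depending on whether it is added on the cause side or the effect side.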

Just to hammer it home, here are covariance matrices for the two causal relations, and two different sizes of variance for A and B each.

Top: Structural equation model diagram which shows the relationships between the variables. eA and eB are the noise terms, with the noise variance being represented by a and b. Bottom: The covariance matrices implied by different values of a and b.

Despite the causal effect being the same in each of the cases, the covariance matrix ends up differing due to the different variances that are introduced. And because of the asymmetry between A and B over the covariance matrices, only this linear causal relationship and not the one in the opposite direction fits the data.
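The implied covariance matrices can be computed directly. In this sketch the coefficient is fixed at 0.5, and the noise variances a and b are picked arbitrarily; they are not the values used in the original figure. The helper `fits` checks whether a fixed-coefficient linear model in a given direction could reproduce a given matrix.

```python
import numpy as np

COEF = 0.5  # causal coefficient, held fixed across all cases

def implied_cov(direction, a, b):
    """Covariance matrix of (A, B); a and b are the noise variances of e_A and e_B."""
    if direction == "A->B":      # A = e_A, B = COEF * A + e_B
        var_a = a
        var_b = COEF**2 * a + b
        cov = COEF * a
    else:                        # B = e_B, A = COEF * B + e_A
        var_b = b
        var_a = COEF**2 * b + a
        cov = COEF * b
    return np.array([[var_a, cov], [cov, var_b]])

def fits(direction, cov_matrix):
    """True if the model with this direction and COEF can reproduce cov_matrix."""
    var_cause = cov_matrix[0, 0] if direction == "A->B" else cov_matrix[1, 1]
    # The covariance must equal COEF times the variance of the cause.
    return bool(np.isclose(cov_matrix[0, 1], COEF * var_cause))

for a, b in [(1.0, 0.75), (4.0, 0.75), (1.0, 3.0)]:
    m = implied_cov("A->B", a, b)
    print(f"a={a}, b={b}: fits A->B: {fits('A->B', m)}, fits B->A: {fits('B->A', m)}")
```

With a=1 and b=0.75 both directions fit, which is the ambiguous case from earlier; once the noise variances change, only the A->B model still fits.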

Or to illustrate it in another way, I can generate datasets for each causal direction, for differing variances:

Each circle represents an N=infinity dataset generated by the previously described causal models, with blue dots generated by the A->B model and orange dots generated by the B->A model. The Y coordinate shows the correlation between the two variables in the dataset. The X coordinate shows the relative amount of variance in A and B.

As you can see, while different causal models can overlap observationally, they trace out different curves of possible observational data in the space of covariance structures.
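The two curves can be reproduced without sampling, since for an infinite-sample dataset the correlation follows directly from the variances. This sketch assumes the X coordinate is the ratio Var(A)/Var(B), which may not be exactly how the original plot was scaled, and the grid of variances is arbitrary.

```python
import math

COEF = 0.5  # causal coefficient used in the rules above

def point(direction, var_cause, noise_var_effect):
    """Return (Var(A)/Var(B), corr(A, B)) for an infinite-sample dataset."""
    var_effect = COEF**2 * var_cause + noise_var_effect
    corr = COEF * var_cause / math.sqrt(var_cause * var_effect)
    if direction == "A->B":
        ratio = var_cause / var_effect   # A is the cause
    else:
        ratio = var_effect / var_cause   # B is the cause, so A is the effect
    return ratio, corr

for direction in ("A->B", "B->A"):
    for var_cause in (0.5, 1.0, 2.0):
        for noise_var_effect in (0.25, 0.75, 3.0):
            ratio, corr = point(direction, var_cause, noise_var_effect)
            print(f"{direction}  Var(A)/Var(B)={ratio:.2f}  corr={corr:.2f}")
```

Under these assumptions the A->B points lie on corr = 0.5 · sqrt(Var(A)/Var(B)) and the B->A points on corr = 0.5 / sqrt(Var(A)/Var(B)), so the two curves only meet where the variances are equal.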

Automagic causal inference

Now this is all well and good, but in reality any dataset is only going to have one noise variance for each of A and B, so how is this useful? This is where the “universal laws” part of the title comes in: if one can make one’s theory describe multiple distinct situations, then one could embed the variables A=A'(x), B=B'(x) into a larger family of situations, and require the same theory to apply for each x.

In that case, simply by successfully fitting the theory, as long as the situations are sufficiently distinct, you have much greater confidence in causal validity than you would in a standard case where you are considering only one situation. (It is necessary for the situations to be sufficiently distinct, as otherwise it might fit to all of them through sheer luck.) This is because it’s easy for a wrong theory to accidentally fit a single situation, but hard for it to fit multiple situations.
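As a rough sketch of what “requiring the same theory to apply for each x” could look like, the simulation below generates several situations that share one causal coefficient (A → B at 0.5) but differ in their variances, and then estimates the regression slope in both directions. The situations, sample size, and function names are all illustrative assumptions, and the setup assumes linearity and no confounding or measurement error.

```python
import numpy as np

def simulate_situation(rng, var_a, noise_var_b, coef=0.5, n=100_000):
    """One situation x: A has variance var_a, and B = coef * A + noise."""
    a = rng.normal(0.0, np.sqrt(var_a), n)
    b = coef * a + rng.normal(0.0, np.sqrt(noise_var_b), n)
    return a, b

def slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]

rng = np.random.default_rng(0)
# (variance of A, noise variance of B) for each hypothetical situation
situations = [(1.0, 0.75), (4.0, 0.75), (1.0, 3.0), (0.25, 2.0)]

forward, backward = [], []
for var_a, noise_var_b in situations:
    a, b = simulate_situation(rng, var_a, noise_var_b)
    forward.append(slope(a, b))    # slope of B on A, the true causal direction
    backward.append(slope(b, a))   # slope of A on B, the reversed direction

print("B on A slopes:", np.round(forward, 2))
print("A on B slopes:", np.round(backward, 2))
```

The B-on-A slope stays near 0.5 in every situation, while the A-on-B slope changes from one situation to the next, so only the A → B theory can be written with a single coefficient that holds for each x.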

To give some examples of how that might work:

  • You might wonder if people support some specific political policy because they believe it is beneficial to them. In that case, you could generalize and look at policies in general, examining whether there is support for the general theory that people support policies if they believe they benefit them.
  • You might wonder how people answer a personality item, what influence factors like desirability, memories, actual applicability, etc., have on their response. In that case, you can consider the general theory of how people answer personality items.
  • You might wonder what factors go into creating some kink. Is it porn depicting the kink, traumas surrounding the kink, taboo, etc.? You might also wonder how the kink influences behavior, and in particular whether there is some mediation going on, e.g. does fantasizing increase the likelihood of acting on it? In that case, rather than considering the specific kink, one could consider a general theory of kinks, as this then allows performing causal inference over these questions.
  • And particularly relevant for this blog, you might wonder what makes some people wish to be the opposite sex and what makes some people happy with their sex. And again, here one could embed it into a general theory about how people’s desires to be something specific works.

These are just some beginning examples I’ve thought of, because they are relevant to the topics I’m researching. I would not be surprised if there are analogous examples for other topics, considering how there are so many examples everywhere I look.

(Uhm, though there is one major complication: All of the examples I gave are in the domain of psychology, where measurement error is rampant and correlated, data is ordinal rather than interval/ratio, constructs are dubious and generalizability is unlikely. So it’s pretty relevant to look into how big of a problem these things will be; this is something I’m currently examining in simulations, and I will look at it more in the future.)

What’s interesting to me is that compared to all the other methods of causal inference, this method seems extremely… easy? You don’t need to have a good instrument, you don’t need to carefully look at conditional independences, you just need to look at generalities. And considering how important theory is to do anything practical, you need to look at generalities anyway, so this isn’t necessarily a big restriction. So I feel this is likely a method I will look into more to better understand.

Contra Blanchard and Dreger on Autogynephilia in Cis Women

Some argue that it is not just males who can be autogynephilic, but that cis women are autogynephilic too. In an interview, Blanchard countered:

My own arguments against the claim that autogynephilia frequently occurs in natal females were more general and not directed at Moser’s survey. I wrote, for example, that the notion that typical natal females are erotically aroused by, and sometimes even masturbate to, the thought or image of themselves as women might seem feasible if one considers only conventional, generic fantasies of being a beautiful, alluring woman in the act of attracting a handsome, desirable man (or woman). It seems a lot less feasible when one considers the various other ways in which some autogynephilic men symbolize themselves as women in their masturbation fantasies. Examples I have collected include: sexual fantasies of menstruation and masturbatory rituals that simulate menstruation; giving oneself an enema, while imagining the anus is a vagina and the enema is a vaginal douche; helping the maid clean the house; sitting in a girls’ class at school; knitting in the company of other women; and riding a girls’ bicycle. These examples argue that autogynephilic sexual fantasies have a fetishistic flavor that makes them qualitatively different from any superficially similar ideation in natal females.

(Emphasis mine.)

A similar argument was proposed by Dreger:

I’ve talked with Blanchard, Bailey, and also Anne Lawrence about this, and my impression is they all doubt cis (non-transgender) women experience sexual arousal at the thought of themselves as women. Clinically, Blanchard observed autogynephilic natal male individuals who were aroused, for example, at the ideas of using a tampon for menses or knitting as a woman with other women. I have never heard a natal woman express sexual arousal at such ideas. I’ve never heard of a natal woman masturbating to such thoughts.

One might think that before making this argument, Blanchard would’ve tested the relative frequencies of sexual interest in menstruating in autogynephilic males vs females in general, but he didn’t.

At some point I realized, hey, this idea is totally unfounded and probably wrong, so I should test it so we can stop running in circles. Here are my results:


Bar charts from my porn survey on autogynephilia. Each row represents a different operationalization of autogynephilia. Each column represents a different group that was studied. I will focus on the third row and the second, third and fourth columns for this post. Participants were asked to answer “How arousing would you find the following…?” for a large number of sexual interests, relatively uniformly shuffled together.

I defined autogynephilic cis men as participants who said that they were men, not transgender, and endorsed “A little” or more arousal to “Imagining being the opposite sex”. I defined non-gynephilic cis women as participants who said that they were women, not transgender, and “A little” or less attracted to women, while gynephilic cis women were defined as having “Moderate” or more attraction to women.
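The group definitions above can be sketched in code. This is only an illustration under stated assumptions: the full response scale is not given in this post (only “Not at all”, “A little” and “Moderate” are named), so the remaining scale labels, the function names and the argument names below are hypothetical and not the survey’s actual coding.

```python
from typing import Optional

# Hypothetical ordered response scale. Only "Not at all", "A little" and
# "Moderate" appear in the post; the last two labels are assumed.
SCALE = ["Not at all", "A little", "Moderate", "Quite a bit", "Very"]


def at_least(answer: str, threshold: str) -> bool:
    """True if `answer` is at or above `threshold` on the ordered scale."""
    return SCALE.index(answer) >= SCALE.index(threshold)


def classify(gender: str, transgender: bool,
             opposite_sex_arousal: str,
             attraction_to_women: str) -> Optional[str]:
    """Assign a participant to one of the three groups, or None if excluded."""
    if transgender:
        return None  # only cis participants are grouped here
    if gender == "man":
        # Autogynephilic cis men: "A little" or more arousal to
        # "Imagining being the opposite sex".
        if at_least(opposite_sex_arousal, "A little"):
            return "autogynephilic cis man"
        return None
    if gender == "woman":
        # Gynephilic: "Moderate" or more attraction to women;
        # non-gynephilic: "A little" or less.
        if at_least(attraction_to_women, "Moderate"):
            return "gynephilic cis woman"
        return "non-gynephilic cis woman"
    return None


print(classify("man", False, "A little", "Moderate"))
print(classify("woman", False, "Not at all", "A little"))
```

Note that under these definitions the women’s groups are split by attraction to women only, while the men’s group is selected on the autogynephilia item itself.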

As can be seen in the diagram, both gynephilic and non-gynephilic cis women endorsed more arousal to “Yourself menstruation (if you are male, imagining that you were able to menstruate and menstruating)” than autogynephilic men did.

Endorsement from all the groups on this item was extremely rare. This raises the question of how relevant Blanchard’s argument is in the first place, as it attempts to reason about the nature of autogynephilia using an extremely rare manifestation of autogynephilia. But regardless, Blanchard’s argument was not supported.



My survey was very nonrepresentative. I posted it on /r/SampleSize, which is known to have much higher rates of autogynephilia in males than the general population. How this generalizes to female participants is unclear, but it’s probably a good guess that the rates of endorsement are elevated for them too. (Furthermore, one can raise some questions about the validity of the items used.)

This implies that my survey doesn’t really show the real rates in cis women, and so still leaves the problem that we don’t know how high the rates are. The solution to this problem is that Blanchardians should stop making up unfounded arguments that cis women are not autogynephilic. Instead, they should either stop arguing about it, or do what the people who argue that cis women are autogynephilic do and study it directly. (See 1, 2, 3, 4, and 5.)


I’ve gone through different takes on whether cis women are autogynephilic, ranging all the way from “yes” to “no”. My current take is agnosticism. Is that agnosticism really justified? Shouldn’t the answer be, “no, obviously”?

I notice several deep… “anomalies”, with the claim that autogynephilia is rare in cis women:

  • Ray Blanchard and Alice Dreger use very strange and contorted arguments to argue for it, even though they should know better.
  • Homeovestism, or something very much like it, appears to be common in women.
  • When using scales similar to what Lawrence suggested for assessing autogynephilia in women, one can get exceedingly high endorsement rates.
  • A number of people have claimed publicly that autogynephilia is common in cis women to audiences that contain large numbers of women, without any pushback. For instance, Scott Alexander’s post even gave an extremely overt example of what autogynephilia means (so there can’t be much confusion), yet women in the comments didn’t go “hey, that sounds wrong”.
  • I know trans women whose cis female partners have claimed, to the protest of the trans women, that autogynephilia is normal female sexuality.
  • Many who disagree with it, such as gender-critical women, seem very openly hostile to research being done on it, as if they were trying to hide the truth, and also attempt to counterargue with contorted arguments, such as that it is impossible by definition.

Can all of these be explained away? Yes, with some assumptions and legwork. Is “autogynephilia is fairly common in cis women, but some people are opposed to acknowledging it because it is inconvenient” a simple theory that can account for these anomalies without trouble? Also yes.

With these, one could almost ask whether my take on autogynephilia being highly prevalent in cis women should be “yes, obviously”. I still have some concerns that I want to look into before I endorse this, though, namely:

  • I think that some of the overtly autosexual things in my list of A*P interests are unlikely to be as common in cis women as they are in cis men.
  • There is still some nonzero concern that cis women are misinterpreting the items given, though this concern is gradually shrinking due to factors that make the intent more clear.
  • Another theory that could well account for many of the anomalies would simply be that I am in a very autogynephilic corner of the world; men on sites like reddit or SlateStarCodex are much more autogynephilic than the general population, so why wouldn’t women be too? So the question is, do all of these findings apply to representative samples too?

I don’t think autogynephilia in women necessarily changes that much from a theoretical standpoint. Certainly it better allows for some magical innate gender identity theories, but it doesn’t prove such theories. Furthermore, due to women’s low sexual specificity, it doesn’t even particularly challenge ideas like erotic target location error.

I think it would help to not make up arguments without grounding, though.

Contra Serano and Lehmiller on Autogynephilia Prevalence

Serano just published a new review, claiming to “debunk” autogynephilia again. I’m not going to comment on most of it as it is just a repeat of some old and tired arguments, but one part stood out to me:

In addition to cisgender women experiencing FEFs, subsequent studies have shown that many cisgender people experience cross-sex/gender sexual fantasies as well. In a recent study of 4,175 Americans’ sexual fantasies, Lehmiller (2018) found that nearly a third of his subjects reported having sexual fantasies that involved being the ‘other sex’, and a quarter had fantasised about crossdressing.

Serano claims that Lehmiller has shown autogynephilic and autoandrophilic fantasies to be common here. However, this is not the case. Lehmiller did not use a representative sample, as he writes in his book:

This book is built around a massive survey of more than 350 questions taken by more than four thousand Americans, including persons from all fifty states. Although the sample is not necessarily representative of the US population, it does consist of an incredibly diverse group of individuals. Participants ranged in age from eighteen to eighty-seven and had occupations spanning everything from cashiers at McDonald’s to homemakers to physicians to lawyers. The group included all sexual and gender identities, political and religious affiliations, and relationship types, from singles to swingers.

Rather, he ran his survey on social media:

In total, 4,175 adults age eighteen or older who were current citizens or residents of the United States completed my survey, most of whom had heard about it through a major social media channel like Facebook, Twitter, or Reddit. Given that this was the primary way people learned about my survey, the demographics of my sample tended to skew more toward the average social media user than they did toward the average American. For instance, the median age of my survey participants (thirty-two) was about six years younger than the overall median age in America. Likewise, my participants were more highly educated and more affluent than the average American. My survey did not disproportionately attract people of one sex, though; it was virtually a fifty-fifty split between those who said they were born male and those who were born female.

Is that a problem? Yes; my experience with doing surveys on social media is that they tend to attract very high rates of autogynephiles/autoandrophiles, compared to what we would expect on the basis of representative surveys.

Because, yes, there are representative surveys on the rates of autogynephilia/autoandrophilia, and they give much lower rates than what Serano writes. To give two examples, this study finds a rate of autogynephilia of around 10%, and this study finds a rate of transvestic fetishism in males of around 3%.

I shouldn’t have needed to say this, but it’s wrong of Serano to ignore representative studies when discussing the prevalence of autogynephilia and autoandrophilia.

Serano also continues afterwards:

Second, the notion that FEFs have the potential to cause transsexuality is specious and not supported by the evidence (Serano, 2010, 2020). After all, almost a third of Lehmiller’s subjects experienced cross-sex/gender sexual fantasies (Lehmiller, 2018, p. 66), yet the vast majority of these people will never develop gender dysphoria or desire to transition.

This again is a highly misleading argument. While these autogynephiles don’t transition, they have a large change in their affective gender identity (see e.g. this, finding effect sizes from 1.9 to 2.9), making them much closer to being trans than non-autogynephiles. Furthermore, autogynephilia can exist in different intensities and different types, which might also affect things.

In conclusion, one cannot trust Serano to accurately report the state of the evidence on autogynephilia.

A dataset of common AGP/AAP fantasies

Autogynephilia is a sexual interest in being a woman, and autoandrophilia is a sexual interest in being a man. However, what does this mean in practice?

There are a number of ways one can examine this. For instance, there exist many porn/erotica sites catering to autogynephiles, and autogynephiles have been observed in clinical contexts, with their fantasies sometimes being recorded. However, I worry that these sources do not necessarily capture typical fantasies, but instead capture more extreme and unusual variants, due to their stronger selection effects.

To solve this, and to get more data on autoandrophilia, I did a survey asking about qualitative autogynephilic and autoandrophilic fantasies. More specifically, on /r/SampleSize I posted a survey titled “Can you look at some porn For Science? Survey #5” which asked about a broad variety of things, most of which were not related to this topic. Near the end of the survey, I asked people whether they found it arousing to “Imagine being the opposite sex”, and among those who answered anything other than “Not at all”, I asked the following open-ended question:

Fantasies about being the opposite sex

Optional. Above, you said that you would find it arousing to imagine being the opposite sex. I’m currently studying the nature of sexual fantasies about being the opposite sex, and as part of this it would be useful to know more about what exactly people fantasize about. So: If you were to fantasize about being the opposite sex, what sorts of things would you imagine?

I’m both interested in the scenarios you imagine (e.g. what sorts of sexual actions are in play, what sorts of environment and partners do you imagine, what sort of body type do you imagine having?) and in the perspective of the fantasy (e.g. who is the object of desire in the fantasy, do you imagine things from a first-person view, etc?).

Feel free to add any other information about experiences or feelings that you may consider relevant to this sexual interest. For instance, it would be interesting to know if you had any thoughts about what makes this sexual fantasy feel attractive to you.

About 500 cisgender women and about 1100 cisgender men completed my survey. Out of these, 96 cis women and 203 cis men answered my question about AAP and AGP fantasies respectively. The dataset, along with some extra variables that I thought it would be worth sharing, can be accessed here. (Note that a few of the participants opted not to have their raw answers shared, and so it contains only 290 data points.) In order to give an overview, I’ve run through the fantasies to try and list the most commonly described themes:

Disclaimer: There were a lot of sexual fantasies and I didn’t have a systematic way to code them, and I did it all by hand, so there may be some mistakes in the following list.

  • 33.5%: Heterosexual sex. (57 AGP, 28%, e.g. “I imagine a luxurious hotel with an handsome abd muscular men after a long diner.”, 39 AAP, 41%, e.g. “I mean not to write too porny but I’ve imagined having a dick and having fairly rough sex with a woman.”)
  • 24%: Masturbating. (45 AGP, 22%, e.g. “I imagine fingering myself”, 25 AAP, 26%, e.g. “I’m mostly interested in being able to feel the pleasure of masturbation with a penis”)
  • 20.5%: Homosexual sex. (41 AGP, 20%, e.g. “sex with my current girlfriend”, 20 AAP, 21%, e.g. “Having gay sex with my partner”)
  • 12%: Being dominant/powerful. (7 AGP, 3.5%, e.g. “I imagine myself sometimes as an attractive woman, sometimes as a normal woman, and since men are less picky about who they choose to have sex, just choose someone, invite them over and be dominant with them.”, 19 AAP, 20%, e.g. “Fucking someone while having a penis seems fun and powerful.”)
  • 11%: Implied heterosexual sex (e.g. mentioning “penetration” abstractly). (23 AGP, 11%, e.g. “being penetrated vaginally from a first person perspective.”, 11 AAP, 11.5%, e.g. “thrusting inside of someone’s genitalia”)
  • 11%: Blowjob. (9 AGP, 4.5%, e.g. “I watch reverse blowjob stuff sometimes.”, 17 AAP, 18%, e.g. “thrusting inside of someone’s mouth”)
  • 10.5%: Orgasming/sexual pleasure. (31 AGP, 15%, e.g. “World be fascinating to experience orgasms from the female perspective.”, 6 AAP, 6%, e.g. “I would fantasize about what having a penis would feel like. I like to imagine what my partner is feeling during sex.”)
  • 9%: Multiple partners (AGP only). (19 AGP, 9%, e.g. “I would imagine sex with multiple partners at once, giving and receiving, the gender of the partners ismt really important to me but normally if think about the fantasy its me with men.”.)
  • 7%: Caressing/fondling oneself. (26 AGP, 13%, e.g. “Playing with my boobs”, 1 AAP, 1%, e.g. “I imagine touching my strong, firm, well developed muscles and jerking off”)
  • 6.5%: Being submissive/overpowered. (21 AGP, 10%, e.g. “I would be a sexy little slut that gets used in all sorts of kinky ways.”, 3 AAP, 3%, e.g. “But also the reverse. Having a woman have power over me. The main focus is the penis.”)
  • 6%: Used strap on/packer to simulate penis (AAP only). (6 AAP, 6%, e.g. “I’m a gay woman and have engaged in the above with a strap on as the giving party.”)
  • 6%: Curiosity. (13 AGP, 6.5%, e.g. “I’m curious how sex as a woman would feel.”, 5 AAP, 5%, e.g. “Really curious about what it is like to have a penis.”)
  • 6%: Ejaculating (AAP only). (6 AAP, 6%, e.g. “I want to know what it feels like to ejaculate!”)
  • 5.5%: Crossdressing (AGP only). (11 AGP, 5.5%, e.g. “I like female clothing, so my fantasies often have myself and my partner(s) dressing as a woman. Mainly skirts, stockings and pink/purple stuff.”.)
  • 5.5%: Receiving a lot of sexual attention/being desired. (17 AGP, 8%, e.g. “seeing what it’s like to be a girl and receive all the attention”, 3 AAP, 3%, e.g. “Getting a lot of pretty girls that want to have sex with me, and being able to pleasure them just with my own body.”)
  • 5%: Casual sex (AGP only). (10 AGP, 5%, e.g. “I do fantasize about being a woman and how much of a slut I would probably be.”.)
  • 5%: Specific body characteristics (mentions concrete characteristics). (7 AGP, 3.5%, e.g. “Having smallish tits”, 7 AAP, 7%, e.g. “I (usually) imagine myself as a skinny man with the face similar to my actual face but having a beard.”)
  • 5%: Attractive body characteristics (e.g. fit). (17 AGP, 8%, e.g. “Everyone in my fantasies are healthy and generally fit.”, 2 AAP, 2%, e.g. “I would imagine being a fit man and pleasuring a woman from a first-person view.”)
  • 4.5%: Overall body size (small for AGP, big for AAP). (8 AGP, 4%, e.g. “I am a small woman who gets fucked in the vagina by a large man.”, 5 AAP, 5%, e.g. “I imagine being bigger than whatever partner I have. Being so big and tall that I can hug them and practically engulf them.”)
  • 4.5%: Stronger orgasms. (8 AGP, 4%, e.g. “I’m interested in how it would feel. Women supposedly have stronger orgasms, and it’s sensations I as a man don’t normally (or at all feel).”, 3 AAP, 3%, e.g. “I don’t know what it would feel like to have sex with that sexual organ. That means I can imagine it feeling better than anything I’ve ever experienced.”)
  • 4%: Only mentions a sexed characteristic and nothing else in the fantasy. (AAP only) (4 AAP, 4%, e.g. “having a dick”)
  • 4%: Anal sex. (5 AGP, 2.5%, e.g. “anal (not painful)”, 5 AAP, 5%, e.g. “my penis swinging while being anally penetrated.”)
  • 4%: Using sex toys. (13 AGP, 6.5%, e.g. “Using a vibrator / dildo”, 1 AAP, 1%, e.g. “I fantasise about using a fleshlight or fucking a man in the arse.”)
  • 4%: Answers that didn’t give any specific info or said that they did not have any A*P fantasies. (11 AGP, 5.5%, e.g. “I don’t know”, 2 AAP, 2%, e.g. “Literally everything just to see what it’s like as a man”)
  • 3.5%: Imagining being androgynous (e.g. GAM, …). (2 AGP, 1%, e.g. “I usually fantasize about being a woman while still having a dick.”, 6 AAP, 6%, e.g. “body type, like my own (I still have breasts as well) but with an average to large sized penis.”)
  • 3.5%: Cunnilingus. (6 AGP, 3%, e.g. “I think about having someone perform oral sex on me.”, 4 AAP, 4%, e.g. “I imagine going down on or fucking a woman”)
  • 3.5%: Transforming (AGP only). (7 AGP, 3.5%, e.g. “While I consider myself masculine, I am still very much attracted to femine features and actions, and would find becoming an attractive woman to be very arousing.”.)
  • 3.5%: BDSM (AGP only). (7 AGP, 3.5%, e.g. “The fantasy scenarios vary but generally revolve around some form of bondage, as I personally find female bondage infinitely more attractive than male.”.)
  • 3%: Clothing (AGP only). (6 AGP, 3%, e.g. “wearing sexy outfits and nylons, wearing dresses and heels”.)
  • 3%: Sex with someone genderbending (e.g. drag queen, GAM, …). (6 AGP, 3%, e.g. “Sometimes I just imagine being the opposite sex in a solo fantasy where I’m jerking off while enjoying my body, other times I imagine that my (female) partner had a dick and would penetrate me with it.”, 3 AAP, 3%, e.g. “Also sometimes I fantasize about being a man and having sex with a dragqueen.”)
  • 3%: Easier orgasms (AAP only). (3 AAP, 3%, e.g. “I think sex as a man is easier to reach orgasm and I like to imagine what it would feel like to have that easy stimulation.”)
  • 3%: Sex (partner’s nature unspecified). (7 AGP, 3.5%, e.g. “Masturbation and having sex”, 2 AAP, 2%, e.g. “I think it would be interesting to experience sex with a dick.”)
  • 3%: Ejaculate (AGP only). (6 AGP, 3%, e.g. “And feeling them cum in my pussy, ass, and mouth as well. I’d also like to be came on.”.)
  • 3%: Focus on partner. (1 AGP, 0.5%, e.g. “In this scenario I believe I would be more aroused by my partner than the act itself.”, 5 AAP, 5%, e.g. “The most arousing part is imagining the woman’s pleasure rather than my own.”)
  • 3%: Mimicry-A*P. (5 AGP, 2.5%, e.g. “I mostly fantasize about being hot women I see on Instagram or in real life”, 3 AAP, 3%, e.g. “imagine what it would be like to be a man I’m attracted to fucking a girl I’m attracted to.”)
  • 2.5%: Multiple orgasms (AGP only). (5 AGP, 2.5%, e.g. “Multiple orgasms are interesting.”)
  • 2.5%: Feeling sexually attracted to someone. (2 AGP, 1%, e.g. “They have beautiful body parts and can really get into “the zone” when aroused.”, 4 AAP, 4%, e.g. “The amount of attraction I have towards a woman, like I feel like men would have more primal, untamable urges.”)
  • 2.5%: Being attractive (AGP only). (5 AGP, 2.5%, e.g. “I just feel like I’d be more attractive as a girl.”.)
  • 2.5%: Exhibitionism. (6 AGP, 3%, e.g. “Imagine initiating sexual situations including public sex”, 2 AAP, 2%, e.g. “fantasies: usually in public, with people watching. I’m the object of desire. 3rd person view.”)
  • 2%: Exaggerated sexual dimorphism. (5 AGP, 2.5%, e.g. “she is busty (d-cup+), BMI around 25-30, big ass, trimmed not shaved, glamorously beautiful, vulnerable eyes, exposed vulva.”, 2 AAP, 2%, e.g. “I imagine myself with a large penis and a muscular, vascular body with a beard.”)
  • 2%: Intimate/loving sex. (3 AGP, 1.5%, e.g. “The scenario changes according to my mood and but can include having passionate sex with someone I love.”, 2 AAP, 2%, e.g. “Even though I assume the male role in those fantasies, it’s usually from a 3rd person perspective, and boringly romantic, as opposed to anything too racy or kinky.”)
  • 2%: Being normal. (5 AGP, 2.5%, e.g. “In casual gay sex encounters I prefer strict top/bottom or dom/sub roles, usually but not exclusively with me being the bottom/sub. As a female in a straight male/female casual sex encounter it would be easier not having to navigate who is in what role (I am aware women can be dom but I would not be interested in that).”, 1 AAP, 1%, e.g. “I’m lesbian, and as progressive as the world is, it just seems easier to imagine being able to talk to women as a man rather than a woman. It’s more of the fear of being gay in my environment (the south and conservative parents) that have me imagine being a man having sex with a woman.”)
  • 2%: Fantasy comes up in dreams. (1 AGP, 0.5%, e.g. “I recently had a dream”, 3 AAP, 3%, e.g. “This mostly comes up in my dreams.”)
  • 2%: Acting flirtatiously. (5 AGP, 2.5%, e.g. “imagine teasing and turning on the opposite sex”, 2 AAP, 2%, e.g. “Sometimes I imagine I’m single and try to pick up a woman to have sex with.”)
  • 2%: Watching one’s own body. (7 AGP, 3.5%, e.g. “Haven’t really thought about it much. I was more thinking of the hypothetical “if i was a girl for a day I’d just play with my boobs in front of a mirror” thing lol”, 1 AAP, 1%, e.g. “I imagine in it third person, but like watching myself.”)
  • 2%: Impregnation. (3 AGP, 1.5%, e.g. “being impregnated”, 2 AAP, 2%, e.g. “imagining creampie-ing a woman and getting her pregnant”)
  • 2%: Merging or swapping bodies (AGP only). (4 AGP, 2%, e.g. “So in this dream, we decided to switch bodies so that we would have to meet again later to switch back. […] During sex I enjoy being very intimate, intertwined (literally sharing as much skin surface as possible and sensing breath and pulse) and feeling what the woman feels and I love the way women experience arousal and sex, so becoming her or merging bodies would be the next level in this.”)
  • 1.5%: Attracting straight people (AGP only). (3 AGP, 1.5%, e.g. “Having sex with straight men that I’m attracted to.”.)
  • 1.5%: Rape (AGP only). (3 AGP, 1.5%, e.g. “often forced male-on-female”)
  • 1%: Friendships becoming sexual (AGP only). (2 AGP, 1%, e.g. “Friendships turning sexual. slender lesbian top. I read a lot of yuri romance and thus have an unrealistic idealized fantasy about lesbian romance and sexuality”.)
  • 1%: Masochistic emasculation fetish. (2 AGP, 1%, e.g. “I’d also like to be a cuck and watch as a man cum in my wife so I can eat out her used pussy.”, 1 AAP, 1%, e.g. “As previously mentioned, I’m into orgasm denial and chastity. This kink is logistically more feasible with dicks instead of cunts.”)
  • 1%: Extreme masochism (e.g. slavery, brainwashing, …). (2 AGP, 1%, e.g. “I could be sold into sexual slavery, forced to perform sexually. Maybe I’m trapped in a machine that forces orgasms. Maybe a mysterious monster is magically draining my intelligence and simultaneously stimulating me to keep me from resisting. After their torment, I’ll be reduced to a brainless fuckable objectified being, which is one of my fantasies for a partner as well.”, 1 AAP, 1%, e.g. “usually he’s a vampire for the purpose of being able to torture him more without him dying, because dying isn’t sexy. He doesn’t have sex with girls unless they rape him, and he never enjoys it. Most of the scenarios don’t involve sex at all though. There’s a lot I haven’t said, but it’s embarrassing.”)
  • 1%: Peeing (AGP only). (2 AGP, 1%, e.g. “Sexual touching and peeing”.)
  • 1%: Everyday activities. (3 AGP, 1.5%, e.g. “I typically imagine myself just being female in my day to day life”, 1 AAP, 1%, e.g. “I imagine waking up in a man’s body and spending the day as a man.”)
  • 1%: Feet (AGP only). (2 AGP, 1%, e.g. “I like girls feet so I have thought about what it would be like to be a girl with cute feet and tease guys with foot fetishes”.)
  • 0.5%: Voyeurism (AGP only). (1 AGP, 0.5%, e.g. “Getting to see inside of a women’s locker room / changing room”)
  • 0.5%: Corsets (AGP only). (1 AGP, 0.5%, e.g. “Corsets/extremely small waists”)
  • 0.5%: Watching porn (AGP only). (1 AGP, 0.5%, e.g. “If I were to engage in a sexual fantasy involving me becoming a woman, I would only indulge in solo sexual acts, such as masturbation or watching porn.”.)
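For anyone who wants to check the per-group percentages in the list against the raw counts, here is a minimal sketch. It assumes the denominators are the 203 cis men and 96 cis women who answered the open-ended question; the variable and function names are made up for illustration, and only the first three themes are included. The percentages in the list were rounded by hand, so small differences from the computed values are to be expected.

```python
# Assumed denominators: respondents who answered the open-ended question.
N_AGP = 203  # cis men describing AGP fantasies
N_AAP = 96   # cis women describing AAP fantasies


def rate(count: int, n: int) -> float:
    """Percentage of a group mentioning a theme, to one decimal place."""
    return round(100 * count / n, 1)


# Counts (AGP, AAP) copied from the first three items of the list.
themes = {
    "Heterosexual sex": (57, 39),
    "Masturbating": (45, 25),
    "Homosexual sex": (41, 20),
}

# Per-group rates for each theme.
rates = {
    theme: (rate(agp, N_AGP), rate(aap, N_AAP))
    for theme, (agp, aap) in themes.items()
}

for theme, (agp_rate, aap_rate) in rates.items():
    print(f"{theme}: AGP {agp_rate}%, AAP {aap_rate}%")
```

Extending the `themes` dictionary with the remaining counts makes it easy to spot coding or rounding mistakes in the hand-made list.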