Autism as a sparsity prior

[Epistemic status: Speculative. Testing these ideas quantitatively seems hard.]
[Writing status: I have no idea whether this is even halfway comprehensible. I’m so deeply stuck in decision theory and causal inference and psychiatry that I’m not sure what is obvious to others, and what is in need of further explanation.]

I’m autistic, but I’m also kind of confused about what autism even is. Apparently in some contexts I come off as being very autistic, while in other contexts I don’t come off as autistic at all. However, I’ve got an idea that helps me make sense of some things:

Autism is supposedly “characterized by difficulties with social interaction and communication, and by restricted and repetitive behavior”, and tends to involve rigidity and obsession. A common approach seems to place it as one end of a spectrum of mechanistic vs mentalistic thinking, with schizophrenia at the other end; this approach is summarized by Scott Alexander in his blog post on the Diametrical Model Of Autism And Schizophrenia.

I’m building on that model, but I have some thoughts or concerns about it. It relies on the idea of “mechanistic” thinking (or in related models, “systematizing”, “if-then-else” thinking). But what is that? And why would it “trade off” against mentalistic thinking? (I think I have a reasonably good understanding of what mentalistic thinking is, namely as an agency prior. More on that later.) This is where my idea comes in.

Mechanistic thinking as a sparsity prior…

… would probably have been a more accurate title to the blog post, but that wouldn’t make the connection to autism as clear. To understand what this means, we need to consider what a prior is more generally, and what a sparsity prior is specifically.

“Prior” is short-hand for “prior distribution”; it’s a mathematical object used in Bayesian statistics which describes the theory in which one interprets the data one encounters. It turns out, you can’t interpret data without theory, no matter what you do; even something as simple as extrapolating from the past into the future relies on the theory that the past will resemble the future. The prior formalizes what exactly the assumptions you make are.

A sparsity prior is, in a sense, a formalization of the law of parsimony. Sparsity priors assert that it is more likely for a system to work in a simple way than in a complex way. The overwhelming majority of ways a system can work have “everything interacting with everything” (well, rather, most things interacting with most things); sparsity priors assert that this is implausible because there are so many interactions, and so rule out the overwhelming majority of possible explanations.
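To make the counting argument a bit more concrete, here is a minimal sketch under a simplifying assumption of my own: a “way a system can work” is just a choice of which pairs of variables interact, and “sparse” means at most a handful of interacting pairs. The function names and the cutoff are made up for illustration.

```python
from math import comb

def count_structures(n_vars: int, max_links: int) -> int:
    """Count the ways to pick at most `max_links` interacting pairs among `n_vars` variables."""
    n_pairs = comb(n_vars, 2)  # every unordered pair could interact or not
    return sum(comb(n_pairs, k) for k in range(min(max_links, n_pairs) + 1))

def sparse_fraction(n_vars: int, max_links: int) -> float:
    """Fraction of all possible interaction structures that count as sparse."""
    n_pairs = comb(n_vars, 2)
    return count_structures(n_vars, max_links) / 2 ** n_pairs

# Hypothetical cutoff: call a structure "sparse" if it has at most as many
# interactions as it has variables.
for n in (5, 10, 20):
    print(n, sparse_fraction(n, max_links=n))
```

The fraction of sparse structures shrinks very quickly as the number of variables grows, which is the sense in which a sparsity prior rules out most possible explanations.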

To see how this relates to mechanistic thinking, let’s consider some examples:

  • “If-then-else” thinking involves picking some key factor (the “if”) and making decisions on the basis of this. This makes sense only if there is some key factor that matters much more than anything else, i.e. if the system is sparse.
  • We tend to think of mechanistic systems as inanimate. What does that mean? One element of inanimacy is that they are not proactive; they don’t go out and modify the world in all sorts of ways. They just sit there. This is sparsity; they don’t have much influence on anything else.
  • Another element of inanimacy is that they are not reactive. They only have a limited, fixed set of ways to interact. Yes, a rock can be thrown, and it can fall to the ground, but it can’t really do much beyond that. That is, inanimate systems cannot be affected in subtle and complicated ways, only in simple and well-defined ones, which is a sort of sparsity.
  • Mechanistic thinking involves a sort of decontextualization; it tends to assume that things will work the same each time. This partly relies on sparsity, in that it assumes there aren’t a number of additional moderating variables that change the system’s behavior. (It also relies on symmetry, in the sense that it requires duplication of the system across different contexts, such as over time or over space. Symmetry could likely also be considered a prior, but it is not the same prior as a sparsity prior.)

Generally, the pattern is that a sparse prior is rigid. It assumes there is one simple explanation going on, and cannot deal well with exceptions. It fails badly in cases that are not sparse. One such case is when dealing with interpersonal interactions, and agency in general. Agents try to absorb as much information as possible from their surroundings (e.g. by looking around and seeing things), building a model of what is going on, and they try to massively modify the world to suit them (e.g. building cities, reproducing to have a massive population). These are very non-sparse dynamics, as they require massive amounts of causal exchange, both in and out, in order to regulate the world effectively. Ultimately, any prior only works as well as the environment corresponds to its assumptions, and agents don’t correspond great to a sparsity prior, making it malfunction badly on them.

Relationship to the diametrical model

I think that’s a reasonable starting point; it seems like a sparsity prior accounts for the rigidity involved in mechanistic thinking well. So that leads to the followup question of, if autistic people have a sparsity prior, do allistic people have a density prior? And the answer is no: Most possible models are dense. Therefore, assuming the world is dense does not meaningfully narrow down the possibilities enough to let you infer anything about how it works.

As I see it, for the diametrical model, the relevant alternative to a sparsity prior is an agency prior. Under an agency prior, you assume that the world contains individuals who have goals, beliefs, and who act according to these to influence the world.

To illustrate the contrast, suppose you come across a tree that is fallen over. You don’t know how it has fallen over; the main possibility you can think of is the wind, but it seems hard to imagine that the wind is strong enough to knock down a tree. Density priors, sparsity priors, and agency priors interpret this scenario in three different ways:

  • If you have a density prior, anything might explain it. Maybe an elaborate Rube Goldberg mechanism knocked it over. Maybe god did. Maybe it’s not knocked over but it’s just an optical illusion. Maybe it grew fallen down. How could you know? After all, anything is possible in this world.
  • If you have a sparsity prior, it’s pretty unlikely that it could be anything other than the wind. After all, that would require some new effect that you hadn’t considered to explain it; but that seems pretty unlikely. The world is simple and doesn’t have all sorts of complicated effects. It must have been a very strong wind, or a very weak tree, or some combination of the two.
  • If you have an agency prior, you still don’t know how it happened. But you do know one thing: Someone must have wanted it to happen. After all, trees aren’t usually fallen over; such a special scenario requires an explanation, and the obvious explanation is that someone did it because they wanted to.

A pure agency prior seems reminiscent of schizophrenia, especially explaining the conspiratorial aspects, as well as the assumption that there is a deeper meaning behind every little thing one encounters. One could imagine that normal people have a balance between agency and sparsity priors, while autistic people have a skew towards applying sparsity priors, and schizotypal people have a skew towards applying agency priors. Seeing autism as a sparsity prior is thus not an explanation of how autism differs from allism (that would be via lack of agency prior), but instead an explanation of autism on its own terms, without reference to social cognition.

Deriving autism from sparsity priors

I will now go through some symptoms I see online characterizing autism, and discuss how they relate to sparsity priors.

Autistic people are said to do worse at social problems. To an extent, this probably follows simply from being different; anyone who is sufficiently different is going to have trouble dealing with others. But if autism is a dysfunction in agency priors that leads to application of sparsity priors in cases where they are ineffective, then that too should lead to social difficulties.

Humans are agentic. One characteristic of agency is trying to do everything possible to optimize one’s goals; in communication, this would involve trying to optimize every part of one’s message, from phrasing to body language. If this is interpreted from an agentic perspective, then one will ask “given the whole picture, what is this person trying to tell us?”; on the other hand, if it is interpreted from a sparse perspective, then one will focus on the clearest specific things, likely ending up overly literal, and ignoring context and subtle signals. More generally, it will not even occur to someone with a sparsity prior that there are subtle things that they are missing; after all, that is part of the sparsity prior.

This also goes the other way; if you have a sparsity prior, then you will assume that there are not many things that are relevant for how to optimize your message. You may end up blunt, focusing purely on the message, and not taking into account the side effects that sharing the message may have. Meanwhile, with an agency prior, you would to a greater extent take into account the social implications.

Autistic people are characterized by obsessions and restricted interests. This might also make sense in terms of a sparsity prior; if you determine that there is some factor that is important, then it makes sense to learn everything you can about that factor, as it likely accounts for a big fraction of everything that might ever be important about anything (by the assumption of sparsity, that there are not that many factors that influence things). On the other hand, if someone else is interested in something, then an agency prior would tend to infer that this other thing must also be important and worth looking into; and thus lead to broader interests.

There are some autistic characteristics where I don’t quite understand what they refer to; for instance ritualistic behavior. Certainly it seems like sparsity should imply some forms of ritualistic behavior; due to the assumptions that there are only a few key ways that things work, a sparsity assumption would imply that one could end up focusing excessively on some parameters when making decisions, seemingly “ritualistically” ignoring alternative possibilities. I can sometimes recognize this from my own decisions, where in retrospect I have ignored many factors in favor of a single clearer factor. So perhaps sparsity can explain autistic ritualistic behavior too.

One needs to be careful here, with the point mentioned in the previous paragraph. Allistic people also have a sparsity prior, because you need a prior to act. However, I think it comes down to the social element; if there is even the slightest social encouragement to do something in a different way, then an agency prior will assume that there is some good reason for that, and adjust. This will still lead to ritualistic behavior in cases where everyone socially agrees on it, but since everyone is used to this, it is not noticeable. (Unless you start doing anthropology, at which point you realize that humans are very ritualistic in general.)

This ends up important to take into account when one then starts analyzing other factors. For instance, consider sensory overload. It would make sense that if you assume that only a few key factors are important, you would avoid “noisy” (not just audibly but also visually and through other senses) places, due to being unable to figure out what those factors are when there is so much noise. But this shouldn’t be limited to autism, it should apply to allistic people too. I don’t know whether there is a social explanation here, like there is for ritualistic behavior.

Another characteristic with similar problems is that autistic people tend to be clumsy. I think clumsiness can easily be tied into sparsity. Per Moravec’s paradox, basic things like moving your body are much more difficult than humans usually observe, due to needing to constantly optimize your movements and keep track of every little thing. It would make sense that sparsity would tend to lead to “robotic” and clumsy movements in such a case – but since everybody, not just autistic people, needs a sparsity prior to make sense of everything, it’s hard to see how this explains why autistic people specifically are clumsy. And here I also have trouble coming up with a social explanation.

One thing I’m still not sure how to derive is stimming; making repetitive movements. I only have a relatively limited form of stimming, consisting of tapping my feet sometimes. I don’t see any way this fits into a sparsity prior. Maybe there’s a subtle thing related to neuroscience or gathering information or something, but for now I will just consider this to be unaccounted for in the theory, indicating that there is a flaw.

I’m also not sure how to derive comorbidities like ADHD.

Dimensions of sparsity prior

Autistic people aren’t entirely lacking an agency prior, and allistic people aren’t lacking a sparsity prior. This raises the question, in what sense exactly can autistic people be said to skew towards a sparsity prior? More generally, why do the priors trade off against each other?

Let’s start with the second question. It might seem mysterious that mechanistic and mentalistic thinking have a tradeoff. From the point of view of priors, the answer is that they don’t really have very much tradeoff. There are some limits to it, in the sense that you only have so much probability mass to give out, so eventually you do run into a tradeoff. However, it seems like under ordinary circumstances, one would quickly figure out which model is appropriate, and apply that.

The apparent tradeoff probably arises from the fact that you need some sort of model. “The density prior” isn’t useful, as it lacks predictions; so there is some relatively limited set of models, of which sparsity and agency are two of the most obvious ones. The things you don’t model mechanistically must be modelled in some other way, and this main other way would be mentally. Hence an apparent tradeoff.

When it comes to the question of in what sense autistic people skew towards sparsity, I think the core of it is just that the agency prior among autistic people works worse, so they use it less. I.e., autism is literally the same thing as cognitive difficulties with social things (low “emotional intelligence”, except plausibly emotional intelligence tests aren’t good at measuring the agency prior). If you struggle with modelling things in agentic ways, you will end up modelling them in mechanistic ways instead, regardless of how appropriate that is. But it’s worth noting that quality of prior isn’t the only “free parameter” to consider:

A sparsity prior generally has a free parameter describing the degree of sparsity it assumes. You might think of it as being akin to a dimensionality measure. If you have a bunch of dominoes standing in a line, they can affect their neighbors, such that one of them falling over can influence the entire line, transitively. Meanwhile, if the dominoes are more spread out, then you won’t get these sorts of chain reactions, but will instead see the effects dissipate. So on one end of the spectrum, a limiting case of a sparsity prior is the assumption that everything might affect everything, while on the other end of the spectrum, a limiting case is that everything is isolated and can’t affect anything else.

Finally, one free parameter is the “hyperparameter” that tells you how often a sparsity prior is appropriate to use, compared to any other prior you might have. This is essentially about whether your immediate instinct is to think of things in a mechanistic way, or in an agentic way (or in some other way, assuming there are other priors of interest). This should in principle only have a limited effect on things; one should quickly be able to recognize when a situation calls for something else, other than a sparsity prior. Possibly, this might explain why engineers can seem more autistic; their “default mode of operation” ends up being autism-like, but they can adjust if they get a bit of time.
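As a rough illustration of why this hyperparameter should only have a limited effect, here is a toy Bayesian update between a “sparse/mechanistic” model and an “agentic” model. Everything in it is an assumption for the sake of the sketch: the function name, the starting weights, and the likelihood ratios are invented numbers, not estimates of anything.

```python
def posterior_sparsity_weight(prior_weight, likelihood_ratios):
    """Posterior probability of the sparse model after a series of observations.

    prior_weight: initial probability that the sparse model is the right one
        (this plays the role of the "hyperparameter"), strictly between 0 and 1.
    likelihood_ratios: for each observation,
        P(observation | sparse model) / P(observation | agentic model).
    """
    odds = prior_weight / (1.0 - prior_weight)
    for ratio in likelihood_ratios:
        odds *= ratio  # Bayes' rule in odds form
    return odds / (1.0 + odds)

# Hypothetical situation: each observation is 4x more likely under the agentic model.
evidence = [0.25] * 5
for start in (0.5, 0.9):
    print(start, posterior_sparsity_weight(start, evidence))
```

With these made-up numbers, someone who starts out 90% inclined towards the mechanistic reading ends up in nearly the same place as someone who starts at 50%, after only a handful of observations. The default matters most when there is little time or evidence to update on.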

I don’t know if all of these exist in the brain. (Heck, I don’t know if any one specific of them exist in the brain; all of this is speculation.) But they seem a lot more concrete to me than “if-then-else thinking”, “systematizing” or “mechanistic thinking”.

Relationship between autism and masculinity

One theory of autism that has come up is Simon Baron-Cohen’s “Extreme Male Brain” theory, which asserts that men think more “systemizingly” (basically mechanistically) and women think more empathetically, and that autism is due to a skew towards the male side of things. As support for that, it’s generally noticed that more males are diagnosed as autistic, that autistic people have a more male systemizing-empathizing profile, and so on.

In the past, I have had trouble with this theory. One element is that SBC’s measures don’t reaaallly seem to overlap all that much with autism. And autistic men don’t seem to be all that masculine. Really, this sort of thing has also made me skeptical about the diametrical model in the past, and about the validity of “autism” as a category in general. But now I’m writing a huge blog post on it, so clearly I’m going to have some new thoughts on this:

I think men are interested in things and women are interested in people, while autistic people are bad at people and schizotypal people are bad at things. But experience trades off against priors; getting more experience can to some degree develop skills that compensate for lack of ability. So if you’re autistic, but you are interested in people, then you are much more likely to learn to compensate, and that leads to female autists being less likely to be diagnosed. Unless they happen to have masculine interests, in which case they don’t develop their people skills. This seems to match what some people say about female autists “masking” their autism to make it less visible.

This model does introduce an ambiguity; do we define autism by the prior (possibly leading to equally many male and female autists?), or by its consequences? Probably the answer here should be that we shouldn’t get too hasty in applying speculative theoretical models, and that we should wait with using the prior as a definition until we have actually validated that this is an accurate theory of autism. Which it might very much not be.

Conclusion

For a while, autism hasn’t really made sense to me. This model makes it make a lot more sense to me, though I have no clue whether it’s right; I would encourage anyone to critique it. The model essentially just boils down to “autistic people are bad at social stuff” though, which is obvious enough; I guess the nonobvious claim of the model is about what is left in reasoning after removing the social stuff. And I don’t really think sparsity priors are the only thing left over; rather, I think they (or something like them) can be seen as a more formal way of understanding what “mechanistic thinking” is. But mechanistic vs mentalistic are not the only forms of thinking, and so there’s lots of things left over.

One big bottleneck for further development I see is a need to create something that measures it. This would probably be a form of intelligence test. I think current “emotional intelligence” tests don’t really focus much on ability to reason about agency vs sparseness per se, but instead on surface-level “content” vaguely associated with these things. Possibly problems akin to the “tree that has fallen over” situation that I described before might be relevant.

It might also be entertaining to ask whether other things can be mapped into this sort of “prior” viewpoint. For instance, there’s the concept of some person’s or place’s “vibe”; this seems non-sparse, but also not necessarily agentic. Certainly in some cases it might be agentic, but more generally it just seems similar to principal components regression. Roughly speaking, one can understand the assumption behind this as being a combination of sparsity with latent variables; that is, rather than assuming that everything is observable, one assumes that there are some important unknown unobserved variables that influence many things; these are then estimated using the “vibe”.

I also can’t help but notice that I’m very interested in mechanistic-like theories of agency. Artificial intelligence, decision theory, psychology, and so on. I wonder if learning such theories to a sufficiently advanced degree can function as a sort of self-treatment of autism. 🤷

Controlling for the general factor of paraphilia

Almost all sexual interests are positively correlated. Going even further, almost all paraphilias are positively correlated, and typically more closely with each other than with normophilic sexual interests. To give an example, from a survey run by the creator of /r/AskAGP:

Correlation matrix from a select set of items from the survey. Each cell shows the degree of association between two sexual interests. For a visual intuition of what the quantities in the cells mean, consider using this correlation visualizer.

When trying to understand the structure of sexual interests, this presents a problem, because an important method of understanding it proceeds by looking at the pattern of correlations. But when everything correlates with everything, it is difficult to interpret this pattern.

The pervasive correlations could be interpreted as representing a “general factor of paraphilia”; that is, some underlying factor that contributes to all abnormal sexual interests. Since paraphilias and ordinary sexual interests also to a degree correlate with each other, one might interpret it as to some degree reflecting something that paraphilias and ordinary sexual interests have in common, such as libido. However, since paraphilias are more strongly correlated with each other than with ordinary sexual interests, it likely also represents something else other than libido. One possibility for this something else might be that there is some common error or set of errors in the development of one’s sexuality that can contribute to all abnormal sexual interests. Another possibility is that it represents “method factors”; i.e. maybe to a degree it is due to people who are open to admitting to one abnormal sexual interest also being more open to admitting to other abnormal sexual interests.

But regardless of the reason for the presence of this general factor of paraphilia, it would be nice to have some way to control for it. One way of controlling for it is to assume that each paraphilia is influenced to some specific degree by the general factor; this influence is called the factor loading of the paraphilia. We can then fit a statistical model that estimates the factor loadings of the paraphilias, and use this model to distinguish between the variance and correlations due to the general factor, versus the variance unrelated to the general factor:

Decomposition of the correlation matrix into aspects that come from the general factor, and aspects independent of the general factor. The mathematical details for how I computed this decomposition are available at the end of the post.

In order to understand things better, we can zoom further into the residual matrix. This matrix represents the correlation structure after taking the general factor into account:

Residual matrix. The diagonal shows the fraction of variance in each sexual interest that is independent of the general factor. The off-diagonal shows the correlations of sexual interests above and beyond that due to the general factor.

This kind of matrix should give a clearer idea of the true structure of the paraphilic interests. For instance, we can see that being a furry is correlated with attraction to animals, as well as with AGP; this is predicted by the theory of erotic target location errors, which states that AGP and furryism both represent an “inversion” of attraction to respectively women and anthropomorphic animals onto oneself, such that one is interested in being what one would otherwise be attracted to.

On the other hand, we do not see any particular correlation between autogynephilia and masochism. We do see one between forced feminization and masochism, but this is trivial due to content overlap. (Unfortunately, this dataset didn’t include anything asking about transvestic fetishism independent of masochism.) The lack of correlation between autogynephilia and masochism seems to contradict the theory of masochistic emasculation fetish, which asserts that autogynephilia isn’t really a sexual interest in being a woman per se, but instead a fundamentally masochistic interest. Since anecdotal correlations between autogynephilia and masochism are the main thing cited as evidence for MEF, I think this result disproves MEF, at least unless it turns out to have been a fluke somehow. (Incidentally, the creator of /r/AskAGP collected this dataset specifically with the purpose of trying to prove that AGP was correlated with masochism. Turns out it’s not that simple.)

Another interesting thing is that this method makes it more convenient to collect items that measure the general factor of paraphilia well. The issue with just picking any random set of items is that if the items have too much residual correlation, they might not measure the general factor of paraphilia, but might instead measure something more narrow. For instance, if one includes both “Having your partner say insults and/or slurs to you” and “Being humiliated for having a small penis”, one might end up just measuring humiliation masochism. Thus, ideally one would pick items that have no correlations beyond the general factor of paraphilia at all. Inspecting the diagram, the following look like a promising set:

  • Imagining being a woman and caressing your own (female) body
  • Having your partner say insults and/or slurs to you
  • Exposing your genitals to an unsuspecting stranger
  • Sniffing your partner’s underwear
  • Mate-swapping; having sex with someone else’s partner while they have sex with your partner
  • Tying someone up
  • Having sex with someone much older than yourself

Since these items aren’t pure measures of the general factor, they are not going to perfectly measure it, even when aggregated. It might be nice to have an idea of how well they measure it. One such measure is the internal reliability, which estimates the correlation between the general factor and the sum of the items from the internal correlations between the items. The internal reliability of this set of items is 0.64, which is considered to be on the low side; ideally one would find a larger set of varied paraphilia items to create a better measure of the general factor of paraphilia.


I’ll end this post with a brief technical description of the math involved. In order to fit the general factor, I searched for a vector $\lambda$ containing the factor loadings, as well as a matrix $\Omega$ containing the residuals. Given the correlation matrix $\Sigma$, I then searched for solutions to $\lambda$ and $\Omega$ such that $\Sigma = \lambda\lambda^T + \Omega$.

This is an underspecified problem, as one can find a solution for any $\lambda$ simply by taking $\Omega = \Sigma - \lambda\lambda^T$. To make the result well-defined, I picked $\lambda$ so as to minimize the off-diagonal elements of $\Omega$. The intuition behind this is that correlations between unrelated paraphilias are presumably due to the general factor, so we do not want $\Omega$ to contain any of these correlations. In order to prevent $\Omega$ from containing this, we simply minimize the off-diagonal elements. More specifically, I chose to minimize the sum of the absolute values of the off-diagonal elements of $\Omega$; this should aim to set the median off-diagonal value of $\Omega$ to zero. As long as most of the paraphilias collected are unrelated to each other, this should accurately get at the general factor of paraphilia.
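For readers who want to experiment, the fitting idea above can be sketched in plain Python. To be clear about what is assumed: this is not the optimizer used for the figures in this post. It is one simple way to attack the same objective, namely coordinate descent where each loading is updated to a weighted median (which exactly minimizes the absolute residuals involving that one item while the other loadings are held fixed). The function names, the starting value and the toy matrix are all made up for illustration.

```python
def weighted_median(values, weights):
    """Return a value v that minimizes sum(w * |x - v|) over the (x, w) pairs."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    return pairs[-1][0]

def offdiag_abs_sum(sigma, lam):
    """Objective: sum of |sigma_ij - lam_i * lam_j| over all off-diagonal cells."""
    n = len(lam)
    return sum(abs(sigma[i][j] - lam[i] * lam[j])
               for i in range(n) for j in range(n) if i != j)

def fit_general_factor(sigma, n_sweeps=50, init=0.5):
    """Estimate loadings by coordinate descent (assumed method, see lead-in)."""
    n = len(sigma)
    lam = [init] * n
    for _ in range(n_sweeps):
        for i in range(n):
            others = [j for j in range(n) if j != i and lam[j] != 0.0]
            if not others:
                continue
            # |sigma_ij - lam_i*lam_j| = |lam_j| * |sigma_ij/lam_j - lam_i|
            ratios = [sigma[i][j] / lam[j] for j in others]
            weights = [abs(lam[j]) for j in others]
            lam[i] = weighted_median(ratios, weights)
    return lam

def residual_matrix(sigma, lam):
    """Omega = Sigma - lam lam^T."""
    n = len(lam)
    return [[sigma[i][j] - lam[i] * lam[j] for j in range(n)] for i in range(n)]

# Toy correlation matrix generated from known (made-up) loadings, so the
# off-diagonal cells are explained entirely by a single general factor.
true_loadings = [0.8, 0.6, 0.4, 0.5]
toy_sigma = [[1.0 if i == j else true_loadings[i] * true_loadings[j]
              for j in range(4)] for i in range(4)]
fitted = fit_general_factor(toy_sigma)
print(fitted)
print(offdiag_abs_sum(toy_sigma, fitted))
```

Two caveats for anyone trying this on real data: the objective is not convex, so coordinate descent can stop at a point that is not the global minimum, and the sign of the loadings is not identified (flipping every loading gives the same fit).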

In order to estimate the internal reliability, I used the formula $(\lambda_0+\lambda_1+\dots)^2 / \left((\lambda_0+\lambda_1+\dots)^2 + (1-\lambda_0^2) + (1-\lambda_1^2) + \dots\right)$. Intuitively speaking, the formula consists of two parts, $G=(\lambda_0+\lambda_1+\dots)^2$, and $S=(1-\lambda_0^2)+(1-\lambda_1^2)+\dots$, such that the total formula is $G/(G+S)$. G and S each represent a fraction of the variance in the “sum score” of the paraphilic interests. Specifically, G represents the variance due to the general factor, while S represents the specific variance for each paraphilia (which, when trying to measure the general factor, we think of as being measurement error). The G variance grows quadratically with the number of items, while the S variance grows linearly with the number of items, so this means that as one increases the number of items, the sum score will to a greater degree represent the general factor compared to the specific variance; that is, more items means less measurement error.

The mean factor loading for the restricted set was 0.45. If additional sexual interests continue having factor loadings like this, the reliability should become adequate with about 10 items, good with about 16 items, and near-perfect with about 40 items. One project I would like to see completed would be to collect more items to better measure the general factor of paraphilia.
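The reliability formula and the item counts above can be checked with a few lines of Python. This is a sketch that assumes every item has the same loading (the mean of 0.45). The function names are made up, and the 0.7 and 0.8 cutoffs for “adequate” and “good” are conventional rules of thumb that I am assuming here.

```python
def internal_reliability(loadings):
    """G / (G + S), with G the general-factor variance and S the specific variance."""
    g = sum(loadings) ** 2
    s = sum(1.0 - lam ** 2 for lam in loadings)
    return g / (g + s)

def items_needed(mean_loading, target, max_items=1000):
    """Smallest number of equally loaded items whose reliability reaches `target`."""
    for n in range(1, max_items + 1):
        if internal_reliability([mean_loading] * n) >= target:
            return n
    return None  # target not reached within max_items

# Seven items with the mean loading, then the item counts mentioned in the text.
for n in (7, 10, 16, 40):
    print(n, round(internal_reliability([0.45] * n), 3))

# Assumed rule-of-thumb cutoffs for "adequate" and "good".
for target in (0.7, 0.8):
    print(target, items_needed(0.45, target))
```

Treating all seven items as having the mean loading reproduces the 0.64 figure to two decimals, which suggests the equal-loading shortcut is a reasonable approximation for this item set.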

Causality is essential: Reply to MTSW on Autogynephilia

About four years ago, Mark Taylor Saotome Westlake published “Reply to Ozymandias on Autogynephilia”, responding to Ozy’s “On Autogynephilia” blog post. In it, he argues that it is solely the correlation structure between surface-level observations that is relevant in science and creating typologies. Needless to say given my recent posts, I disagree.

Sexual arousal to fantasies of being a woman is usually thought of as reflecting a sexual interest in being a woman, one whose cause is unrelated to a desire to be a woman, but which has the consequence of generating a desire to be a woman, through means analogous to how other sexual interests work. (See Is autogynephilia real? The phenomenon, the construct, the theory for more details.)

Ozy questions this conceptualization. The specifics of this questioning are unclear, as they do not lay out much detail in it, or much evidence for it. If I had to attempt to parse it, it would seem that Ozy is distinguishing “sexual interests” into “fetishes” and “attractions”, such that e.g. ordinary heterosexuality would be an “attraction”, but “true autogynephilia” would be a “fetish”, and where fetishes only lead to desires for actualization within narrow sexual contexts. In addition to “true autogynephilia” they then claim that arousal to fantasies about being feminized can be a manifestation of gender issues through a variety of mechanisms. I obviously doubt these ideas, but this blog post is not for criticizing them.

Rather, it’s for criticizing MTSW’s response. He argues:

In what way are those conceptually different things? You’re describing a.m.a.b. people engaging in what at least superficially seems like the same behavior, jacking off to the same porn and having the same fantasies. For the ones who might consider transitioning, you say that the erotic behavior “may be a manifestation of gender dysphoria” although it’s “unclear […] how exactly the link […] happens.” For the others, it’s not a manifestation of anything in particular. It’s certainly possible that autogynephilic arousal in pre-trans women and non-dysphoric men are two completely different things that happen to involve common elements (much like how MtF transsexuality itself is two completely different things that happen to involve common elements!). But what’s the specific evidence?

The answer to how they are different things is straightforward enough. Ozy’s model seems to go something like this:

[Figure: My guess as to Ozy’s model.]

As you can see, true autogynephilia and sexualized gender dissatisfaction exist as different nodes in the causal graph, and therefore in Ozy’s model they are different things. Easy peasy.

Now, certainly I agree with MTSW that to support this specific model, one needs some evidence, and Ozy doesn’t seem to provide that. However, without more specific evidence, the alternate hypothesis isn’t that sexual feminization fantasies reflect the single unitary concept that I described in the beginning. Instead, the baseline approach could just as well be to consider them invalid: that is, to hold that it is unknown what exactly they reflect.[1] In fact, the main reason I don’t consider them invalid is not a lack of evidence of their invalidity, but that I believe we have other evidence indicating that they are a valid indicator of something akin to autogynephilia proper. Specifically, we have an understanding of how sexuality in general works and what function it serves, which suggests that they would indicate a sexual interest in being a woman.

But this is absolutely not a generally accepted understanding! It seems to me that it is quite common for people to come up with elaborate stories of sexuality as coping mechanisms, reflections of hidden desires, taboos, curiosity, etc. I have not seen any particularly convincing arguments for why these stories should be true. But at the same time I can’t recall seeing any particularly convincing arguments for my preferred alternative hypothesis, that sexuality simply reflects sexual desires. Certainly, MTSW’s blog post doesn’t include them. Rather, these are arguments I have had to construct on my own.

Ozy also dismisses any apparent typology as merely correlational, which MTSW takes issue with:

“May or may not be correlated”?! That’s all you have to say?! Summarizing correlations is the entire point of making a taxonomy. Yes, psychology is complicated and people are individuals; no one is going to fit any clinical-profile stereotype exactly. But if we have studies that find correlations (not with correlation coefficients equal to one, but correlations nonetheless) between sexual orientation, age of transition, childhood femininity, and history of erotic cross-dressing – if, sheerly intuitively and anecdotally with no pretense of rigor, it seems plausible that the Laverne Cox/Janet Mock/Sylvia Rivera cluster of people is a distinct thing from the Julia Serano/Deirdre McCloskey/Caitlyn Jenner cluster of people – is it really that bad for someone to speculate, “Hey, maybe these are actually two and only two different psychological conditions with different etiologies”?

Like, maybe it’s not true. Maybe there’s some other, more detailed and expansive model that makes better predictions. But what is it, specifically? What’s your alternative story?

This response seems to engage in some motte/bailey’ing. It starts out arguing “summarizing correlations is the entire point of making a taxonomy”, but ends up at “maybe these are actually two and only two different psychological conditions with different etiologies”.

Certainly there are some contexts where correlations are useful – e.g. making predictions. But correlations have a lot of problems. There’s no guarantee that they are stable across time, or across contexts. There’s no guarantee that they will persist after applying pressure to them. There’s no guarantee that they tell us anything about reality. And indeed this is probably why it took less than a paragraph for MTSW to switch from talking about correlations to talking about etiologies (a causal concept).

There is an absolutely massive number of potential causal models that can fit any given dataset. It’s probably fair enough to expect people to give an example of some factor, or factors, which could also account for the observations. But at the same time, unless you give a strong reason to believe the model, you can’t expect people to buy it.
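To make this concrete, here is a toy sketch of the general point. It is not a model of anything discussed in this post: the variables x, y and z, the coefficients, and the function names are all made up for illustration. Three different causal structures are tuned to give the same correlation of about 0.5 between x and y, yet they disagree about what happens to y when x is set by intervention.

```python
import math
import random

MODELS = ("x_causes_y", "y_causes_x", "common_cause")

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def simulate(model, n=20000, seed=0, do_x=None):
    """Sample (xs, ys) from a hypothetical causal model.

    If do_x is given, x is forced to that value (an intervention),
    overriding whatever would normally cause x.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        if model == "x_causes_y":
            x = rng.gauss(0, 1) if do_x is None else do_x
            y = 0.5 * x + rng.gauss(0, math.sqrt(0.75))
        elif model == "y_causes_x":
            y = rng.gauss(0, 1)
            natural_x = 0.5 * y + rng.gauss(0, math.sqrt(0.75))
            x = natural_x if do_x is None else do_x
        elif model == "common_cause":
            z = rng.gauss(0, 1)  # unobserved common cause of x and y
            natural_x = math.sqrt(0.5) * z + rng.gauss(0, math.sqrt(0.5))
            x = natural_x if do_x is None else do_x
            y = math.sqrt(0.5) * z + rng.gauss(0, math.sqrt(0.5))
        else:
            raise ValueError(f"unknown model: {model}")
        xs.append(x)
        ys.append(y)
    return xs, ys

for model in MODELS:
    xs, ys = simulate(model)
    _, ys_forced = simulate(model, do_x=2.0)
    mean_y_forced = sum(ys_forced) / len(ys_forced)
    print(f"{model:>12}: corr(x, y) = {pearson(xs, ys):.2f}, "
          f"mean y when x is forced to 2 = {mean_y_forced:.2f}")
```

Observationally the three models are indistinguishable by their correlation alone, but only in the first one does forcing x move y. This is the sense in which a correlation by itself gives no guarantee about what happens once pressure is applied.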

Personally, I find the actual causality important because I am researching how gender issues work, and this is deeply dependent on understanding the causality. If autogynephilia causes gender issues, then to understand gender issues, I can look for moderators or mediators of the effect of autogynephilia on gender issues, and I can meaningfully control for autogynephilia when looking at other potential causes of gender issues. Plus merely identifying autogynephilia as a cause is real progress in understanding. On the other hand, if autogynephilia is caused by gender issues, then identifying autogynephilia seems like only a curiosity, unless it turns out to be useful for some subtle reason. (Identifying repressors? Dubious.)

Others might have other priorities. One priority MTSW has is the prediction that AGPTSs are not going to be as female-typical as HSTSs. On the surface level, this might not seem to rely on causality as much as on correlations. However, as mentioned before, the idea that this is going to be a stable phenomenon does rely on causality.

Even if one has some context where causation isn’t relevant, claims of causation represent burdensome details. Certainly if one is interested in information compression, one can argue that trans women behave as if a certain causal story holds – though one should be careful about making sure that they actually do this; it can be nonobvious what a theory actually predicts, and summarizing the actual phenomena may be more accurate than trying to come up with a corresponding theory.

Finally, MTSW argues:

But here’s the thing: you can’t mislead the general public without thereby also misleading the next generation of trans-spectrum people. So when a mildly gender-dysphoric boy spends ten years assuming that his gender problems can’t possibly be in the same taxon as actual trans women, because the autogynephilia tag seems to fit him perfectly and everyone seems to think that the “Blanchard-Bailey theory of autogynephilia” is “clearly untrue”, he might feel a little bit betrayed when it turns out that it’s not clearly untrue and that the transgender community at large has been systematically lying to him, or, worse, is so systematically delusional that they might as well have been lying. In fact, he might be so upset as to be motivated to start an entire pseudonymous blog dedicated to dismantling your shitty epistemology!

Certainly, one reason that the BBL typology might be useful is that some males who are aroused by the thought of being a woman should transition, and this typology gives them an explanation of why.

But so does Ozy’s proposed theory of autogynephilia sometimes being a manifestation of gender dysphoria, and sometimes being “true autogynephilia”! And this is quite a popular theory that the trans community often gives as advice to those who are questioning.

In theory, perhaps, it might be that BBL can help with gender questioning more effectively. Certainly it seems like it should be able to break down the “am I trans or is it just a fetish?” question. But Ozy’s framework also sort of addresses this, by identifying the question with whether one wants to live as a woman in everyday life outside of sex. Is that a good solution? Probably not; the truth is most likely a better solution. But the choice isn’t between “autogynephilia has nothing to do with transsexuality” and “BBL is true”; there’s a whole universe of alternative options out there.


  1. As a footnote, in order for autogynephilic fantasies to be a valid measure of whether one is an autogynephile, one needs to have a definition of what autogynephilia is. While Blanchard aimed to produce such a definition (“propensity to be sexually aroused by the thought of oneself as a woman”, “love of oneself as a woman”), this definition was too vague to actually be useful. As such, autogynephilia is just reduced to a subjective judgement synthesized from fantasies and arousal patterns. This vagueness does in fact make it hard to test objectively whether someone is autogynephilic or not, which runs into the issue Ozy brought up about self-report-based theories approaching ill-definedness once one starts bringing up lying. Ultimately lying is a thing, but Blanchardians have been ignoring the construct validation of autogynephilia for too long. And we will probably continue to ignore it unless I get it done.