The mathematical consequences of a toy model of gender transition

Alternative title: “true transsexuals” as a statistical artifact.

Consider the following ultra-simplified model of gender dysphoria, inspired heavily by Blanchard’s typology:


Assume further that people transition once they exceed some threshold of gender dysphoria. This model definitely doesn’t contain everything (e.g. it’s missing socioeconomic status, in reality there likely is a nonlinear homosexuality x femininity effect, …), but it may serve as a nice toy model. We can simulate the model in Python:

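import numpy as np
import random

# SAMPLE_SIZE is not given in the post; this value is an assumed placeholder.
SAMPLE_SIZE = 10000  # number of simulated people who end up transitioning
# Population multiplier: one transitioner per TRANS_RATE natal males (i.e. 0.5%).
TRANS_RATE = 200

def generate_person():
	# Unmodelled, polycausal contributors to gender dysphoria
	gd_noise = np.random.normal(0, 1)
	gender_dysphoria = gd_noise
	homosexual = random.random() < 0.04
	# Autogynephilia is assumed rarer among homosexual males
	agp_rate = 0.15 if not homosexual else 0.03
	agp = None
	gd_agp = 0.0
	if random.random() < agp_rate:
		agp = np.random.normal(0, 1)
		gd_agp = 1.4 + 0.3 * agp
		gender_dysphoria += gd_agp
	# Masculinity/femininity, shifted upwards for homosexual males
	mean_mf = 0 if not homosexual else 2
	mf = np.random.normal(mean_mf, 1)
	gd_mf = 0.5 * mf
	gender_dysphoria += gd_mf
	return gender_dysphoria, homosexual, agp, mf, gd_noise, gd_agp, gd_mf

samples = [generate_person() for _ in range(SAMPLE_SIZE * TRANS_RATE)]
# The most dysphoric SAMPLE_SIZE people transition; everyone else stays cis
samples.sort(key=lambda person: person[0])
trans = samples[-SAMPLE_SIZE:]
cis = samples[:-SAMPLE_SIZE]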

The specific numeric parameters of this model are vaguely inspired by reality, but I changed most of them around a bit compared to my beliefs about what they were in order to make the resulting distribution of MtFs more like what we observe in various studies. Some assumptions of this model may be disputed; for instance, I believe that meta-attraction cannot account for all autogynephiles’ interest in men, and so some gay men are autogynephilic, but some people disagree with that. Generally, the point of this post isn’t that these specific parameters are necessarily right, but rather, to investigate some qualitative consequences of models with the general structure of the first diagram.
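For reference, the structure of the simulation can also be written as equations. This is read directly off the code above; the symbols (H, A, a, m, ε, D, τ) are shorthand introduced here and are not notation from any of the studies.

```latex
\begin{align*}
H &\sim \mathrm{Bernoulli}(0.04) && \text{homosexual} \\
A \mid H &\sim \mathrm{Bernoulli}(p_H), \quad p_1 = 0.03,\ p_0 = 0.15 && \text{autogynephilic} \\
a,\ \varepsilon &\sim \mathcal{N}(0, 1) && \text{AGP intensity, noise} \\
m \mid H &\sim \mathcal{N}(2H,\ 1) && \text{femininity} \\
D &= \varepsilon + A\,(1.4 + 0.3\,a) + 0.5\,m && \text{gender dysphoria} \\
\text{transition} &\iff D > \tau
\end{align*}
```

Here τ is not a fixed number but whatever cutoff makes 0.5% of the simulated population exceed it, which is what sorting and taking the top slice does in the code.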

First, let’s get the standard sexual orientation vs autogynephilia numbers. They can be computed as follows:

def fmtpcnt(rate):
	return str(round(100*rate)) + "%"
# Tuple layout: p[1] is the homosexual flag, p[2] is the AGP score (None if not AGP)
n_hs = len([p for p in trans if p[1]])
print("Homosexual rate: " + fmtpcnt(n_hs/SAMPLE_SIZE))
print(" - AGP rate among HSTS's: " + fmtpcnt(len([p for p in trans if p[1] and p[2] is not None])/n_hs))
print(" - AGP rate among non-HS TS's: " + fmtpcnt(len([p for p in trans if not p[1] and p[2] is not None])/(SAMPLE_SIZE - n_hs)))

These were the main things I used for tuning the parameters of the model to match studies of trans women:

Homosexual rate: 10%
 - AGP rate among HSTS's: 26%
 - AGP rate among non-HS TS's: 89%

They don’t entirely match the rates that studies find, because it turns out to be hard to tune the model precisely while also preserving realism of the parameters. However, they’re arguably “good enough”. Note that due to the assumptions of the model, there’s no “misreporting” here; we know exactly how the data is generated, and this is based on the internal data in the model.

However, something interesting happens when we consider the amount of femininity by sexual orientation:

def fmtd(diff):
	return str(round(diff, 2))
std_mf = np.std([p[3] for p in samples])
print("Femininity among HSTS's: " + fmtd(np.average([p[3] for p in trans if p[1]])/std_mf))
print("Femininity among non-HS TS's: " + fmtd(np.average([p[3] for p in trans if not p[1]])/std_mf))

# Results:
#   Femininity among HSTS's: 2.81
#   Femininity among non-HS TS's: 0.93

While HSTS’s are noticeably more feminine than non-HS TS’s, and non-HS TS’s are arguably more masculine than they are feminine, even non-HS TS’s are quite behaviorally feminized compared to cisgender men.

This is quite a curious phenomenon, but it makes a lot of sense from a statistical standpoint. As most autogynephiles don’t transition, there is a strong selection effect among those who do transition to have traits that predispose them to additional dysphoria. This selection effect could select for even more autogynephilia, but it could also select for other traits, such as femininity.
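The selection effect can be seen in isolation with a stripped-down sketch. This is not the model above: it assumes just two independent standard-normal traits (the hypothetical names `x` and `y`) that add up to a total, and keeps the top 0.5% by total.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Two independent traits (hypothetical stand-ins for any two contributors)
x = rng.normal(0, 1, n)
y = rng.normal(0, 1, n)
total = x + y

# Keep only the top 0.5% by total, mimicking a transition threshold
cutoff = np.quantile(total, 0.995)
selected = total > cutoff

mean_x_all = x.mean()
mean_y_all = y.mean()
mean_x_sel = x[selected].mean()
mean_y_sel = y[selected].mean()

print("Mean of x overall / among selected:", round(mean_x_all, 2), round(mean_x_sel, 2))
print("Mean of y overall / among selected:", round(mean_y_all, 2), round(mean_y_sel, 2))
```

Even though `x` and `y` are unrelated in the population, both come out strongly elevated in the selected group, because anything that adds to the total helps a person get over the cutoff.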

In fact, it gets even more subtle – this model predicts that trans women have a magical gender identity, despite not even containing a term for that! More seriously, it is not just autogynephilia and femininity that will be selected upwards, but also the “noise term” representing other massively-polycausal factors that are not modelled (whether those be autism, neuroticism, personal aesthetic tastes, traumas, …).

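# Normalise by the noise term's own spread (index 4 is gd_noise).
# The figures below came from dividing by std_mf instead, which is about 1.07,
# so this version prints slightly higher values.
std_mgi = np.std([p[4] for p in samples])
print("Magical gender identity among HSTS's: " + fmtd(np.average([p[4] for p in trans if p[1]])/std_mgi))
print("Magical gender identity among non-HS TS's: " + fmtd(np.average([p[4] for p in trans if not p[1]])/std_mgi))

# Results:
#   Magical gender identity among HSTS's: 1.99
#   Magical gender identity among non-HS TS's: 2.03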

Thus, according to this model, trans women have their gender issues increased by two standard deviations caused by essentially-opaque factors that are not included in the model. From the inside, this likely feels like having an innate inexplicable gender identity that cannot simply be reduced to autogynephilia and masculinity/femininity. Indeed, if such a thing existed, then it would get “lumped in” with the noise term.

There’s also another way to think about this. Why do people transition? Can we list some different types of reasons? In order to address this, we first need to consider what we mean by “why”. Probably the most elegant definition is to say that people transition because of something if, without that thing, they would not have transitioned.

Since we have 3 different proximal causes of gender issues (autogynephilia, femininity, and noise), we have 2^3 = 8 different options for whether each of these three causes is what made the individuals in question transition. To organize them, I will use the letter ‘A’ to represent autogynephilia, ‘F’ to represent femininity, and ‘I’ to represent the noise term. Uppercase means that the individuals in question transitioned because of that cause, and lowercase means that they did not. Thus, for example, ‘AfI’ represents ‘classical autogynephilic transsexuals’, who do not transition because they are feminine, but do transition because of autogynephilia and because of other contributing factors.
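For a single person, the labelling rule can be sketched as follows. The function name `transition_reasons` and the example numbers are made up for illustration; only the rule itself (remove one contribution and see whether the total falls below the threshold) comes from the definition above.

```python
def transition_reasons(gd_agp, gd_mf, gd_noise, threshold):
    """Return a label like 'AfI', or None if the person does not transition."""
    total = gd_agp + gd_mf + gd_noise
    if total < threshold:
        return None

    def needed(contribution):
        # A cause counts if, without it, the person would be below threshold
        return total - contribution < threshold

    return (
        ("A" if needed(gd_agp) else "a")
        + ("F" if needed(gd_mf) else "f")
        + ("I" if needed(gd_noise) else "i")
    )


# Hypothetical person: total 3.3 against a threshold of 3.0.
# Removing AGP (1.5) or noise (1.6) drops them below 3.0; removing femininity (0.2) does not.
print(transition_reasons(gd_agp=1.5, gd_mf=0.2, gd_noise=1.6, threshold=3.0))  # AfI
```

Note that several causes can each be individually necessary at once, which is why labels with more than one uppercase letter are possible.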

First, the code to compute the distribution:

print("    Non-HS  HS")
threshold = (trans[0][0] + cis[-1][0])/2
for needs_agp, label_agp in [(True, "A"), (False, "a")]:
	for needs_gnc, label_gnc in [(True, "F"), (False, "f")]:
		for needs_mgi, label_mgi in [(True, "I"), (False, "i")]:
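			def check(person):
				# A cause "made" someone transition if removing its contribution
				# (index 5: gd_agp, 6: gd_mf, 4: gd_noise) puts them below the threshold.
				ok_agp = (person[0] - person[5] < threshold) == needs_agp
				ok_gnc = (person[0] - person[6] < threshold) == needs_gnc
				ok_mgi = (person[0] - person[4] < threshold) == needs_mgi
				return ok_agp and ok_gnc and ok_mgi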
			rate_nonhs = len([p for p in trans if not p[1] and check(p)])/(SAMPLE_SIZE - n_hs)
			rate_hs = len([p for p in trans if p[1] and check(p)])/n_hs
			label = label_agp + label_gnc + label_mgi
			print(label + ": " + "{0:>3}".format(fmtpcnt(rate_nonhs)) + "   " + "{0:>3}".format(fmtpcnt(rate_hs)))

It yields the following results:

    Non-HS  HS
AFI: 45%   22%
AFi:  0%    3%
AfI: 42%    0%
Afi:  0%    0%
aFI: 10%   74%
aFi:  0%    0%
afI:  3%    1%
afi:  0%    0%

Here, there are three types that have non-negligible probability: ‘AFI’, representing those who transition for “all the reasons”, ‘AfI’, representing those who transition because of autogynephilia and other predisposing factors, but not femininity (which could be thought of as “classical autogynephilic transsexuals”), and ‘aFI’, representing those who transition because of femininity and other predisposing factors. The rates of these vary by sexual orientation, with the former two making up the majority of non-HS TSs, and the last one making up the majority of HSTSs.

If I modify the code to also show the degree of femininity in each type, then among non-HS TSs, the ‘AFI’ group is much more feminine (1.33) than ‘AfI’ (0.32). Thus, this model implies that there is a distinct subgroup of autogynephilic transsexuals who would not have transitioned if not for their femininity, and who are much more feminine than the classical group where preexisting femininity did not play a role in their transition.
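One way such a modification could look is sketched below. It is a self-contained, vectorised re-implementation of the same toy model rather than the exact code used for the numbers above, so the helper names (`simulate`, `femininity_by_type`), the population size, and the use of a quantile as the threshold are all assumptions, and the output will differ somewhat from run to run and from the figures quoted.

```python
import numpy as np


def simulate(n, rng):
    """Draw n people from the toy model; returns arrays of components."""
    homosexual = rng.random(n) < 0.04
    has_agp = rng.random(n) < np.where(homosexual, 0.03, 0.15)
    gd_agp = np.where(has_agp, 1.4 + 0.3 * rng.normal(0, 1, n), 0.0)
    mf = rng.normal(np.where(homosexual, 2.0, 0.0), 1.0)
    gd_mf = 0.5 * mf
    gd_noise = rng.normal(0, 1, n)
    total = gd_noise + gd_agp + gd_mf
    return total, homosexual, mf, gd_agp, gd_mf, gd_noise


def femininity_by_type(n=400_000, trans_fraction=0.005, seed=0):
    """Map each type label to (count, mean femininity in SDs) among non-HS TSs."""
    rng = np.random.default_rng(seed)
    total, homosexual, mf, gd_agp, gd_mf, gd_noise = simulate(n, rng)
    threshold = np.quantile(total, 1 - trans_fraction)
    trans = total > threshold
    std_mf = mf.std()
    # A cause is "needed" if removing its contribution drops the person below threshold
    needs_a = total - gd_agp < threshold
    needs_f = total - gd_mf < threshold
    needs_i = total - gd_noise < threshold
    result = {}
    for a in (True, False):
        for f in (True, False):
            for i in (True, False):
                label = ("A" if a else "a") + ("F" if f else "f") + ("I" if i else "i")
                mask = trans & ~homosexual & (needs_a == a) & (needs_f == f) & (needs_i == i)
                if mask.any():
                    result[label] = (int(mask.sum()), float(mf[mask].mean() / std_mf))
    return result


for label, (count, femininity) in femininity_by_type().items():
    print(label, count, round(femininity, 2))
```

Types with no members are simply omitted from the result, since a mean over an empty group is undefined.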


This is a made-up model. As such, it does not have a direct relationship to reality. However, it illustrates some natural consequences of a wide class of models of gender issues, namely that even if autogynephilia is not linked with femininity, it is very possible for autogynephilic transsexuality to be.

One of the parameters of the model was that 0.5% of natal males transition. By some estimates, that’s about right, but by other estimates, that’s wayyy too high. I originally set it to lower numbers, but one consequence of this is that the selection effects get stronger, which leads to high autogynephilia rates among HSTSs. Roughly speaking, the transition rate is going to determine the selection bias, and therefore the degree to which people are going to transition for “all the reasons”, versus for specific reasons. As such, if the transition rate I’ve entered into the model is too high, this only strengthens the fundamental point I’m making about selection effects.
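To get a feel for how the transition rate drives the strength of selection, here is a rough sketch. It assumes, as a simplification, that total dysphoria is a single standard normal variable (in the model it is really a mixture), and the function name `selection_strength` is made up for this example.

```python
from statistics import NormalDist


def selection_strength(transition_rate):
    """Cutoff and mean of the selected tail, both in SDs, for a standard normal."""
    std_normal = NormalDist()
    cutoff = std_normal.inv_cdf(1 - transition_rate)
    # Mean of a standard normal conditional on exceeding the cutoff
    mean_above = std_normal.pdf(cutoff) / transition_rate
    return cutoff, mean_above


for rate in (0.005, 0.001, 0.0001):
    cutoff, mean_above = selection_strength(rate)
    print(f"rate {rate:.2%}: cutoff {cutoff:.2f} SD, mean of selected {mean_above:.2f} SD")
```

Lower transition rates push both the cutoff and the average of those above it further out, which is the sense in which a lower rate means stronger selection.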

Typically, Blanchardians seem to portray AGPTSs as not being behaviorally feminized. This doesn’t seem to be justified by any studies (but is instead contradicted by all the studies I’ve seen), and as I’ve shown here, even within a Blanchardian framework it can be hard to make this work. It’s not impossible, of course, in that one could connect a number of nonlinear effects to cancel things out, but I have not seen any reasons to believe this to be the case.

This also gives a plausible explanation for how AGPTSs can end up feeling that the typology does not capture their experiences very well. According to the simulations, very few would have their gender issues solely originate in autogynephilia (the ‘Afi’ case), but would instead have many other contributors too, with many having femininity as a significant contributor.

It’s still conceivable that the classically-presented typology would be true, I guess, and that trans women split neatly into a group that is androphilic and behaviorally feminized, and a group that is not androphilic and not behaviorally feminized. However, I’d really like to know why we would go with that model, rather than a subtler one like the one above. And I don’t think “parsimony” is anywhere close to being a sufficient explanation for this, as logically speaking it’s more important that the dynamics that generate the data are parsimonious than that the final distribution of the data is.


