The score answers one question: taking your body measurements as a set, how do they line up with the male and female populations in a reference dataset? The male population mean sits at −1, the female mean sits at +1, and your score is where you fall between (or beyond) those two anchors.
ANSUR samples active-duty military, which skews young and fit. NHANES samples the general civilian adult population. Pick whichever matches the comparison you want to make.
Under the hood, each dataset gives the tool a multivariate normal for each sex: mean vectors μm and μf, and covariance matrices that encode how measurements move together (taller people tend to have wider shoulders, etc.).
The score comes from Fisher’s Linear Discriminant Analysis (LDA). LDA finds the single axis through measurement-space along which the two groups pull apart the most: it maximizes the gap between the group means while shrinking the spread within each group. With a pooled covariance matrix Σ and the two mean vectors, the weight vector is
w = Σ⁻¹ (μf − μm)
Your raw score is the dot product w·x of that weight vector with your measurement vector x. Rescaling linearly so that w·μm ↦ −1 and w·μf ↦ +1 produces the number you see.
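The weight vector and the rescaling can be sketched in a few lines of NumPy. This is a minimal illustration, not the tool's actual code: the means, the pooled covariance, and the function names `lda_weights` and `rescaled_score` are all made up for the example and do not come from ANSUR or NHANES.

```python
import numpy as np

# Made-up illustration values, NOT taken from ANSUR or NHANES.
# Measurement order: [stature_cm, shoulder_width_cm]
mu_m = np.array([176.0, 41.0])
mu_f = np.array([163.0, 36.0])
sigma = np.array([[45.0, 8.0],
                  [8.0, 4.0]])  # assumed pooled covariance


def lda_weights(sigma, mu_m, mu_f):
    """Fisher direction w = Sigma^-1 (mu_f - mu_m)."""
    # Solving the linear system avoids forming the explicit inverse.
    return np.linalg.solve(sigma, mu_f - mu_m)


def rescaled_score(x, w, mu_m, mu_f):
    """Linear rescale: the male mean maps to -1, the female mean to +1."""
    raw, raw_m, raw_f = w @ x, w @ mu_m, w @ mu_f
    return -1.0 + 2.0 * (raw - raw_m) / (raw_f - raw_m)


w = lda_weights(sigma, mu_m, mu_f)
x = np.array([170.0, 39.0])  # hypothetical person
score = rescaled_score(x, w, mu_m, mu_f)
print(f"weights = {w}, score = {score:+.3f}")
```

Feeding either mean vector back in returns exactly −1 or +1, and the point halfway between the means scores 0.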
A single measurement like height throws away almost everything else you could know. A plain average of several measurements double-counts whatever is correlated: stature and sitting height vote twice even though they’re almost the same thing.
LDA uses the covariance matrix to cancel that redundancy. Measurements that genuinely separate the two populations get more weight; measurements that are noisy or already implied by what you’ve entered get less. Under the assumption that both groups are multivariate normal with a shared covariance, which the reference data roughly satisfies, it’s the provably optimal linear classifier, not a heuristic.
Pinning the male mean at −1 and the female mean at +1 also gives the number a fixed geometric meaning. 0 is halfway from the male mean to the female mean along the best-separating axis. +0.5 is halfway from that midpoint to the female mean. +1.5 is that same distance again past the female mean. Because the scale is fixed, you can compare scores directly.
Because LDA is linear, your overall score decomposes additively. Each measurement contributes
contributionᵢ = wᵢ × (valueᵢ − midpointᵢ)
where the weights are the rescaled ones (scaled so the two means land on −1 and +1) and the midpoint is halfway between the male and female means for that measurement. These contributions sum exactly to your overall score, so the Score column in the table tells you which measurements are pulling you in which direction and by how much, with no approximation.
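A short sketch of that decomposition, in plain Python, shows why the sum is exact. The measurement names, means, and unscaled weights below are hypothetical; the only assumption about the method is that the weights are scaled by 2 / (w·(μf − μm)), which is what pins the means at −1 and +1.

```python
# Made-up numbers for illustration; not from ANSUR or NHANES.
names = ["stature", "shoulder_width", "hip_width"]
mu_m = [176.0, 41.0, 34.0]
mu_f = [163.0, 36.0, 35.0]
w_raw = [-0.05, -0.9, 0.6]  # hypothetical unscaled LDA weights

# Scale the weights so the male mean scores -1 and the female mean +1.
gap = sum(w * (f - m) for w, f, m in zip(w_raw, mu_f, mu_m))
w = [2.0 * wi / gap for wi in w_raw]

# Per-measurement midpoint between the two population means.
midpoint = [(f + m) / 2.0 for f, m in zip(mu_f, mu_m)]


def contributions(x):
    """Per-measurement contribution: w_i * (value_i - midpoint_i)."""
    return {n: wi * (xi - mi) for n, wi, xi, mi in zip(names, w, x, midpoint)}


person = [170.0, 39.0, 35.5]  # hypothetical measurements
parts = contributions(person)
score = sum(parts.values())
for name, value in parts.items():
    print(f"{name:>15}: {value:+.3f}")
print(f"{'total':>15}: {score:+.3f}")
```

Negative contributions pull toward the male anchor and positive ones toward the female anchor; the total equals the rescaled score computed directly from w·x.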
Setting a measurement to “control” mode holds that value fixed and asks: among people with this height (or whatever you controlled), how do my other measurements compare between sexes?
Mathematically this swaps the unconditional distributions for the conditional multivariate normal: new mean and covariance for the scored measurements given the controlled ones, then LDA re-runs on those conditional distributions. The weight vector w adapts to the slice you’ve fixed.
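The conditioning step uses the standard formulas for a partitioned multivariate normal. The sketch below assumes a single pooled covariance shared by both sexes (so the conditional covariance is the same for each) and uses made-up numbers; `condition` and `conditional_score` are hypothetical names, not the tool's API.

```python
import numpy as np

# Made-up illustration values, not taken from ANSUR or NHANES.
# Measurement order: [stature_cm, shoulder_width_cm, hip_width_cm]
mu_m = np.array([176.0, 41.0, 34.0])
mu_f = np.array([163.0, 36.0, 35.0])
sigma = np.array([[45.0, 8.0, 5.0],
                  [8.0, 4.0, 1.5],
                  [5.0, 1.5, 6.0]])  # assumed pooled covariance


def condition(mu, sigma, scored, controlled, x_controlled):
    """Mean and covariance of the scored block given the controlled values."""
    s, c = np.asarray(scored), np.asarray(controlled)
    sigma_sc = sigma[np.ix_(s, c)]
    sigma_cc = sigma[np.ix_(c, c)]
    # gain = Sigma_sc Sigma_cc^-1, computed with a solve instead of an inverse
    gain = np.linalg.solve(sigma_cc, sigma_sc.T).T
    mu_cond = mu[s] + gain @ (x_controlled - mu[c])
    sigma_cond = sigma[np.ix_(s, s)] - gain @ sigma_sc.T
    return mu_cond, sigma_cond


def conditional_score(x, scored, controlled):
    """Re-run LDA on the conditional distributions and rescale to -1 / +1."""
    x = np.asarray(x, dtype=float)
    mu_m_c, sigma_c = condition(mu_m, sigma, scored, controlled, x[controlled])
    mu_f_c, _ = condition(mu_f, sigma, scored, controlled, x[controlled])
    w = np.linalg.solve(sigma_c, mu_f_c - mu_m_c)
    raw, raw_m, raw_f = w @ x[scored], w @ mu_m_c, w @ mu_f_c
    return -1.0 + 2.0 * (raw - raw_m) / (raw_f - raw_m)


# Score shoulder and hip width while controlling for stature (index 0).
score = conditional_score([170.0, 39.0, 35.5], scored=[1, 2], controlled=[0])
print(f"conditional score = {score:+.3f}")
```

Note that the conditional means shift with the controlled value: the anchors at −1 and +1 now describe the typical man and woman of that stature, not of the whole population.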
This matters because shoulder width means something different on a 160 cm frame than on a 190 cm one. Without controlling for stature, a tall person’s score picks up a chunk of their height through every measurement that correlates with height. Control for it and that inflation drops out, leaving the part of the score that’s actually about the measurements you’re scoring.
The chart shows where everyone in the active dataset falls on the same scoring axis as you. The blue and pink curves are kernel density estimates of the male and female distributions on that axis; the dashed white line is your position. When controls are active, the chart visualizes the conditional distributions the score is computed against, so the plot stays consistent with the number.
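For readers unfamiliar with kernel density estimates, here is a rough sketch of how such curves can be computed from scores on the axis. The samples are synthetic, the Gaussian kernel and the normal-reference bandwidth rule are assumptions (the tool may use a different kernel or bandwidth), and `gaussian_kde_1d` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for the projected scores of each group.
male_scores = rng.normal(-1.0, 0.6, size=500)
female_scores = rng.normal(1.0, 0.6, size=500)


def gaussian_kde_1d(samples, grid):
    """Average of Gaussian bumps centred on each sample."""
    n = len(samples)
    # Normal-reference rule of thumb for the bandwidth (an assumption here).
    h = 1.06 * samples.std(ddof=1) * n ** (-0.2)
    z = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))


grid = np.linspace(-5.0, 5.0, 1001)
male_density = gaussian_kde_1d(male_scores, grid)
female_density = gaussian_kde_1d(female_scores, grid)

your_score = 0.4  # hypothetical position of the dashed line
print(f"male density at your score:   {np.interp(your_score, grid, male_density):.3f}")
print(f"female density at your score: {np.interp(your_score, grid, female_density):.3f}")
```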
A low separability warning appears when the AUC (area under the ROC curve) of the scored measurements under the active controls falls below 0.75. That’s a signal that the measurements you chose don’t separate the groups well. Add more scored measurements, or drop a control, to strengthen the discriminator.
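One way to read the AUC: it is the probability that a randomly chosen female score is higher than a randomly chosen male score. The sketch below estimates it by comparing every pair, which is an assumption about the method (the tool may compute it analytically instead); the scores are made up and only the 0.75 threshold comes from the text above.

```python
LOW_SEPARABILITY_THRESHOLD = 0.75  # from the warning rule described above


def auc(male_scores, female_scores):
    """Share of (female, male) pairs where the female score is higher.

    Ties count as half a win, the usual convention for a rank-based AUC.
    """
    wins = 0.0
    for f in female_scores:
        for m in male_scores:
            if f > m:
                wins += 1.0
            elif f == m:
                wins += 0.5
    return wins / (len(female_scores) * len(male_scores))


# Made-up scores on the discriminant axis.
male = [-1.8, -1.1, -0.9, -0.2, 0.3]
female = [-0.4, 0.6, 0.9, 1.2, 1.7]

value = auc(male, female)
warn = value < LOW_SEPARABILITY_THRESHOLD
print(f"AUC = {value:.2f}, low-separability warning: {warn}")
```

An AUC of 0.5 means the scores carry no information about group membership, and 1.0 means the two groups do not overlap at all on the axis.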