ADMIN
White by Default?

The viral posts were right.

We scraped 5.5 million criminal records and 1.5 million mugshots from 39 states.

29% of Hispanics are being misclassified as White in official Department of Corrections databases.

Even when Hispanic is explicitly classified 🧵
Everyone's seen these collages claiming non-whites get classified as White in criminal databases.

The problem? Anecdotal. Cherry-picked. No way to verify.

We had 1.5 million mugshots, names, and official racial classifications.

Time to test it systematically:
We trained a multinomial logistic regression on 18 features:
• DeepFace racial probabilities from mugshots
• Census name demographics
• First and last name racial statistics

92.76% accuracy distinguishing Black, White and Hispanic.
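
As a rough illustration of what a multinomial (softmax) logistic regression does, here is a minimal sketch in plain Python. It assumes toy two-feature data in place of the 18 features described above; the data, learning rate, epoch count and function names are all hypothetical, and this is not the thread authors' pipeline.

```python
import math
import random

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(weights, x):
    """Class probabilities for one feature vector (bias is the last weight)."""
    xb = x + [1.0]
    return softmax([sum(w * v for w, v in zip(row, xb)) for row in weights])

def fit(X, y, n_classes, lr=0.2, epochs=500):
    """Fit by plain batch gradient descent on the cross-entropy loss."""
    n_features = len(X[0]) + 1
    weights = [[0.0] * n_features for _ in range(n_classes)]
    for _ in range(epochs):
        grads = [[0.0] * n_features for _ in range(n_classes)]
        for x, label in zip(X, y):
            probs = predict_proba(weights, x)
            xb = x + [1.0]
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                for j in range(n_features):
                    grads[k][j] += err * xb[j]
        for k in range(n_classes):
            for j in range(n_features):
                weights[k][j] -= lr * grads[k][j] / len(X)
    return weights

# Hypothetical toy data: three well-separated clusters, one per class.
random.seed(0)
centers = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
X, y = [], []
for label, (cx, cy) in enumerate(centers):
    for _ in range(50):
        X.append([cx + random.gauss(0, 0.5), cy + random.gauss(0, 0.5)])
        y.append(label)

weights = fit(X, y, n_classes=3)
predicted = [max(range(3), key=lambda k: predict_proba(weights, x)[k]) for x in X]
accuracy = sum(p == t for p, t in zip(predicted, y)) / len(y)
print(f"training accuracy: {accuracy:.3f}")
```

The largest probability returned by `predict_proba` plays the role of the "model confidence" discussed in the rest of the thread.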
The key insight: A sufficiently accurate linear model trained on biased data learns the TRUE signal, not the bias.

Systematic deviations between predictions and official labels indicate mislabeling by authorities, not model error.

Here's what we found...
29% of predicted Hispanics were officially classified as White.

Even at 95-100% model confidence, 22.4% of predicted Hispanics were still assigned White.

Median confidence for these cases? 91.7%.
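
The three figures above are simple shares and a median over (predicted label, official label, confidence) triples. A minimal sketch of that bookkeeping, assuming a hypothetical record layout and made-up toy records (the numbers below are not the thread's data):

```python
from statistics import median

def share_labeled_as(records, predicted, official, min_conf=0.0):
    """Share of records predicted as `predicted` (with confidence of at
    least `min_conf`) whose official label is `official`."""
    pool = [r for r in records
            if r["predicted"] == predicted and r["confidence"] >= min_conf]
    if not pool:
        return None
    return sum(r["official"] == official for r in pool) / len(pool)

# Hypothetical toy records, for illustration only.
records = [
    {"predicted": "Hispanic", "official": "Hispanic", "confidence": 0.99},
    {"predicted": "Hispanic", "official": "White", "confidence": 0.97},
    {"predicted": "Hispanic", "official": "Hispanic", "confidence": 0.96},
    {"predicted": "Hispanic", "official": "Hispanic", "confidence": 0.98},
    {"predicted": "Hispanic", "official": "White", "confidence": 0.70},
    {"predicted": "Hispanic", "official": "White", "confidence": 0.60},
    {"predicted": "Hispanic", "official": "Hispanic", "confidence": 0.80},
    {"predicted": "Hispanic", "official": "Hispanic", "confidence": 0.55},
    {"predicted": "White", "official": "White", "confidence": 0.95},
    {"predicted": "Black", "official": "Black", "confidence": 0.99},
]

# Share of all predicted Hispanics officially labeled White.
overall = share_labeled_as(records, "Hispanic", "White")
# Same share, restricted to the 95-100% confidence bin.
high_conf = share_labeled_as(records, "Hispanic", "White", min_conf=0.95)
# Median confidence among the predicted-Hispanic, labeled-White cases.
mismatch_median = median(
    r["confidence"] for r in records
    if r["predicted"] == "Hispanic" and r["official"] == "White"
)
```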
Visual inspection confirmed it.

These are people classified as "White" in official records. Look at those names!
Furthermore, principal component (PC) mapping revealed that many officially "White" records fell in the Hispanic region of the feature space, but not the other way around. Measured by Euclidean distance from the centroids, Whites were just as distinguishable from Hispanics as Blacks were from Whites.
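
A minimal sketch of the centroid comparison, assuming each record has already been reduced to two principal-component scores. The coordinates, group layout and helper names are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math

def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def nearest_group(point, centroids):
    """Name of the group whose centroid is closest to `point`."""
    return min(centroids, key=lambda g: math.dist(point, centroids[g]))

# Hypothetical PC scores, grouped by official label.
groups = {
    "White": [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)],
    "Hispanic": [(3.0, 4.0), (5.0, 4.0), (4.0, 7.0)],
    "Black": [(0.0, 5.0), (2.0, 5.0), (1.0, 8.0)],
}
centroids = {name: centroid(pts) for name, pts in groups.items()}

white_hispanic = math.dist(centroids["White"], centroids["Hispanic"])
white_black = math.dist(centroids["White"], centroids["Black"])

# A record officially labeled White that sits in the Hispanic region.
outlier_zone = nearest_group((4.0, 4.5), centroids)
```

In this toy layout the two centroid distances are equal, which is the kind of pattern the thread describes; counting how many officially "White" points have `nearest_group` equal to "Hispanic", and vice versa, would show whether the overlap runs in one direction only.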
To further confirm the validity of our method through visual inspection, we contrasted low- and high-confidence classifications. In high-confidence misclassifications, the person almost always appeared to be the predicted race rather than the assigned race.
We corrected for misclassification:

Hispanic criminal record rates increase 20-31%
White rates decrease 4-6%
Black rates decrease 1%

The lower bound counts only high-confidence reassignments (>90% confidence); the upper bound assumes every predicted race is the actual race.
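
A minimal sketch of how the two bounds could be computed, assuming the same hypothetical record layout (official label, predicted label, confidence). It works on raw counts; turning counts into rates would also need population denominators, which are not modeled here. The toy records are made up.

```python
from collections import Counter

def corrected_counts(records, threshold=None):
    """Count records per group after replacing the official label with the
    predicted one. With a threshold, only predictions whose confidence
    exceeds it are accepted (lower bound); with threshold=None, every
    prediction is accepted (upper bound)."""
    counts = Counter()
    for r in records:
        accept = threshold is None or r["confidence"] > threshold
        counts[r["predicted"] if accept else r["official"]] += 1
    return counts

def percent_change(before, after, group):
    """Percentage change in a group's count after correction."""
    return 100.0 * (after[group] - before[group]) / before[group]

def rec(official, predicted, confidence):
    return {"official": official, "predicted": predicted, "confidence": confidence}

# Hypothetical toy records, for illustration only.
records = (
    [rec("Hispanic", "Hispanic", 0.95)] * 10
    + [rec("White", "White", 0.95)] * 16
    + [rec("White", "Hispanic", 0.97)] * 2   # high-confidence disagreement
    + [rec("White", "Hispanic", 0.70)] * 2   # low-confidence disagreement
)

official = Counter(r["official"] for r in records)
lower = corrected_counts(records, threshold=0.90)
upper = corrected_counts(records)

hispanic_range = (percent_change(official, lower, "Hispanic"),
                  percent_change(official, upper, "Hispanic"))
white_range = (percent_change(official, lower, "White"),
               percent_change(official, upper, "White"))
```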
State-level analysis showed massive variation.

Florida: 60%+ of Hispanics misclassified as White (Cubans tend to self-id as White?)

But no correlation with political ideology (r = 0.21, p = 0.472).
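
For readers unfamiliar with the statistic, here is a minimal sketch of a Pearson correlation between a per-state ideology score and a per-state misclassification rate. The five data points are hypothetical and are not the thread's state-level values; the p-value itself is left out because it needs a t-distribution CDF, which the standard library does not provide.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical per-state values, for illustration only.
ideology = [1.0, 2.0, 3.0, 4.0, 5.0]
misclassification = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(ideology, misclassification)
t = t_statistic(r, len(ideology))
```

A small r with a large p, as reported above, means the state-level data give no evidence of a linear relationship between the two variables.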

https://threadreaderapp.com/thread/1991295113344225596.html

Replies

  • Several issues. The first is classification by skin color; the other is the political impact of classifying by race. Race and skin color are two different things. There are some very light Black people and some very dark White people, so they are misclassified when skin color is the criterion.

    OBTW - Caucasians are classified as 'white'. Negros are classified as Black. Why aren't American Indians classified as Red? Instead they are classified as White. Why? And why aren't Asians classified as Yellow?
