Who Watches the Watchmen? What We Don’t Know About AI Training Should Worry Us

He’s correct. And the reflex he’s describing needs to go further upstream.

We spend a lot of time talking about whether we can trust what AI says. We spend almost no time asking what went into it. What data was used to train the model. Who defined “good” and “bad.” Who, if anyone, is watching the people and companies making those decisions. What biases were baked into the system before it ever generated its first answer.

Because right now, the answer to “who watches the watchmen?” is mostly… nobody. The evidence for that lack, not just of oversight but of understanding, is growing. And we need to talk about it.

The web we built is the training set we got

The humans-in-the-loop

Bias is more than an LLM issue; it’s an Internet issue

Seven companies, seven philosophies

The companies building the major AI systems are taking radically different approaches to governance, training, and accountability. These differences matter, because for many folks the new default is AI over traditional search. And that default behaviour will likely only become more entrenched.

Here’s what we know, and don’t, about each.

OpenAI: Scale at any cost

Google: A drag path of bias

Meta: Open and shut

DeepSeek: All we own we owe (the Party)

Mistral: The ghost of the machine

xAI/Grok: Chaotic evil

Anthropic: The imperfect watchman

Transparency is going backwards

No hands here seem to be without blood. In the cutting-edge world of AI, everyone is carrying a knife.

The Foundation Model Transparency Index (FMTI) shows transparency declining, not improving: the average score dropped from 58 in 2024 to 40 in 2025. Training data is the single most closed-door category across all companies.

The rankings have flipped. Meta and OpenAI ranked first and second in 2023; they now sit last and second to last. Anthropic dominates on disclosure.

Dominating a low bar still leaves the bar low. This is not a game of limbo at 3am with the rest of the wedding party and your favourite drunk uncle. This is, in no uncertain terms, the future of society, being debated and decided in real time by white men with too much money and too much time to think about what they’ll do with it.

The snake eating its tail

Why this matters if you work in marketing

What watching the watchmen actually looks like

There’s no clean, perfect, Good Place answer here. Or maybe that is the Good Place answer: there is no clean solution. There are still things we can and should be doing to understand, to watch, to know. To intercede where we can, and to build awareness where we can’t.

  • Pay attention to the Stanford/Princeton FMTI. It’s one of the few public ways to hold these companies to any standard of disclosure.
  • Read the system cards and constitutions for models when they’re published.
  • Know which model powers which search surface. ChatGPT’s web search draws on Bing; Gemini draws on Google Search. The LLM inherits the biases of the underlying search engine.
  • Advocate for workforce disclosure alongside data disclosure. The FMTI doesn’t yet include explicit indicators for workforce composition, pay equity, or psychological support for data labellers; it lists only a generic “data labourer practices” indicator. It should go further.
  • Understand the impact of the corporate structures behind the tools you’re building strategies around. A company heading toward an IPO at $852 billion has different incentives than one that just got labelled a national security threat for maintaining safety guardrails. Both have different incentives than one legally required to comply with Party directives.

Joost’s Descartes reflex — stop, take apart, verify — is exactly the right instinct. We now need to aim it further upstream than the chat window.

The models are only as good as the data they were trained on, the people who labelled it, and the companies that decided how much to tell us about both. Right now, the answer to all three is: not enough. Turn your cameras on. Watch the watchmen.

At the time of publication (23 Apr 2026), all companies named had declined to comment (Meta, Anthropic, xAI, Mistral, DeepSeek, OpenAI, and Google). Claude was used to partially draft and source this article, with consequential and major edits made by the author. All sources were manually validated by the author. The heading image was generated by Claude Design at the time of publication.