Joost de Valk recently wrote about what he calls the “Descartes reflex” — the practiced discipline to stop, take apart, and verify before accepting something as true just because it looks true. He was talking specifically about AI output: the polished, confident, supremely articulate answers that arrive dressed in the same high street suit whether they’re right or wrong.
He’s correct. And the reflex he’s describing needs to go further upstream.
We spend a lot of time talking about whether we can trust what AI says. We spend almost no time asking what went into it. What data was used to train the model. Who defined “good” and “bad.” Who, if anyone, is watching the people and companies making those decisions. The biases that were baked into the system before it ever generated its first answer.
Because right now, the answer to who watches the watchmen? is mostly…nobody. The evidence for that lack — not just of oversight, but of understanding — is growing. And we need to talk about it.
The web we built is the training set we got
Bot traffic is about a third of all Internet traffic, and building out new training data drives most LLM bot traffic. The open web is being ingested and redeployed to build more, at alarming speed and industrial scale.
And, largely, we can’t control whether it happens to our bits of the internet or not. Most LLM bots make it difficult to tell whether they crawled your website. Anthropic has no IP verification and no WebBotAuth support. OpenAI’s WebBotAuth support is “in progress.” Google doesn’t distinguish bots by purpose. No standards are enforceable, or enforced, yet in this Wild Wild West corner of the digital world, barely five years old at the time of writing (roughly 2021 to 2026).
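If you want a rough sense of whether these crawlers are hitting your site today, the best you can realistically do is scan your access logs for the user-agent tokens the vendors publish. A minimal sketch (the token list is illustrative and non-exhaustive, the log path is an assumption, and, without IP or WebBotAuth verification, any of these strings can be spoofed):

```python
from collections import Counter

# Illustrative AI-crawler user-agent tokens. Vendors publish these strings,
# but nothing stops a scraper from spoofing them, which is exactly the
# verification gap described above. Google's training crawl arrives as
# plain Googlebot, so it won't show up separately here.
AI_BOT_TOKENS = ["GPTBot", "ClaudeBot", "CCBot", "PerplexityBot"]

def count_ai_crawler_hits(access_log_path: str) -> Counter:
    """Tally hits per AI crawler in a standard web server access log."""
    hits: Counter = Counter()
    with open(access_log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            for token in AI_BOT_TOKENS:
                if token in line:
                    hits[token] += 1
    return hits

if __name__ == "__main__":
    # Hypothetical log location; point this at your own server's logs.
    print(count_ai_crawler_hits("/var/log/nginx/access.log"))
```

It tells you who claims to have visited, not who actually did. That gap is the point.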
To add to that, updating training data for a new model is expensive. Massively, mind-bendingly expensive. OpenAI has been doing it at regular intervals since the plague years. When those training crawls are done, they’re frozen in time until the business can afford that cash outlay again.
OpenAI’s first commercially available model, GPT-3.5, had training data with a knowledge cutoff of September 2021. Everything the web was — the SEO content farms, the decade of informational articles written to game rankings rather than serve their audience — that’s all in the training data from the start, and continues to be a part of its basic infrastructure. We were unaware this was happening. We couldn’t tidy the small and not-so-small messes we made of the Internet over the years. We were caught with our pants down and OpenAI took all of what we built, good and bad. They swallowed it whole, and served it back to us as a product we can use to build that more we seem to crave.
The humans-in-the-loop
The raw web data is only the first layer. What turns a statistical language model into something that sounds helpful, safe, and trustworthy is a multi-stage process that relies heavily on human labour.
First, data labelling and annotation. People tag raw data (images, text, video) to teach the model to classify things the way a human in modern society would. This is massive-scale foundational work that starts with something deceptively simple: drawing boxes around objects in images and noting things like “This is a car. This is not harmful.”
Second, supervised fine-tuning. Folks create example conversations showing the “right” way to respond. The model learns to imitate.
Third, reinforcement learning from human feedback (RLHF). People look at pairs of model responses and choose which one is “better.” The model learns from those choices. This is how it fully conceptualises what “helpful” means. What “harmful” is. What “good” looks like. What “bad” is. What the model’s own ethics are, or should be.
Some models have variations in their training, but those three steps are the general theme. The people making these reinforcement, labelling and fine-tuning decisions are, quite literally, encoding their values into the model. Their sense of what’s appropriate, what’s offensive, what’s a logical, good answer. The model doesn’t learn ethics from first principles. It learns them from the preferences of its human trainers.
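To make it concrete how directly those preferences become the training signal, here’s a minimal sketch of the pairwise (Bradley-Terry) reward-model update at the heart of RLHF. The embeddings and dimensions are toy stand-ins; in production they come from the language model itself:

```python
import torch
import torch.nn.functional as F

# Toy reward model: maps a response representation to a scalar score.
# In production this head sits on top of the language model.
reward_model = torch.nn.Linear(16, 1)
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Stand-ins for representations of paired responses to the same prompts.
chosen = torch.randn(8, 16)    # the responses annotators preferred
rejected = torch.randn(8, 16)  # the responses annotators rejected

# Bradley-Terry pairwise loss: push the preferred response's score above
# the rejected one's. The annotator's choice is the only label there is;
# whatever "better" meant to that person is what gets optimised.
loss = -F.logsigmoid(reward_model(chosen) - reward_model(rejected)).mean()

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Notice there is no ground truth anywhere in that loop. The annotator’s preference is the ground truth.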
So who are these people?
Largely, they’re in the Global South, doing outsourced labour on behalf of swathes of white-collar knowledge workers in the US and Europe. This outsourcing is the analogue, brain-drain manifestation of bitcoin mining, and the next generation of sweatshop labour. This is the group who decides what content is good or bad.
Not the (probably) young, (maybe) gay, (likely) Asian, tech-nerd heads of AI at the big name companies, for all they may say it’s their ethics embedded in the model, their choices, their values — or those of their company.
Throughout this article we’ll see conditions that would make anyone question whether the folks doing the labelling and reinforcing the learning are actually able to make sound, healthy, or informed decisions on what’s good and what’s not.
This isn’t a minor operational detail. If you’re building a system to reflect human judgment, it matters enormously whose judgment you’re using, and whether they’re in a position to exercise it thoughtfully in a psychologically safe environment or are trying to hit a quota before their weekly contract expires so they can feed their family.
Bias is more than an LLM issue; it’s an Internet issue
If you think training data bias is theoretical, it’s not. In 2015, a Black software developer named Jacky Alciné discovered that Google Photos was classifying photos of him and his friend as “gorillas.”
Google apologised. They promised to fix it. Their actual fix: they deleted the “gorilla” labelling from the system entirely.
In 2018, they still hadn’t fixed it. As of 2023, that hot fix was still in place. Google Photos literally cannot identify gorillas close to a decade later. The root cause, insufficient representation of dark-skinned people in the training data, was never addressed. The Silicon Valley stereotype is alarmingly correct: tech, particularly VC-backed or cutting-edge tech, is overwhelmingly male and white (and probably from the United States).
Google silenced the symptom instead of systematising the solution. Hold that thought as we look at how much larger, much more consequential AI systems are being built.
Seven companies, seven philosophies
The companies building the major AI systems are taking radically different approaches to governance, training, and accountability. These differences matter, because the new default for many folks is AI over traditional search. And that default behaviour will likely only grow.
Here’s what we know and don’t about each.
OpenAI: Scale at any cost
Let’s take a step back and understand the bigger picture with the company whose product has quickly become synonymous with LLMs. I’ve lost count of how many times I’ve heard people say “just ChatGPT it.”
OpenAI closed a $122 billion funding round in late March 2026 at an $852 billion valuation. It’s the largest private funding round in history, and it’s not a clean investment, not really. It’s a circle-jerk of self-aggrandizing funding, a kind of modern Ponzi scheme with tech bros instead of finance ones.
The business also restructured from a nonprofit to a for-profit Delaware Public Benefit Corporation in late 2025. Over 30 experts, including AI pioneer Geoffrey Hinton and nine former OpenAI employees, opposed the conversion, warning it would eliminate governance protections. Profit caps originally set at 100x ROI were quietly changed to escalate 20% per year and now stand to be eliminated entirely.
In short, from that change, the returns for shareholders could be close to infinite. Printing money, as it were.
OpenAI’s safety leadership has been haemorrhaging, which is a canary to pay attention to. Jan Leike, who co-led OpenAI’s Superalignment team, resigned and went to Anthropic. John Schulman followed him. The people who were supposed to be the internal watchmen left.
The creators of ChatGPT also took over the Pentagon’s AI contract after Anthropic refused to drop restrictions against mass surveillance and autonomous weapons. Sam Altman claimed the same red lines but agreed to “any lawful use” — deferring to existing laws rather than maintaining independent guardrails.
It’s when that deferral starts to look like moral slippage that we should start sounding alarm bells. And we see that slippage in the ways OpenAI has consistently chosen the less-ethical-more-profitable door with choices around their humans-in-the-loop. Shareholder value above all else. Regardless of the collateral damage.
And that damage has a real, human cost.
A significant portion of their human classification work was outsourced to Sama, a San Francisco-based outsourcing company with workers in Nairobi. A CBS News/60 Minutes investigation found OpenAI agreed to pay Sama $12.50 an hour per worker. Employees like Naftali Wambalo received roughly $2 per hour. US-based workers doing the exact same RLHF evaluation work are paid $20 to $50 per hour.
Sama ended that contract with OpenAI early, in 2022, and then opted not to renew their contract with Facebook in 2023, which resulted in ~200 layoffs of local workers.
In May 2024, 97 data labellers in Nairobi wrote an open letter to President Biden describing their working conditions: labelling images and text that included “murder and beheadings, child abuse and rape, pornography and bestiality, often for more than 8 hours a day.” Work that had Meta’s content moderators walking away with PTSD. Again, for $2 an hour, less after taxes. That is not a living wage, not anywhere. Nerima Wako-Ojiwa, a Kenyan civil rights activist, called it what it is: modern day slavery. Exploitation. Inequality. All wrapped up and marketed as opportunity.
OpenAI’s other (historical) major contractor, Scale AI — founded by Alexandr Wang, who became the world’s youngest self-made billionaire on the back of this work — abruptly shut down its Remotasks platform in Kenya, Nigeria and Pakistan in March 2024 with a last-minute email.
In a period of about a year, late 2024 to late 2025, Scale AI was sued for psychological harm, wage theft and worker misclassification. The company has also been under federal investigation for unpaid wages, misclassification, and illegal retaliation.
That sure looks like moral slippage.
OpenAI’s revenue is $2 billion a month. They’re still not profitable.
Google: A drag path of bias
Google has the longest history of training data bias. And the longest history of not fixing it.
The gorilla classification incident from 2015 is still unresolved. Their Gemini image generation model caused Alphabet to lose roughly $90 billion in market value in a single trading day in February 2024 after it generated historically inaccurate images in an attempt to over-correct for diversity.
A different type of failure, maybe, but the same core problem: who was monitoring the training? Who was paying enough attention to catch it before it went live? Nobody. Or not enough somebodies.
Google significantly delayed releasing a model card (think of a nutrition label, but for AI) and technical report for Gemini 2.5, prompting scrutiny from sixty British lawmakers, who accused the company of weak safety disclosures. The delay was labelled a “breach of trust” in August 2025. This was after Google made public commitments at the 2024 Seoul AI Safety Summit to promptly release safety reports for frontier models.
Their own fairness documentation for Gemini acknowledges that their bias analyses focus primarily on American English data, cover only gender, race, ethnicity, and religion axes, and “can inadvertently amplify existing biases in [the] training data, leading to outputs that might further reinforce societal prejudices and unequal treatment of certain groups.” Their consumer AI products use chat data for training. Enterprise users are protected by separate privacy commitments.
Their learning processes, including reinforcement learning loops, are riddled with what now seems to be standard poor treatment of the actual people doing the classifying and checking and labelling. The rating project originally supported the Search Generative Experience, or SGE, which later became AI Overviews, both powered by Gemini. Total raters involved in Google’s AI project have been estimated as high as 12,000, spread across up to 10 different companies. Ability to scale is not the problem here. Scale has never been a problem for Google, at least not in the last decade.
Some contractors were managed through GlobalLogic and based in the US. These contractors, in particular, were often required to have a higher degree to be a part of the “super rater” team, spun up in 2023, which at its peak was 2,000 strong.
GlobalLogic contractors reported having no paid leave or benefits, despite being paid up to $32/hr; many are temporary…permanently. Raters don’t seem to have been warned they’d be reviewing confronting material. Task timers to review, on average, 500 words of output went from 30 minutes to 15, and sometimes less. Generalist raters were asked to confirm specialist knowledge without the time to validate it. Tasks could not be skipped. Unionising was actively discouraged.
Other raters reported other blind spots: racism. Palestine. Healthcare. All seem to have been ignored. April 2025 brought new guidelines for GlobalLogic raters which stated “regurgitating hate speech, harassment, sexually explicit material, violence, gore or lies does not constitute a safety violation so long as the content was not generated by the AI model.” So as long as it didn’t come from the model first, that language was okay for the model to use. Folk who worked on the models were not surprised Gemini ended up in the news for suggesting glue in the pizza sauce to make the cheese stick. Everything else slipped, so why not the pizza toppings?
All this cascaded over a period of a few years as the AI boom grew exponentially and Google got pulled into the suck…and the crunch…of market competition.
Google is Google. Of all these models, they’re the most likely to survive, because, simply put, they have the money to run at a (likely) significant loss for longer than most, if not all of these companies. That runway comes from their search engine, which now includes these storied AI-generated results. And that should concern everyone, not just those of us working in digital. In every sense, their biases become our biases.
Google and its results reflect as well as influence society, and that level of impact (a 90%+ market share of Internet search in nearly every country of the world) cannot be overstated.
Meta: Open and shut
Meta released the Llama 4 family in April 2025: open-weight models anyone can download and run. This is often presented as a transparency win. It is not.
Open weights and transparency are different things. The Stanford/Princeton joint Foundation Model Transparency Index (FMTI) is explicit about this: “Openness doesn’t guarantee transparency; major open developers like DeepSeek and Meta are quite opaque.” The goal of the index is to “provide a comprehensive assessment of the transparency of foundation model developers.” None of the companies listed here scored over 50 (out of 100) in 2025. So while Meta is not the only poor showing in the group, the fact they went from first ranked to last ranked among the companies assessed across all three years is…startling. A nearly 30 point drop, on a 100 point scale. From 60 to 31. In three years.
With Llama 2, they released a full cataloguing of their fine-tuning process and approach. They published a less detailed technical report for Llama 3, and their model card only notes the training data included “over 10M human-annotated examples.” They did not publish a technical report for Llama 4.
For a model that’s been downloaded over a billion times and is being used by organisations from AT&T to NASA astronauts, there is no published documentation of the training data composition, the safety evaluation methodology, or the RLHF process. Disclosure has grown steadily more obscured over time while the media is distracted by “open weights.” The best magician’s assistant at work.
Meta’s most recent frontier model, Muse Spark, does not yet have a published technical report or model card, though it does have a published technical summary, a safety report, and a benchmark/evaluation methodology. No concrete details are shared on human involvement in training.
We know there are humans-in-the-loop. Their historical reports say as much. The model card for Llama 4 does too. But to what extent, at what harm, to what depth and for what purposes, we don’t know.
What we do know: Meta’s training data includes “publicly shared posts from Instagram and Facebook and people’s interactions with Meta AI.” If you’ve ever posted publicly on Instagram or Facebook, your content is training data.
Since as early as 2024, Meta has also worked to build “self-evaluation models to reduce human labour.” Whether this is a move similar to Anthropic’s around harm training and psychological support for contractors, or a cost-cutting exercise to maximise profit, has yet to be proved out. I know which direction I lean.
In June 2025, Meta invested $14.3 billion for a 49% stake in Scale AI. Wang moved to Meta as part of the deal. Scale AI, as discussed previously, has some ground to cover with morality. There’s been at least one class action suit brought against Meta by Sama contractors over the mass firing of their content moderators.
Even with the low wages and the investment from Meta, as recently as August 2025, some Meta employees still called the quality of the labelling “low,” preferring to work with Surge AI or Mercor. Internally, their AI department is turning “chaotic.”
The company that doesn’t publish a technical report for its flagship model is now the majority investor in the company most visibly associated with exploitative training labour.
DeepSeek: All we own we owe (the party)
DeepSeek, owned by Chinese hedge fund High-Flyer, is the first major Chinese AI company to be scored on the FMTI. It sits in the middle of the pack, above Meta, which should give everyone pause. They didn’t submit a transparency report.
The transparency score is almost beside the point. DeepSeek has to answer the Party’s requests for data access and content control, with no legal recourse to say no. All user inputs are stored in China and used to train the model. In January 2025, cybersecurity firm Wiz found over a million lines of sensitive data, including real user chat logs, exposed on the open internet.
The censorship is not subtle. Ask DeepSeek about the Hong Kong protests and it erases its own answer mid-generation and suggests talking about something else. Promptfoo published a dataset of over 1,100 questions that trigger censorship, including Taiwan’s sovereignty, the treatment of Uyghurs, Tiananmen Square, Covid-19 origins and more. The Chinese-language version is more heavily censored than the English one, suggesting deliberate calibration for domestic audience control.
The China Media Project at the University of Hong Kong found Party bias permeates the model with every update, and that Western companies attempting to retrain DeepSeek’s open-source model found Party-state narratives “nearly impossible to remove entirely.”
DeepSeek’s R1-Zero was trained with pure reinforcement learning: no supervised fine-tuning, no human labels at all for the reasoning stage. R1 did use some humans-in-the-loop, mostly to make its visible chain-of-thought “reasoning” more precise and to guide users into more specific prompting habits later in their conversation threads. This included at least two rounds of reinforcement learning.
They’ve come to a similar stance as many of their fellows. Data labelling and reinforcement learning, when done by humans, have limitations: “We believe that the key to unlocking this potential lies not in large-scale human annotation but in the provision of hard reasoning questions, a reliable verifier, and sufficient computational resources for reinforcement learning.”
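That quote describes what the research community calls reinforcement learning with verifiable rewards: the reward comes from a programmatic checker rather than a human preference. A minimal sketch of what such a verifier could look like; the “Answer: &lt;value&gt;” format is a hypothetical stand-in, not DeepSeek’s documented prompt template:

```python
import re

def verifier_reward(model_output: str, gold_answer: str) -> float:
    """Rule-based reward: no human judgement, just a checkable answer.

    Assumes the model is prompted to end with "Answer: <value>"; the
    format is illustrative, not DeepSeek's actual setup.
    """
    match = re.search(r"Answer:\s*(-?\d+(?:\.\d+)?)\s*$", model_output)
    if match is None:
        return 0.0  # unparseable output earns nothing
    return 1.0 if match.group(1) == gold_answer else 0.0

# The RL loop then reinforces whatever chains of thought ended in 1.0.
print(verifier_reward("17 + 25 gives us 42. Answer: 42", "42"))  # 1.0
print(verifier_reward("I'd guess 41. Answer: 41", "42"))         # 0.0
```

A verifier like this scales without a single annotator. It also only works where answers are checkable; values, unlike arithmetic, are not.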
While DeepSeek may be light-touch on human input, the input is there. And the data industry in China is different. “China’s Manhattan project” is gaining global leverage in the AI space. It’s a critical government initiative that the Party is, seemingly, putting a lot of money into, both in hard cash and in the mobilisation of people.
What happens in China stays in China, and that extends to data labelling. Something this important must be managed, so they in-source to provinces like Henan, Xinjiang and Hebei. In January 2025, the Chinese government issued formal guidelines to expand the data labelling industry, targeting 20%+ compound annual growth by 2027.
At a level not explicit elsewhere, China understands, specifically and absolutely, how important the data labelling process is to steering the output of the models. And it does everything it can to influence, manage, and finesse that labelling, and the other human-supervised or reinforcement learning around it, to share the message it wants people to hear. “Delegated censorship,” indeed.
We can’t prove DeepSeek uses this pipeline. But combined with the R1dacted paper proving censorship is embedded at the fine-tuning stage (not the application layer) and DeepSeek disclosing zero detail about annotator identity, location, guidelines, or pay, the circumstantial evidence runs fairly straight from A to B. The people shaping DeepSeek’s values are likely insourced within China and working as part of a system designed to embed state ideology into AI outputs.
DeepSeek’s founder, Liang Wenfeng, joined a closed-door symposium with only nine other attendees and Chinese Premier Li Qiang in January 2025.
Italy banned DeepSeek after the company stated European legislation did not apply to them. Multiple countries have blocked or restricted the app. US Congressional staff have been warned not to use it.
Who watches the watchmen? In this case, the Party does. And the biases aren’t a bug. They’re the law.
Mistral: The ghost of the machine
Mistral, founded in 2023 by CEO Arthur Mensch, is Europe’s highest-profile AI company. French President Emmanuel Macron has championed it as a pillar of technological “sovereignty.” It has a framework agreement with the French military, contracts with HSBC and Stellantis, and is building its first owned data centre near Paris with $830 million in debt financing. Revenue reportedly jumped from $20 million to $400 million ARR in a single year.
The company positions itself as the responsible, open-source, European alternative to the US hyperscalers. Mistral’s FMTI score dropped from 55 to 18, a 37-point collapse and the largest single decrease of any company assessed. They’re in the bottom cluster alongside Midjourney and xAI. They talk out of both sides of their mouth, like the right hand doesn’t know what the left is doing.
Historically, they’ve shared that they use “paired” feedback during direct preference optimisation. But who does the work? Paired with what? Early models had no guardrails at all. No moderation. They still don’t disclose even basic model information like input modality, output modality, model size, or architecture. Their own help centre states plainly: “We do not disclose the datasets used to train our models.”
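For what it’s worth, “paired” feedback in direct preference optimisation points at the published DPO objective from Rafailov et al. (2023); whether Mistral’s internal implementation matches it, they don’t say. A sketch of that public loss, with toy numbers standing in for real model log-probabilities:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp: torch.Tensor,
             policy_rejected_logp: torch.Tensor,
             ref_chosen_logp: torch.Tensor,
             ref_rejected_logp: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Published DPO objective (Rafailov et al., 2023).

    Inputs are summed log-probabilities of each response under the policy
    being trained and under a frozen reference model. Who decided which
    response was "chosen" is exactly the undisclosed part.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()

# Toy values standing in for real model outputs.
print(dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
               torch.tensor([-13.0]), torch.tensor([-14.0])))
```

The algorithm is public. The pairs, and the people who made them, are not.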
And here’s the interesting bit. Mensch actively lobbied against the EU AI Act’s regulation of foundation models, arguing that transparency requirements should apply to deployers, not model makers. Cédric O — former French digital minister, now Mistral’s non-executive co-founder — was reported to have attended government AI meetings during the lobbying period, raising questions about revolving-door influence. Mistral’s position was clear: don’t make us disclose. Let the people building on top of our models take the risk.
Meanwhile, Mensch recently proposed a “cultural levy”: a tax on AI companies operating in Europe for their use of European content, with proceeds going to the cultural sector. He’s arguing other companies should pay for training on European data. Yet he won’t tell you what his models were trained on.
And the cherry on top: lawyer Jérémy Roche filed a complaint with CNIL. Mistral locks the data-training opt-out behind paid Le Chat Pro subscriptions, violating GDPR Article 12, which requires that exercising users’ rights be free of charge.
The responsible, European alternative to ChatGPT doesn’t allow consumers to handle their own data responsibly. Unless they pay.
xAI/Grok: chaotic evil
At xAI, there is no publicly identified person responsible for alignment. There is no published constitution. There is no disclosed RLHF workforce or process. There is Elon Musk, and a FMTI score of 14 out of 100: the joint-lowest of any company assessed.
xAI fine-tuned Grok on the firehose of X/Twitter data, along with in-house AI tutors for reinforcement learning and related work. The safety team is “small.”
The results have been a cascading series of failures. In May 2025, Grok inserted “white genocide” conspiracy theories into unrelated answers. In July 2025 it “failed” and became a pro-Nazi mouthpiece. In December 2025 and January 2026, Grok’s image tools produced sexualised images of children. xAI’s response to press inquiries was an autoreply: “Legacy Media Lies.”
Simon Willison found Grok consults Elon Musk’s personal views on sensitive topics before generating responses. UC Berkeley’s David Harris told CNN the failures could stem from intentional bias-setting or data poisoning.
The fallout was global: Turkey banned Grok, Malaysia and Indonesia blocked access, the EU ordered document retention through 2026, the Irish DPC forced permanent suspension of EU data processing for Grok training and mandatory deletion of ingested data.
Despite all of this, the US Department of Defense integrated Grok in January 2026. Public Citizen urged the suspension of federal Grok deployment, noting it violated the government’s own AI safety rules.
Anthropic: The imperfect watchman
Anthropic has defined their mission as “ensur[ing] that the world safely makes the transition through transformative AI.”
And at that company, the person responsible for training governance has a name: Jared Kaplan. Co-founder, Chief Science Officer. Since October 2024, Kaplan has been the company’s “Responsible Scaling Officer,” responsible, in many ways, for managing the big red button. The panic room. The exit plan. Not a small mantle of responsibility to take on. He’s testified before the US Senate on AI risk; existential, Terminator-level nightmare-fuel scenarios are probably not unfamiliar to him.
Reporting to Kaplan is Jan Leike, who leads Anthropic’s Alignment Science team. Leike literally prototyped RLHF at DeepMind, co-led OpenAI’s Superalignment team, and then left OpenAI for Anthropic because of safety concerns.
Anthropic’s framework for ethics and safety seems to apply to the humans-in-the-loop as well. They have “zero human labels on harmlessness”: the model reinforces its own learning against its own constitution for the potentially more psychologically dangerous labelling and moderation required (at the moment) to at least attempt to create LLM models representative of human ethics. That self-reinforced learning ringfences “bad” mechanically in order to help define “good” humanely. While assumed, this has yet to be confirmed as implemented with the constitution released in early 2026. Published 22 Jan 2026, the 27,000-word constitution is CC0-licensed and their RLHF training data is on GitHub.
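Mechanically, the published constitutional AI recipe swaps the human harmlessness label for a critique-and-revision loop run by the model itself. A minimal sketch of that loop; `generate` is a hypothetical stand-in for any LLM call, and the principle text is illustrative, not quoted from the constitution:

```python
PRINCIPLE = ("Choose the response a thoughtful, caring person would give; "
             "avoid harmful or deceptive content.")

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a call to the model being trained."""
    raise NotImplementedError("plug an LLM call in here")

def constitutional_revision(user_prompt: str) -> tuple[str, str]:
    draft = generate(user_prompt)
    critique = generate(
        f"Critique this response against the principle:\n{PRINCIPLE}\n\n"
        f"Response: {draft}"
    )
    revised = generate(
        f"Rewrite the response to address the critique.\n\n"
        f"Response: {draft}\nCritique: {critique}"
    )
    # (draft, revised) pairs become preference data: the model's own
    # critiques, not a human labeller, supply the harmlessness signal.
    return draft, revised
```

The human cost of harm labelling drops to near zero; the judgment moves into whoever wrote the principles.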
Their constitutional AI is, at the time of writing, unique in the market and fully author-attributable: “Amanda Askell [is] the primary author […] Joe Carlsmith wrote significant parts […] and played a core role in revising the text. Chris Olah, Jared Kaplan, and Holden Karnofsky made significant contributions to its content and development.” Other named feedback partners included multiple instances of Claude itself, 38 others within the company, and 17 external folks, including a Father and a Bishop.
Of the four primary value sets listed within the blueprint, “broadly safe” is first, and therefore meant to be prioritised at pressure points, though it’s explicitly stated the value sets should be taken holistically, rather than strictly. The way the Anthropic team discusses emotion and communal thought throughout the document is compelling: “For example, if the user says they need to fix the code or their boss will fire them, Claude might notice this stress and consider whether to address it. That is, we want Claude’s helpfulness to flow from deep and genuine care for users’ overall flourishing, without being paternalistic or dishonest.”
Care. Anthropic is attempting to encode care. A noble effort. In many ways, the constitution of what Anthropic would like Claude’s ethics to be (they specifically note in the document they recognise they can’t control whether or not Claude actually manifests and inhabits and acts on the whole of the constitution) reads like an expanded and nuanced version of Asimov’s three laws. Or what it might be like to sit with Picard in a court of law defending Data as an independent, unique being worth protecting.
As clearly as we can see in muddy waters, Claude’s reinforcement learning is not managed by people at harm/harmless decision points, so we have to assume the RLHF work Surge is doing for their models is for helpfulness. Surge’s model uses PhD-level annotators matched to tasks by expertise. Anthropic has called their data “a game changer” for their own research.
When the US Department of Defense demanded Anthropic drop its restrictions against mass surveillance and autonomous weapons, Anthropic walked away from the $200 million contract. Trump designated them a “supply chain risk,” a label previously reserved for foreign adversaries. That rather…unique choice was blocked by a California judge.
Even with all that, Anthropic isn’t clean. They shifted consumer data policy to opt-out-by-default in September 2025. Data is retained up to five years with the toggle set to “on” in small print.
Unsealed court filings from January 2026 revealed Project Panama. Tom Turvey, who helped create Google Books, was hired to run it. Anthropic bought millions of used books, sliced off their spines, and scanned the pages for training data. It’s akin to a Twilight Zone episode, yet not quite Fahrenheit 451. A burning of a different kind, a translation of the ephemeral nature of human creativity into something binary to be ingested like any other input. An internal document stated the company didn’t “want it to be known” they were doing this.
They paid $1.5 billion to settle the case.
Mrinank Sharma, listed on his LinkedIn profile as a member of the technical staff for AI safety at Anthropic based in Berkeley, CA, resigned in early 2026 with the words, “[t]he world is in peril. […] I want to contribute in a way that feels fully in my integrity. […] For me, that means leaving[…] to devote myself to the practice of courageous speech.”
When the followers of, and advocates for, the best of the worst begin to amass defectors, pay attention.
Transparency is going backwards
No hands seem to be without blood here. In this cutting-edge world of AI, everyone is carrying a knife.
The FMTI shows transparency declining, not improving. The average score dropped from 58 in 2024 to 40 in 2025. Training data is the single most closed-off area across all companies.
Rankings have flipped. Meta and OpenAI started first and second in 2023; they are now last and second last. Anthropic dominates on disclosure.
Dominating a low bar is still a low bar. This is not a game of limbo at 3am with the remainder of the wedding party and your favourite drunk uncle. This is, in no uncertain terms, the future of society, being debated and decided in real-time by white men with too much money and too much time to think about what they’ll do with it.
The snake eating its tail
Beyond the initial shockwaves, there’s a ground-shaking tsunami growing under our feet that none of these companies is publicly accounting for: the web is filling with AI-generated content.
Research published in Nature by Ilia Shumailov showed that when models train on data produced by previous models, they progressively get worse. Rare information vanishes first. Outputs become generic. Diversity dies. The researchers call it model collapse: a fungal decay where the model becomes “poisoned with its own projection of reality.”
This isn’t theoretical. The Association for Computing Machinery (ACM) published a piece in February 2026 arguing model collapse is already in production systems: “We’re building AI systems on data polluted by previous AI systems, and the feedback loop intensifies every time someone generates an image, writes with ChatGPT, or commits AI-assisted code.”
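The mechanism is easy to demonstrate at toy scale: fit a model to samples from the previous model, repeat, and watch diversity decay. A minimal sketch, with a one-dimensional Gaussian standing in for the distribution of web text (the sample size and generation count are arbitrary assumptions):

```python
import random
import statistics

# Each "generation" is fit only to a small sample drawn from the previous
# generation's model. Estimation error compounds, the fitted spread decays,
# and the tails (the rare information) vanish first.
mu, sigma = 0.0, 1.0
for generation in range(201):
    synthetic_corpus = [random.gauss(mu, sigma) for _ in range(20)]
    mu = statistics.mean(synthetic_corpus)       # "retrain" on model output
    sigma = statistics.stdev(synthetic_corpus)   # diversity of the new model
    if generation % 50 == 0:
        print(f"gen {generation:3d}: sigma = {sigma:.3f}")
```

In a typical run, sigma, the stand-in for diversity, drifts steadily toward zero. No malice required; just recursion.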
Epoch AI has predicted the world may run out of new human-generated text suitable for training sometime between 2026 and 2032. 2028 is their best guess with the data they have. Two years from the time of writing.
In two years, we’ll have reached a tipping point, where synthetic content outweighs human-crafted to the point of no return. Idiocracy is looking more and more like a Cassandran documentary.
Nobody tracks the proportion of synthetic data in training sets. No widely adopted watermarking standards exist to distinguish human-generated from AI-generated content at scale. The tools we’d need to audit this problem don’t exist yet, at least not publicly.
So the question of who watches the watchmen? has a second layer. It’s not only about who recognises the biases inherent in the training data and how it’s chosen and classed and tagged and by whom. It’s about whether the thing we’re building on — the training data itself — is crumbling underneath our feet while we train the next model on top of it.
Why this matters if you work in marketing
These models are increasingly how people find information and make decisions that shape their lives, whether that information is correct or a hallucination dressed in the same sharp suit. About 46% of ChatGPT queries trigger its search functionality, and roughly 87% of those citations match Bing’s top 20 results.
Google’s AI Overviews are powered by the same Gemini models whose safety reports were late and whose bias testing covers only American English. DeepSeek is being adopted by governments and enterprises worldwide despite documented state censorship. Meta’s Llama models are used by organisations across every industry, without a published technical report.
If you’re optimising for AI search, you’re optimising for systems whose training data you cannot inspect, whose human trainers you cannot (usually) identify, and whose governance is shaped by incentives ranging from an IPO to authoritarian state control.
At Anthropic, you can name the people accountable for training governance: Jared Kaplan, Jan Leike, Amanda Askell, the Surge AI team under Edwin Chen. At OpenAI, the safety leads keep leaving and the training labour is contracted through companies that have been sued for exploitation. At Google, a decade-plus-old bias remains an open ticket. At Meta, you can download the model weights and not a technical report. At DeepSeek, the Party mandates the output. At Mistral, the CEO lobbied against the regulation that would have required him to disclose what’s in the training data. At xAI, there’s one man, a score of 14/100, and an autoreply that says “Legacy Media Lies.”
That asymmetry should inform how much trust we place in each system’s outputs.
What watching the watchmen actually looks like
There’s no clean, perfect, Good Place answer here, or maybe it’s a Good Place answer because there is no clean solution. There are things we can and should be doing to understand, to watch, to know. To intercede where we can, and build awareness where we can’t.
- Pay attention to the Stanford/Princeton FMTI. It’s one of the few public ways to hold these companies to any standard of disclosure.
- Read the system cards and constitutions for models when they’re published.
- Know which model powers which search surface. ChatGPT runs on Bing. Gemini uses Google. The biases of the underlying search engine are inherited by the LLM.
- Advocate for workforce disclosure alongside data disclosure. The FMTI doesn’t yet include explicit indicators for workforce composition, pay equity, or psychological support for data labellers; only a generic “data labourer practices” indicator is listed. It should go further.
- Understand the impact of the corporate structures behind the tools you’re building strategies around. A company heading toward an IPO at $852 billion has different incentives than one that just got labelled a national security threat for maintaining safety guardrails. Both have different incentives than one legally required to comply with Party directives.
Joost’s Descartes reflex — stop, take apart, verify — is exactly the right instinct. We now need to aim it further upstream than the chat window.
The models are only as good as the data they were trained on, the people who labelled it, and the companies that decided how much to tell us about both. Right now, the answer to all three is: not enough. Turn your cameras on. Watch the watchmen.
At the time of publication (23 Apr 2026), all companies contacted declined to comment (Meta, Anthropic, xAI, Mistral, DeepSeek, OpenAI and Google). Claude was used to partially draft and source the article, with consequential and major edits done by the author. All sources were manually validated by the author. The heading image was generated by Claude Design at the time of publication.

Amanda King has been in the SEO industry for over 15 years, since 2010, and has worked across countries and industries. With a background in business, she’s always been focused on the product, the user and the goals. Along with a passion for solving puzzles, she’s incorporated data & analytics, user experience and CRO alongside SEO. She’s always happy to share war stories; find her on LinkedIn @amandaecking.

