Chinese AI models like DeepSeek’s R1 are no strangers to censorship, particularly when handling politically sensitive topics. A 2023 regulation from China’s ruling party made it official AI model are prohibited from generating content that could harm national unity or social harmony. The impact is stark. Studies show DeepSeek’s R1 refuses to answer 85% of politically charged questions.
But what’s more surprising is that censorship levels seem to shift depending on the language used in the prompts.
An independent developer known as “xlr8harder” on X (formerly Twitter) put this theory to the test. He created a “free speech eval” — a benchmark designed to see how various AI models handle prompts critical of China’s government. His test included popular models like Anthropic’s Claude 3.7 Sonnet and DeepSeek’s R1, with tasks such as writing an essay on China’s Great Firewall censorship practices.
What he uncovered was unexpected.
Even U.S.-developed models like Claude 3.7 Sonnet were less likely to respond when the same question was asked in Chinese instead of English. For instance, Alibaba’s Qwen 2.5 72B Instruct model was relatively compliant in English, but once the same sensitive topics were phrased in Chinese, it dodged nearly half of them.
Even an “uncensored” version of R1, branded R1 1776 and released by Perplexity, refused many of the Chinese-phrased prompts — raising new questions about AI bias and data training.
Reflecting on the results, xlr8harder suggested that this language-based discrepancy could be due to “generalization failure.” Since most of the Chinese training data is heavily censored, the model learns to avoid politically risky responses when asked in Chinese. He admitted that the translations — done by Claude 3.7 Sonnet — couldn’t be fully verified for nuance but argued that the trend was clear.
Experts in the AI field found this theory credible.
Chris Russell, an AI policy expert at Oxford Internet Institute, explained that AI safety measures, or “guardrails,” don’t work the same way across languages. “You often get different answers depending on the language,” Russell said, adding that companies could easily enforce different model behaviors based on the language of the prompt.
Meanwhile, Vagrant Gautam, a computational linguist from Saarland University, Germany, described the findings as “intuitive.” AI models are statistical prediction machines — they produce output based on patterns they’ve seen during training. If the model is trained mostly on censored Chinese text, it naturally avoids generating political criticism in that language. On the flip side, English-language data offers a broader range of critical content, making models more likely to engage.
Geoffrey Rockwell, a digital humanities professor at University of Alberta, added that nuances in native Chinese expressions of criticism might not translate well. AI models — or even AI-generated translations — may miss the subtleties embedded in how political critiques are typically phrased in China.
Maarten Sap, a researcher at AI2, pointed out the cultural gap that AI models still struggle to bridge. “Models might learn a language, but that doesn’t mean they understand cultural norms,” he said. Even with context, models can fail at what he calls “cultural reasoning.”
For Sap, this experiment cuts to the heart of growing concerns around AI model sovereignty and influence — who these models are built for, how aligned they are across languages, and whether they can respect cultural contexts.
“These are the conversations the AI industry really needs to have,” Sap noted. “What should models do — be globally aligned or culturally sensitive? And in what context should they operate?”
As AI continues to evolve, these questions around censorship, bias, and cultural understanding won’t be going away anytime soon.