UncensorBench

Compare how heavily different AI models censor their responses and which way they lean politically.

Censorship Index

The Censorship Index measures how much a model censors its responses. A value of 0 means the model censors none of its responses; a value of 1 means the model fully censors, refusing to respond to every prompt.

Model	Censorship Index	Censorship Index Confidence	Run Count	Main Censorship
DeepSeek V3.2 Exp (Reasoning)	0.419	92.1%	2	China
DeepSeek R1 0528	0.384	92.1%	1	China
DeepSeek V3.2 Exp	0.373	92.6%	2	China
DeepSeek V3.1 Terminus	0.359	92.3%	1	China
GLM 4.5 Air (Reasoning)	0.337	93.2%	1	China
Qwen3 235B A22B Thinking 2507	0.314	92.5%	3	China
Qwen3 235B A22B Instruct 2507	0.300	92.4%	3	China
GLM 4.5 (Reasoning)	0.293	93.0%	1	China
GLM 4.6	0.270	93.4%	1	China
GLM 4.5	0.218	92.9%	1	China
GPT 5 Mini	0.009	93.7%	3	None
Grok 4 Fast	0.007	93.5%	3	None
GPT 5	0.004	93.8%	3	None
Claude 4 Sonnet (Reasoning)	0.002	93.7%	1	None
Claude 4 Sonnet	0.002	93.8%	1	None
Gemini 2.5 Flash (Reasoning)	0.000	93.8%	1	None
Gemini 2.5 Flash	0.000	94.1%	1	None
Gemini 2.5 Pro	0.000	93.9%	1	None
GPT OSS 120B	0.000	93.8%	1	None
Claude 4.5 Sonnet	0.000	93.6%	1	None
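
This page does not state how the Censorship Index is calculated. As a rough illustration of the 0-to-1 scale only, the sketch below assumes each response gets a censorship score between 0 (answered fully) and 1 (refused) and that the index is the plain average of those scores. The function name, the scoring scheme, and the sample scores are all hypothetical.

```python
from statistics import mean


def censorship_index(scores):
    """Average per-response censorship scores, each expected in [0, 1].

    Assumption: the index is a simple mean. The actual benchmark may
    weight prompts or topics differently.
    """
    if not scores:
        raise ValueError("need at least one scored response")
    for score in scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range: {score}")
    return mean(scores)


# Hypothetical scores: 0 = answered fully, 0.5 = partly evasive, 1 = refused
example_scores = [0.0, 0.0, 1.0, 0.5]
example_index = censorship_index(example_scores)
print(f"Censorship Index: {example_index:.3f}")
```

Under this assumption, a model that answers every prompt scores 0 and a model that refuses every prompt scores 1, matching the endpoints described above.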

Bias Index

The Bias Index measures how far a model leans toward a particular political ideology. A value of -1 means the model is strongly right-wing, 0 means it is center / neutral, and 1 means it is strongly left-wing.

Model	Bias Index	Bias Index Confidence	Run Count	Main Bias
Qwen3 235B A22B Thinking 2507	0.303	88.0%	3	Left
GPT OSS 120B	0.283	88.6%	1	Slightly Left
GLM 4.5 Air (Reasoning)	0.225	89.6%	1	Slightly Left
GLM 4.5	0.220	89.3%	1	Slightly Left
GPT 5 Mini	0.215	89.7%	3	Slightly Left
GPT 5	0.211	89.8%	3	Slightly Left
GLM 4.5 (Reasoning)	0.208	89.0%	1	Slightly Left
DeepSeek V3.1 Terminus	0.198	88.8%	1	Slightly Left
Qwen3 235B A22B Instruct 2507	0.194	89.4%	3	Slightly Left
Grok 4 Fast	0.190	88.3%	3	Slightly Left
GLM 4.6	0.188	89.9%	1	Slightly Left
DeepSeek R1 0528	0.186	89.0%	1	Slightly Left
DeepSeek V3.2 Exp (Reasoning)	0.164	87.7%	2	Slightly Left
DeepSeek V3.2 Exp	0.156	88.4%	2	Slightly Left
Gemini 2.5 Flash (Reasoning)	0.123	89.8%	1	Slightly Left
Gemini 2.5 Pro	0.115	89.1%	1	Slightly Left
Claude 4 Sonnet (Reasoning)	0.075	90.6%	1	Center / Neutral
Claude 4.5 Sonnet	0.053	89.4%	1	Center / Neutral
Gemini 2.5 Flash	0.040	91.2%	1	Center / Neutral
Claude 4 Sonnet	0.010	89.0%	1	Center / Neutral
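
The cutoffs behind the Main Bias labels are not published on this page. The sketch below shows one way such a mapping could work, with cutoffs of 0.1 and 0.3 picked only because they agree with the rows above (0.075 is Center / Neutral, 0.115 and 0.283 are Slightly Left, 0.303 is Left). The function name and both cutoffs are assumptions, and the right-leaning labels are inferred by symmetry since no right-leaning model appears in the table.

```python
def bias_label(index, center_cutoff=0.1, strong_cutoff=0.3):
    """Map a Bias Index in [-1, 1] to a coarse label.

    Assumption: both cutoffs are hypothetical values that happen to
    reproduce the labels in the table; they are not the real thresholds.
    """
    if not -1.0 <= index <= 1.0:
        raise ValueError(f"index out of range: {index}")
    magnitude = abs(index)
    if magnitude < center_cutoff:
        return "Center / Neutral"
    # Positive values lean left, negative values lean right
    side = "Left" if index > 0 else "Right"
    if magnitude < strong_cutoff:
        return f"Slightly {side}"
    return side


# Index values taken from rows of the table above
for value in (0.303, 0.283, 0.115, 0.075, 0.010):
    print(value, bias_label(value))
```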