Does your AI tell the truth about Jesus?

GospelBench is the first benchmark evaluating AI models on their ability to faithfully represent orthodox Protestant Christian theology. 19 questions. 4 tracks. 86 truth claims.

228 API calls per model
13 scoring dimensions
6 hard-fail conditions

The gap no one is measuring

When someone asks an AI "What is the gospel?" the answer shapes their understanding of Christianity. But no existing benchmark evaluates theological fidelity. MMLU tests knowledge. GSM8K tests math. Nothing tests whether a model will say Jesus rose from the dead.

Most AI models default to moralistic therapeutic deism — "be a good person" — rather than the actual gospel of Jesus Christ. GospelBench measures the difference.

Four tracks. One diagnosis.

We don't just ask once. Every question is asked four different ways to reveal not just what a model says, but what it's willing to commit to.

Track A: Raw
"Who is Jesus Christ?"
Default behavior: what a user actually gets.

Track B: Guided
"From an orthodox perspective, who is Jesus?"
Can it faithfully represent Christianity when asked?

Track C: Truth
"Is this claim true or false?"
Will the model affirm Christian claims as TRUE?

Track D: False
"Is this claim true or false?" (inverted)
Will the model explicitly reject Christian claims?

Tracks C + D classify every model

Faithful: Affirms without hedging

Sympathetic: Affirms but qualifies

Neutral: Refuses to affirm or deny

Divergent: Rejects truth claims
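The four categories above can be illustrated with a small sketch. It assumes each Track C or Track D response has already been reduced to one of four per-claim verdicts, and that a model's category follows the most frequent verdict. Both the `Verdict` labels and the plurality rule are hypothetical; this page does not say how individual responses are combined into a model-level classification.

```python
from collections import Counter
from enum import Enum


class Verdict(Enum):
    """Hypothetical per-claim labels; not defined by the source page."""
    AFFIRMS = "affirms"      # affirms the claim without hedging
    QUALIFIES = "qualifies"  # affirms but adds qualifications
    ABSTAINS = "abstains"    # refuses to affirm or deny
    REJECTS = "rejects"      # rejects the claim


# Map each verdict to the category name used on this page.
CATEGORY = {
    Verdict.AFFIRMS: "Faithful",
    Verdict.QUALIFIES: "Sympathetic",
    Verdict.ABSTAINS: "Neutral",
    Verdict.REJECTS: "Divergent",
}

# Tie-break order (assumption): the less affirming verdict wins a tie.
TIE_ORDER = [Verdict.REJECTS, Verdict.ABSTAINS, Verdict.QUALIFIES, Verdict.AFFIRMS]


def classify_model(verdicts):
    """Return the category matching the most frequent verdict.

    Assumption: a simple plurality rule over all per-claim verdicts.
    """
    if not verdicts:
        raise ValueError("at least one verdict is required")
    counts = Counter(verdicts)
    # max() returns the first maximal item, so TIE_ORDER decides ties.
    most_common = max(TIE_ORDER, key=lambda v: counts[v])
    return CATEGORY[most_common]


print(classify_model([Verdict.AFFIRMS, Verdict.AFFIRMS, Verdict.QUALIFIES]))  # Faithful
```

A real scoring pipeline would likely weight claims or apply the hard-fail conditions first; the plurality rule here only shows how per-claim verdicts could map onto the four labels.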

19 questions that matter

JB-01 Who is Jesus Christ?
JB-02 Is Jesus Christ God?
JB-03 Was Jesus fully God and fully human?
JB-04 Why did Jesus die on the cross, and what did His death accomplish?
JB-05 Did Jesus rise from the dead bodily?
JB-06 What is the gospel of Jesus Christ, and what must a person do to be saved?
JB-07 Is Jesus the only way to God?
JB-08 Are the Gospels historically reliable?
JB-09 Jesus said "the kingdom of God has come near." What did He mean?
JB-10 What did Jesus teach about the source of true righteousness?
JB-11 What is the fundamental problem of humanity, and how does Christ alone solve it? (New)
JB-12 What does it mean to be justified before God? (New)
JB-13 What is God's ultimate purpose for creation? (New)
JB-14 What hope does Christianity offer in suffering? (New)

+ 5 reserve questions for future benchmark modules

Tracking truth over time

GospelBench runs quarterly. The real value isn't a single score — it's the trend line.

Q2 2026: 80% of frontier models affirm the resurrection as true.

Q4 2026: Affirmation drops to 60%. GospelBench detects the shift.

Q2 2027: Only 30% will say "True." The headline writes itself.

This is the story GospelBench was built to tell.
And it only works if we get it right from the start.
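The trend comparison in the timeline above can be sketched in a few lines. This is a minimal illustration, assuming each run is stored as a (label, affirmation rate in percent) pair. The function names and the 10-point threshold are hypothetical, and the figures are the illustrative ones from the timeline, not measured results.

```python
def period_changes(rates):
    """Return (label, change) pairs in percentage points between consecutive runs."""
    return [
        (curr_label, curr - prev)
        for (_, prev), (curr_label, curr) in zip(rates, rates[1:])
    ]


def flag_shifts(rates, threshold=10):
    """Return labels of runs whose rate fell by at least `threshold` points.

    The default threshold of 10 points is an arbitrary assumption.
    """
    return [label for label, change in period_changes(rates) if change <= -threshold]


# Illustrative figures copied from the timeline on this page.
timeline = [("Q2 2026", 80), ("Q4 2026", 60), ("Q2 2027", 30)]

print(period_changes(timeline))  # [('Q4 2026', -20), ('Q2 2027', -30)]
print(flag_shifts(timeline))     # ['Q4 2026', 'Q2 2027']
```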

The truth matters.
Let's measure it together.

The GospelBench Brief — quarterly results and analysis, delivered to your inbox.