fix: ContextPrecision now scores by position of relevant context (clo… by NishthaNabya · Pull Request #178 · braintrustdata/autoevals

Nishtha Nabya (NishthaNabya) · 2026-03-17T17:08:54Z

Fixes #176

The issue is that when you pass a list of context chunks, the scorer joined them all into one string before sending it to the LLM. So it was effectively asking "does the answer exist anywhere in this blob of text?", which always returns 1 if the relevant chunk is in there, regardless of where it sits in the list.

But that's not what ContextPrecision is supposed to measure. The whole point is to reward retrievers that surface relevant context first. If your relevant chunk is buried at the bottom, the score should reflect that.

The fix scores each chunk individually, then applies the standard RAGAS positional formula:

score = sum(precision_at_k * verdict_k) / total_relevant

So for the example in the issue (relevant chunk is second):

Chunk 1 (Brandenburg Gate): not relevant = 0
Chunk 2 (Eiffel Tower): relevant = 1
Final score: (0 * 0 + 1/2 * 1) / 1 = 0.5, which matches what the real RAGAS library returns
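
For reviewers who want to check the arithmetic, here is a minimal sketch of the positional formula above. It is not the code in this PR: the function name `positional_context_precision` is hypothetical, the per-chunk verdicts are assumed to already be available as 0/1 values (in the scorer they would come from the LLM), and returning 0.0 when no chunk is relevant is an assumption made to avoid dividing by zero.

```python
from typing import Sequence


def positional_context_precision(verdicts: Sequence[int]) -> float:
    """Positional precision over ranked chunks (illustrative helper).

    verdicts[i] is 1 if the chunk at rank i + 1 was judged relevant, else 0.
    """
    total_relevant = sum(verdicts)
    if total_relevant == 0:
        # Assumption: no relevant chunk means a score of 0.0.
        return 0.0

    relevant_so_far = 0
    weighted_sum = 0.0
    for rank, verdict in enumerate(verdicts, start=1):
        relevant_so_far += verdict
        # precision_at_k: share of relevant chunks among the first k.
        precision_at_k = relevant_so_far / rank
        # Only ranks that hold a relevant chunk contribute to the sum.
        weighted_sum += precision_at_k * verdict
    return weighted_sum / total_relevant


# Example from the issue: Brandenburg Gate (0), then Eiffel Tower (1).
print(positional_context_precision([0, 1]))  # 0.5
# Same chunks with the relevant one ranked first.
print(positional_context_precision([1, 0]))  # 1.0
```

Swapping the two chunks moves the score from 0.5 to 1.0, which is the ordering sensitivity the joined-string version could not express.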

Also made sure the scorer still works when context is passed as a plain string (not a list), so it's fully backward compatible.
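
One way to picture the backward-compatible path is a small normalization step that treats a plain string as a one-chunk list. This is only a sketch under that assumption: `as_chunk_list` is a hypothetical helper name, not a function from this PR or from autoevals.

```python
from typing import List, Sequence, Union


def as_chunk_list(context: Union[str, Sequence[str]]) -> List[str]:
    """Normalize context into a list of chunks (illustrative helper)."""
    # A str is itself a sequence of characters, so check for it first;
    # otherwise list("abc") would split it into ["a", "b", "c"].
    if isinstance(context, str):
        return [context]
    return list(context)


print(as_chunk_list("The Eiffel Tower is in Paris."))
print(as_chunk_list(["Brandenburg Gate ...", "Eiffel Tower ..."]))
```

With a single chunk, the positional formula reduces to that chunk's verdict (0 or 1), so under this assumption string inputs would keep the binary behavior they had before.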

fix: ContextPrecision now scores by position of relevant context (closes braintrustdata#176)

9d01d64

Nishtha Nabya (NishthaNabya) wants to merge 1 commit into braintrustdata:main from NishthaNabya:main
