Add FlashRank reranker to HybridRetriever to improve retrieval quality by GovindhKishore · Pull Request #116 · reactome/reactome_chatbot

GovindhKishore · 2026-03-01T09:03:09Z

Summary

Adds a reranking layer to HybridRetriever in csv_chroma.py to address the issue of responses becoming increasingly long and noisy as more data sources are integrated into the retrieval pipeline.

Problem

The current pipeline retrieves documents from multiple subdirectories using BM25 + SelfQuery + MultiQuery expansion, resulting in ~90 documents being passed directly to create_stuff_documents_chain.

There is no cross-subdirectory relevance filtering - all retrieved documents are stuffed into the LLM prompt regardless of how relevant they are to the original user query. This causes:

Responses becoming longer and noisier as more data is added
Low-relevance documents from one subdirectory treated equally to high-relevance documents from another
The LLM receiving too much context, which reduces answer precision

Solution

A new module src/retrievers/reranker.py is introduced using FlashRank (ms-marco-MiniLM-L-12-v2). After weighted_reciprocal_rank merges results across all subdirectories, the reranker scores every retrieved document against the original user query using a cross-encoder model and returns only the top N most relevant documents.
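The score-then-truncate step described above can be sketched as follows. This is a minimal illustration, not the code in src/retrievers/reranker.py: the `Document` dataclass is a stand-in for LangChain's `Document`, the `scorer` parameter and `token_overlap_score` are hypothetical placeholders for the FlashRank cross-encoder, and the default `top_n=3` is an assumption. In the PR, scoring would instead go through FlashRank's `Ranker` and `RerankRequest`.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    """Minimal stand-in for LangChain's Document (assumption for this sketch)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


# Hypothetical scorer type: (query, passage text) -> relevance score.
Scorer = Callable[[str, str], float]


def token_overlap_score(query: str, text: str) -> float:
    """Toy scorer: fraction of query tokens that appear in the text.

    A placeholder for the cross-encoder; this is NOT what FlashRank computes.
    """
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & text_tokens) / len(query_tokens)


def rerank(query: str, docs: list[Document], top_n: int = 3,
           scorer: Scorer = token_overlap_score) -> list[Document]:
    """Score every document against the query and keep the top_n best."""
    scored = [(scorer(query, doc.page_content), doc) for doc in docs]
    # sort() is stable, so documents with equal scores keep their merged order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_n]]


docs = [
    Document("RNA polymerase II transcription"),
    Document("TP53 activates apoptosis through BAX"),
    Document("Reactome database overview"),
    Document("TP53 and PUMA in intrinsic apoptosis"),
]
top = rerank("TP53 apoptosis", docs, top_n=2)
for doc in top:
    print(doc.page_content)
```

Because the function takes and returns a list of `Document` objects, it can sit between the merge step and the downstream chain without changing either side.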

Two functions are provided:

rerank() - sync, called by retrieve_documents()
arerank() - async, called by aretrieve_documents()

arerank() uses asyncio.to_thread to run the blocking FlashRank inference in a background thread without freezing the async event loop.
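The async wrapper pattern can be illustrated with a small sketch. It assumes a blocking function standing in for the sync reranker (`blocking_rerank` is hypothetical and uses `time.sleep` to simulate inference); only `asyncio.to_thread` is taken from the description above. The heartbeat task is there to show that the event loop keeps running while the blocking call executes in a worker thread.

```python
import asyncio
import time


def blocking_rerank(query: str, texts: list[str], top_n: int) -> list[str]:
    """Hypothetical stand-in for the sync reranker: slow, blocking work."""
    time.sleep(0.05)  # simulates CPU-bound model inference
    # Texts containing the query string sort first (toy relevance signal).
    ranked = sorted(texts, key=lambda t: query.lower() in t.lower(), reverse=True)
    return ranked[:top_n]


async def arerank(query: str, texts: list[str], top_n: int = 3) -> list[str]:
    """Run the blocking call in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(blocking_rerank, query, texts, top_n)


async def demo() -> tuple[list[str], int]:
    ticks = 0

    async def heartbeat() -> None:
        # Counts how often the event loop gets to run during the rerank.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.005)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    result = await arerank("tp53", ["Reactome overview", "TP53 and BAX"], top_n=1)
    beat.cancel()
    return result, ticks


result, ticks = asyncio.run(demo())
print(result, "event loop ticks during rerank:", ticks)
```

Calling `blocking_rerank` directly inside the coroutine would leave `ticks` at zero, since nothing else could run until it returned. Note that `asyncio.to_thread` requires Python 3.9 or later.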

Changes

src/retrievers/reranker.py - new module containing reranking logic
src/retrievers/csv_chroma.py - import reranker, update return statements in both retrieve_documents() and aretrieve_documents()
config_default.yml - add reranker configuration block
pyproject.toml / poetry.lock - add flashrank dependency

Why FlashRank

Runs locally - no API key required
CPU only - no GPU needed
Lightweight (FlashRank's smallest model is ~4MB; ms-marco-MiniLM-L-12-v2 is ~34MB)
No changes to downstream pipeline - same list[Document] type returned throughout

Impact

Since csv_chroma.py is shared by both Reactome and UniProt retrievers, reranking applies automatically to all current and future database integrations without any additional changes.

Test

# Input: 7 documents (mix of relevant and irrelevant)
# Query: "What does TP53 do in apoptosis?"

# Output after reranking (top 3):
# 1. score=0.9996 | TP53 activates apoptosis through BAX
# 2. score=0.9860 | TP53 and PUMA in intrinsic apoptosis  
# 3. score=0.8930 | p53 regulates cell death signalling

# Correctly dropped:
# RNA polymerase II transcription      (irrelevant)
# Reactome database overview           (irrelevant)
# General cancer pathway summary       (irrelevant)

Note

This contribution was developed with AI assistance (Claude) for understanding the codebase and implementation guidance. All code has been reviewed and understood.

Closes #115

Happy to make any changes based on maintainer feedback.

GovindhKishore · 2026-03-02T11:31:59Z

Hi @adamjohnwright @GFJHogue,

Just flagging this PR for your attention when you get a chance. It directly addresses the retrieval noise problem raised in several issues, and since it touches csv_chroma.py, which is shared by both the Reactome and UniProt retrievers, I wanted to make sure the right people are aware of it.

Happy to:

Add unit tests if needed
Adjust the top_n default value in config_default.yml
Discuss alternative reranking models if FlashRank is not preferred

Looking forward to any feedback!

adamjohnwright · 2026-03-02T13:50:06Z

@heliamoh are you able to take a look to see if this resolves the issue(s)?

Add FlashRank reranker to HybridRetriever to improve retrieval quality

cad3267

This was referenced Mar 4, 2026

[Feature] Reduce Response Verbosity by Improving System Prompt Precision #117 (Open)
Improve response precision by refining system prompts for Reactome and UniProt retrievers #121 (Open)

This was referenced Mar 9, 2026

feat: HybridRetriever passes noisy, low-relevance documents to the LLM #132 (Open)
feat: add EmbeddingsFilter contextual compression to HybridRetriever … #133 (Open)

This was referenced Mar 10, 2026

fix(retriever): HybridRetriever passes unlimited documents to LLM with no token budget #138 (Open)
feat(retriever): add token-aware context truncation to cap documents passed to LLM #139 (Open)

GovindhKishore wants to merge 1 commit into reactome:main from GovindhKishore:feature/flashrank-reranking
