feat: add LLM-based query router for RAG, MCP search, and MCP analysis #142

Open
GovindhKishore wants to merge 4 commits into reactome:main from
GovindhKishore:feat/mcp-query-routing

@GovindhKishore

Summary

Adds LLM-based query routing to ReactToMeGraphBuilder so questions are directed to the correct retrieval path before any tool calls are made.

Closes #141

Context

PRs #127 and #137 introduced MCP tools into the agent. Without routing, every question goes through the tool-calling loop even when the vector database already has the answer. This PR adds a lightweight classification step that runs before retrieval and directs each question to the right path.

What Changed

src/mcp/query_router.py (new)

  • create_query_router(llm) returns an async route() function
  • Classifies questions into rag, mcp_search, or mcp_analysis
  • Uses a structured prompt with explicit criteria for each category
  • Falls back to rag on unexpected output
  • Intended to be called with a lightweight model (gpt-4o-mini)
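
For readers skimming the description, the bullets above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the prompt wording, the `VALID_ROUTES` constant, and the `StubLLM` class are assumptions made for the example, and the only thing assumed about the model is an async `ainvoke` method that returns either a string or an object with a `content` attribute.

```python
import asyncio

VALID_ROUTES = ("rag", "mcp_search", "mcp_analysis")

# Hypothetical prompt; the real prompt in query_router.py may differ.
ROUTER_PROMPT = """Classify the question into exactly one category.
rag: general knowledge answerable from static embeddings
mcp_search: live lookup by identifier, name, species list, or db metadata
mcp_analysis: pathway enrichment on an explicit list of gene/protein identifiers
Answer with the category name only.

Question: {question}"""


def create_query_router(llm):
    """Return an async route() that maps a question to one of VALID_ROUTES."""

    async def route(question: str) -> str:
        response = await llm.ainvoke(ROUTER_PROMPT.format(question=question))
        # Accept either a message-like object with .content or a plain string.
        text = getattr(response, "content", response)
        label = str(text).strip().lower()
        # Anything outside the three known labels falls back to rag.
        return label if label in VALID_ROUTES else "rag"

    return route


class StubLLM:
    """Hypothetical stand-in for a chat model; always returns a canned reply."""

    def __init__(self, reply):
        self.reply = reply

    async def ainvoke(self, prompt):
        return self.reply


if __name__ == "__main__":
    route = create_query_router(StubLLM("mcp_search"))
    print(asyncio.run(route("Look up a pathway by its identifier")))
```

Normalising the reply before comparing it keeps the fallback strict: only an exact label selects an MCP path, so a verbose or malformed reply lands on the existing RAG behaviour.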

src/agent/profiles/react_to_me.py (modified)

  • Two route-specific LLM instances built at init time - llm_with_search_tools
    and llm_with_analysis_tools - each bound to the relevant tool subset only
  • call_model now calls the router first and branches accordingly:
    • rag - existing RAG path, no MCP involved
    • mcp_search - LLM with search/lookup tools only
    • mcp_analysis - LLM with analyze_identifiers only
  • When MCP_SERVER_PATH is not set, routing is skipped entirely and existing
    RAG behaviour is unchanged
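
The branching described above might look something like the sketch below. It is a simplified stand-in rather than the actual `ReactToMeGraphBuilder` code: `RoutedAgentSketch`, `make_stub`, and `fixed_route` are hypothetical names, each path is modelled as a plain async callable, and passing `route=None` models the case where `MCP_SERVER_PATH` is not set.

```python
import asyncio


class RoutedAgentSketch:
    """Hypothetical, stripped-down model of the routing branch in call_model."""

    def __init__(self, rag_path, route=None, search_llm=None, analysis_llm=None):
        self.rag_path = rag_path
        self.route = route  # None models "MCP_SERVER_PATH is not set"
        # Built once at init time; nothing is rebound per message.
        self.branches = {"mcp_search": search_llm, "mcp_analysis": analysis_llm}

    async def call_model(self, question):
        if self.route is None:
            # Routing skipped entirely: existing RAG behaviour.
            return await self.rag_path(question)
        label = await self.route(question)
        branch = self.branches.get(label)
        if branch is None:
            # "rag" (or any label without a bound LLM) uses the RAG path.
            return await self.rag_path(question)
        return await branch(question)


def make_stub(name):
    """Return an async callable that tags the question with the path name."""

    async def stub(question):
        return f"{name}:{question}"

    return stub


def fixed_route(label):
    """Return an async router stub that always answers with the same label."""

    async def route(question):
        return label

    return route
```

Keeping the label-to-LLM mapping in a dictionary means the `rag` case needs no entry of its own; it is simply whatever is left when no MCP branch matches.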

Routing Logic

rag - general knowledge questions answerable from static embeddings
mcp_search - live lookup by identifier, name, species list, or db metadata
mcp_analysis - pathway enrichment on an explicit list of gene/protein identifiers
               (requires both analysis intent AND a list - a list alone routes to rag)

Testing

Manual testing against a running MCP server is pending.

The routing logic has been reviewed against the prompt criteria for the following question types:

  • General knowledge questions → rag
  • Lookup by identifier or name → mcp_search
  • Enrichment analysis on a gene list → mcp_analysis
  • Ambiguous or unclear questions → rag fallback
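
Until manual testing is done, the question types above could be checked with a small harness along these lines. This is only a sketch: the example questions and gene symbols are made up for illustration, `check_router` and `always_rag` are hypothetical names, and the harness assumes nothing about the router beyond it being an async callable that returns a route label.

```python
import asyncio

# Illustrative cases only; the questions are invented for this sketch.
CASES = [
    ("What is apoptosis?", "rag"),
    ("Look up the pathway named 'Signaling by EGFR'", "mcp_search"),
    ("Run enrichment analysis on TP53, BRCA1, EGFR", "mcp_analysis"),
    ("TP53, BRCA1, EGFR", "rag"),  # a list alone, no analysis intent
]


async def check_router(route, cases):
    """Return (question, expected, actual) for every case the router gets wrong."""
    failures = []
    for question, expected in cases:
        actual = await route(question)
        if actual != expected:
            failures.append((question, expected, actual))
    return failures


async def always_rag(question):
    """Deliberately naive router used to show what a failure report looks like."""
    return "rag"


if __name__ == "__main__":
    for question, expected, actual in asyncio.run(check_router(always_rag, CASES)):
        print(f"{question!r}: expected {expected}, got {actual}")
```

Swapping `always_rag` for the real `route()` function would turn the same table into a regression check for the prompt criteria.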

Notes

  • Router uses gpt-4o-mini separately from the main generation model to keep
    classification cost low
  • Tool subset filtering happens at init time via bind_tools - no rebinding
    per message
  • Routing is only active when MCP_SERVER_PATH is set - zero impact on
    deployments without MCP
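
As a rough illustration of the last two notes, the tool split and the environment gate could be sketched as below. The names `ToolStub`, `split_tools`, `routing_enabled`, and the search tool names used in the usage lines are assumptions for this example; only `analyze_identifiers`, `bind_tools`, and `MCP_SERVER_PATH` come from the description above.

```python
from dataclasses import dataclass

ANALYSIS_TOOL_NAMES = frozenset({"analyze_identifiers"})


@dataclass(frozen=True)
class ToolStub:
    """Hypothetical stand-in for an MCP tool; only the name matters here."""

    name: str


def split_tools(tools):
    """Partition tools into (search, analysis) subsets by name, preserving order."""
    analysis = [t for t in tools if t.name in ANALYSIS_TOOL_NAMES]
    search = [t for t in tools if t.name not in ANALYSIS_TOOL_NAMES]
    return search, analysis


def routing_enabled(env):
    """Routing is active only when MCP_SERVER_PATH is set to a non-empty value."""
    return bool(env.get("MCP_SERVER_PATH"))


if __name__ == "__main__":
    # Tool names other than analyze_identifiers are invented for the example.
    tools = [ToolStub("search_reactome"), ToolStub("analyze_identifiers")]
    search, analysis = split_tools(tools)
    # At init time each subset would be bound once, e.g. llm.bind_tools(search).
    print([t.name for t in search], [t.name for t in analysis])
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, keeps the gate easy to test without touching the process environment.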

AI assistance was used in drafting and implementation. All changes reviewed and verified
by me.

Development

Successfully merging this pull request may close these issues.

feat: route questions to RAG or MCP tools based on query intent
