
fix: multi-language support and web search citations (#104) #125

Open
bhavyakeerthi3 wants to merge 3 commits into reactome:main from bhavyakeerthi3:fix/multi-lang


🌟 Summary

This PR addresses Issue #104 ("RAG only responds in English") and completes the integration of external web search results. The chatbot can now answer queries written in languages other than English while keeping scientific terminology exact.


Key Improvement: Multi-Language Support (#104)

The Problem: The agent was hardcoded to English in the rephrasing steps and often failed to deliver the final response in the user's native language, even when that language had been detected correctly.

The Solution:

  • Enforced Target Language: Updated the summarize_reactome_uniprot system prompt to strictly prioritize the {detected_language} for the final response.
  • Scientific Precision: Added explicit instructions to preserve exact biological terminology (gene names, pathway IDs) during translation to ensure the expert summary remains accurate.
  • Cross-Lingual Strategy: Internal query rephrasing remains in English to maximize vector search accuracy in the primarily English Reactome/UniProt databases, while the final node handles the translation back to the user.
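
The prompt-side part of this change can be sketched roughly as follows. This is a minimal illustration, not the actual prompt from summarize_reactome_uniprot.py: the template wording, the constant `SUMMARY_SYSTEM_TEMPLATE`, and the helper `build_summary_system_prompt` are all hypothetical, and only the `{detected_language}` placeholder comes from the description above.

```python
from typing import Optional

# Hypothetical template text: the real system prompt is not shown in this PR
# description. Only the {detected_language} placeholder is taken from it.
SUMMARY_SYSTEM_TEMPLATE = (
    "You are an expert in molecular biology.\n"
    "Write the final answer in {detected_language}.\n"
    "Keep gene names, protein names, and pathway IDs exactly as they appear "
    "in the sources; do not translate them.\n"
)


def build_summary_system_prompt(detected_language: Optional[str]) -> str:
    """Fill the template, falling back to English when no language was detected."""
    language = (detected_language or "").strip() or "English"
    return SUMMARY_SYSTEM_TEMPLATE.format(detected_language=language)


if __name__ == "__main__":
    print(build_summary_system_prompt("French"))
```

Falling back to English when detection returns nothing is an assumption made for this sketch; the PR does not say how an undetected language is handled.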

Key Improvement: Web Search Integration

The Problem: While the graph logic for the Tavily fallback was fixed previously, the search results were not being piped to the final summarizer node.

The Solution:

  • Wired Data Flow: Updated generate_final_response in cross_database.py to correctly map and pass web_results to the summarization chain.
  • Citations: Updated the summarizer to cite and link external web sources clearly alongside Reactome and UniProt hits, providing a "comprehensive and insightful" fallback experience.
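
One way to picture the wiring and the citation formatting is the plain-Python sketch below. It assumes the graph state is a dictionary and that each web hit is a dictionary with `title` and `url` keys; the state keys and the helpers `format_web_citations` and `build_summarizer_inputs` are hypothetical and may not match what generate_final_response actually does.

```python
from typing import Any, Dict, List


def format_web_citations(web_results: List[Dict[str, str]]) -> str:
    """Render web hits as numbered '[n] title (url)' lines for the summarizer."""
    lines = []
    for index, hit in enumerate(web_results, start=1):
        title = hit.get("title", "").strip()
        url = hit.get("url", "").strip()
        # Fall back to the bare URL when a hit has no title.
        lines.append(f"[{index}] {title} ({url})" if title else f"[{index}] {url}")
    return "\n".join(lines)


def build_summarizer_inputs(state: Dict[str, Any]) -> Dict[str, Any]:
    """Map graph state to summarizer inputs (all key names are assumptions)."""
    return {
        "question": state.get("rephrased_input", ""),
        "detected_language": state.get("detected_language", "English"),
        "reactome_answer": state.get("reactome_answer", ""),
        "uniprot_answer": state.get("uniprot_answer", ""),
        # The step this PR describes as previously missing: forward web results.
        "web_results": format_web_citations(state.get("web_results") or []),
    }


if __name__ == "__main__":
    example_state = {
        "rephrased_input": "What does TP53 do?",
        "detected_language": "Spanish",
        "web_results": [{"title": "TP53 overview", "url": "https://example.org/tp53"}],
    }
    print(build_summarizer_inputs(example_state)["web_results"])
```

Passing an empty string when there are no web hits keeps the summarizer input shape the same whether or not the fallback search ran.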

Files Modified

| File | Change |
| --- | --- |
| src/agent/tasks/cross_database/summarize_reactome_uniprot.py | Enforced output language & cited web results |
| src/agent/profiles/cross_database.py | Wired web_results data flow to the final node |
| src/agent/tasks/rephrase.py | Clarified the "English-for-Search" strategy |

Verification

Verified the logic flow of the LangGraph graph. The agent now handles the full lifecycle: Multi-lingual Input → Precision English Search → Multi-DB Research → Native Language Summary with Citations.
