fix: remove hard-coded absolute paths from evaluator #113

Open
AaryanCode69 wants to merge 1 commit into reactome:main from AaryanCode69:fix/evaluator-hardcoded-paths

@AaryanCode69

Summary

Remove hard-coded absolute paths from the evaluation script and make them configurable via CLI arguments, with automatic fallback to the project's existing EmbeddingEnvironment resolver.

Problem

src/evaluation/evaluator.py contained two hard-coded absolute paths pointing to a specific developer's local machine:

# Line 74 — CSV path for BM25 retrieval
loader = CSVLoader(
    "/Users/hmohammadi/Desktop/react_to_me_github/reactome_chatbot/embeddings/.../summations.csv"
)

# Line 182 — ChromaDB embeddings directory
embeddings_directory = "/Users/hmohammadi/Desktop/react_to_me_github/reactome_chatbot/embeddings/.../summations"

This made it impossible to:

  • Run the evaluator on any other machine
  • Use it in CI/CD pipelines
  • Containerize the evaluation workflow with Docker

Solution

Replace the hard-coded paths with two new CLI arguments that integrate with the project's existing EmbeddingEnvironment utility:

# Zero-config: auto-resolves from embeddings/current
python evaluator.py --testset_dir ./example --rag_type advanced

# Explicit override when needed
python evaluator.py --testset_dir ./example --rag_type advanced \
  --embeddings_dir ./embeddings/openai/text-embedding-3-large/reactome/Release90/summations \
  --csv_path ./embeddings/openai/text-embedding-3-large/reactome/csv_files/summations.csv
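The two invocations above could be backed by an argparse setup along these lines. This is a minimal sketch, not the code from the PR: the flag names come from the description, while the `build_parser` helper, the argument types, and the defaults are assumptions made for illustration.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; only the flag names are taken from the PR text."""
    parser = argparse.ArgumentParser(description="Evaluate a RAG chain against a test set.")
    # Existing arguments mentioned in the description (required-ness and defaults are guesses).
    parser.add_argument("--testset_dir", type=Path, required=True)
    parser.add_argument("--rag_type", default="advanced")
    # New optional overrides: None means "resolve the path automatically".
    parser.add_argument("--embeddings_dir", type=Path, default=None)
    parser.add_argument("--csv_path", type=Path, default=None)
    return parser


# Zero-config invocation: both overrides stay None and are resolved later.
zero_config = build_parser().parse_args(["--testset_dir", "./example", "--rag_type", "advanced"])
```

Using `None` as the default keeps "not provided" distinguishable from any real path, so the fallback logic can decide per argument.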

Path Resolution Logic

--embeddings_dir provided?
  ├── Yes → use it directly
  └── No  → EmbeddingEnvironment.get_dir("reactome") / "summations"

--csv_path provided?
  ├── Yes → use it directly
  └── No  → <parent of embeddings_dir> / "csv_files" / "summations.csv"

Both paths are validated for existence before proceeding.
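The decision tree above can be sketched as a small pure function. This is an illustration under stated assumptions, not the PR's `_resolve_paths()`: the name `resolve_paths` is hypothetical, and the `default_reactome_dir` callable stands in for `EmbeddingEnvironment.get_dir("reactome")` so that the sketch runs without the project installed. Existence checks are left out here to keep the focus on the fallback order.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple


def resolve_paths(
    embeddings_dir: Optional[Path],
    csv_path: Optional[Path],
    default_reactome_dir: Callable[[], Path],  # stand-in for EmbeddingEnvironment.get_dir("reactome")
) -> Tuple[Path, Path]:
    """Apply the fallback rules: explicit values win, otherwise derive the path."""
    if embeddings_dir is None:
        # Fallback: <resolved reactome dir> / "summations"
        embeddings_dir = default_reactome_dir() / "summations"
    if csv_path is None:
        # Fallback: <parent of embeddings_dir> / "csv_files" / "summations.csv"
        csv_path = Path(embeddings_dir).parent / "csv_files" / "summations.csv"
    return Path(embeddings_dir), Path(csv_path)


# "/data/reactome" is a made-up location used only for this example.
emb, csv = resolve_paths(None, None, lambda: Path("/data/reactome"))
```

Because the CSV fallback is derived from whichever embeddings directory ends up being used, passing only --embeddings_dir also moves the default CSV location.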

Changes

File: src/evaluation/evaluator.py
Change: Removed 2 hard-coded absolute paths; added --embeddings_dir and --csv_path CLI args; added _resolve_paths() helper with EmbeddingEnvironment fallback and existence validation; updated initialize_rag_chain_with_memory() signature to accept csv_path

Design Decisions

  • CLI arguments over environment variables: consistent with the script's existing argparse interface (--testset_dir, --model, --rag_type) and with the companion test_generator.py, which also uses argparse.
  • Fallback to EmbeddingEnvironment: reuses the project's existing path resolution utility (src/util/embedding_environment.py) rather than inventing a new mechanism. This is the same resolver the main application uses at runtime.
  • Both arguments optional: developers who have already run the embeddings_manager use command get zero-config usage, while CI and Docker can pass explicit paths.
  • Existence validation: _resolve_paths() raises FileNotFoundError with a clear message before any expensive initialization runs.
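The fail-fast validation described in the last bullet might look roughly like the following. This is a sketch under assumptions: `validate_paths` is a hypothetical standalone helper (the PR folds the check into `_resolve_paths()`), the message wording is invented, and using `is_dir()`/`is_file()` rather than a plain `exists()` check is a stricter choice made here for illustration.

```python
from pathlib import Path


def validate_paths(embeddings_dir: Path, csv_path: Path) -> None:
    """Raise FileNotFoundError early, before any vector store or model is loaded."""
    if not embeddings_dir.is_dir():
        raise FileNotFoundError(
            f"Embeddings directory not found: {embeddings_dir}. "
            "Pass --embeddings_dir to point at an existing directory."
        )
    if not csv_path.is_file():
        raise FileNotFoundError(
            f"CSV file not found: {csv_path}. "
            "Pass --csv_path to point at an existing file."
        )
```

Calling such a check right after argument parsing means a mistyped path surfaces within milliseconds instead of after retrievers and models have been initialized.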

Backward Compatibility

  • All existing CLI arguments remain unchanged.
  • The script behaves exactly as before when --embeddings_dir and --csv_path are set to the old hard-coded values.
  • No changes to any other files in the repository.

Related Issue

Resolves #109

The evaluator script contained two hard-coded absolute paths pointing
to a specific developer's local machine, making it impossible to run
on other machines, in CI, or inside Docker containers.

- Add --embeddings_dir and --csv_path CLI arguments for explicit control
- Fall back to EmbeddingEnvironment.get_dir() for zero-config usage
- Add _resolve_paths() helper with existence validation
- Remove all absolute path references from the file

Resolves reactome#109

Development

Successfully merging this pull request may close these issues.

refactor: make evaluator paths configurable via CLI arguments
