Multiple failures with small LLM (Ollama) #160

@Ehsanstp

Description

SelfQueryRetriever causing syntax crash and completeness_grader throwing AttributeError

  • OS: Windows 11 (WSL 2, Ubuntu)
  • Embedding model: sentence-transformers/all-MiniLM-L6-v2
  • Embeddings dataset version: Reactome Release 96
  • GraphDB version: reactome/graphdb:latest (Release 96)
  • Generation backend: Ollama (currently: qwen2.5:0.5b)

BUGS:

A. AttributeError in completeness_grader.py

This is an unhandled exception in the assess_completeness node of the external_search workflow. When the LLM output does not conform to the CompletenessGrade schema, with_structured_output() returns None instead of raising, and the subsequent attribute access crashes. The chatbot response has already been fully streamed to the UI before postprocess(), so the user still sees an answer regardless of the crash; what breaks is the Tavily web-search fallback.

Error log:

biochat_chainlit  |   File "/app/src/external_search/completeness_grader.py", line 50, in ainvoke
biochat_chainlit  |     return {"external_search": result.binary_score}
biochat_chainlit  |                                ^^^^^^^^^^^^^^^^^^^
biochat_chainlit  | AttributeError: 'NoneType' object has no attribute 'binary_score'
biochat_chainlit  | During task with name 'assess_completeness' and id 'cfcc2097-33e8-405a-c0a7-439fda1f79d0' 

Cause:

  1. Wrong format: small LLMs often return plain text or malformed JSON instead of the expected schema.
  2. If generation exceeds the context window, the truncated output cannot be parsed.
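Both causes end in the same failure mode. A minimal reproduction of the crash and a defensive alternative, with no LangChain required (the stand-in None mimics what with_structured_output() returns on a schema mismatch):

```python
# Stand-in for a failed structured-output parse: with_structured_output()
# returns None when the model's reply does not match the Pydantic schema.
result = None

try:
    result.binary_score  # what completeness_grader.py does on line 50
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'binary_score'

# Defensive access avoids the crash; "No" is a safe default (see FIX below).
score = getattr(result, "binary_score", None) or "No"
print(score)  # No
```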

B. SelfQueryRetriever syntax crash

The LLM generates a structured query with invalid filter syntax, which crashes the query parser.

Error log:

(error log attached as a screenshot in the original issue)

Cause:

Small models generate Python-style boolean expressions (and) instead of the expected filter syntax (and_()), so the parser raises an "Unexpected token" error.

FIX:

Both bugs have been investigated and fixes are ready as a PR.

BUG 1:

Add a three-tier fallback:

  • structured output
  • raw-text parsing to extract the grade
  • hard default: "No"

"No" is used as the hard default as it triggers web search rather than skipping it altogether if the grader state is unknown. Cost of false "No" is one unnecessary Tavily call.

BUG 2:

Pass keyword arguments via chain_kwargs in SelfQueryRetriever.from_llm(); this reduces the generation of invalid filter components, although it is not guaranteed to catch all malformed filters. In addition, modifying the prompt or adding concrete examples in the functional format the model can follow helps: and_(eq("gene", "TP53"), eq("synonyms_geneName", "degradation")).
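Independent of the prompt-level fix, a belt-and-braces guard is to fall back to plain similarity search when filter construction blows up. A minimal sketch (the wrapper name and the assumption that parse errors surface as ordinary exceptions are mine, not the PR's):

```python
def retrieve_with_fallback(self_query_retriever, base_retriever, query: str):
    """Try the self-query retriever first; on a filter-parse failure
    (e.g. lark's "Unexpected token" error), fall back to the plain
    vector retriever so the chatbot still gets documents instead of
    crashing."""
    try:
        return self_query_retriever.invoke(query)
    except Exception:
        # Parse errors from malformed filters surface here; a plain
        # unfiltered similarity search is a usable degraded result.
        return base_retriever.invoke(query)
```

Catching broad Exception is deliberate here: with a 0.5B model almost any part of query construction can fail, and a degraded retrieval beats an unhandled crash in the workflow.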

Note: This won't be a problem with larger models. A similar issue was discussed in this GitHub issue: langchain-ai/langchain#9368
