Add Claude Code and Gemini transcript support for context importer #159
Ankit-Kotnala wants to merge 2 commits into XortexAI:main from
Conversation
@ishaanxgupta Please review this.
Code Review
This pull request centralizes transcript parsing logic into a new utility module, src/utils/transcripts.py, replacing duplicate implementations in server.py and src/api/routes/memory.py. The new shared parser adds support for Claude Code (JSONL), Gemini CLI, and Claude-style markdown exports while implementing logic to filter out tool calls and thinking blocks. Feedback identified several instances where consecutive user messages would be overwritten rather than concatenated, leading to potential data loss. Additionally, a bug was found in the text cleaning utility that would incorrectly strip markdown list markers.
```python
def _clean_text(text: str) -> str:
    return text.strip().strip("-").strip()
```
The `strip("-")` call in `_clean_text` is problematic for Markdown content. It will remove leading bullet points from list items (e.g., `- Item` becomes `Item`) and can strip horizontal rules or other intentional formatting. It should be removed to preserve the integrity of the message content.
Suggested change:

```diff
 def _clean_text(text: str) -> str:
-    return text.strip().strip("-").strip()
+    return text.strip()
```
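To make the bug concrete, here is a small reproduction (the `_buggy`/`_fixed` function names are illustrative, not from the PR):

```python
def clean_text_buggy(text: str) -> str:
    # Original behavior: also strips leading/trailing hyphens,
    # which mangles Markdown list markers and horizontal rules.
    return text.strip().strip("-").strip()

def clean_text_fixed(text: str) -> str:
    # Suggested fix: whitespace-only strip preserves Markdown syntax.
    return text.strip()

print(clean_text_buggy("- Item"))  # "Item" -- bullet marker lost
print(clean_text_buggy("---"))     # ""     -- horizontal rule erased
print(clean_text_fixed("- Item"))  # "- Item"
```

Note that `str.strip("-")` treats its argument as a set of characters to remove, not a prefix, which is why the entire `---` rule disappears.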
```python
if role in _USER_ROLES:
    flush_pair()
    current_user_query = text
elif current_user_query:
    assistant_chunks.append(text)
```
This logic discards previous user messages if multiple user turns occur consecutively without an intervening assistant response. In many chat transcripts (especially from CLI tools or when users send multiple fragments), it's better to concatenate consecutive user messages to ensure no context is lost during import.
Suggested change:

```diff
 if role in _USER_ROLES:
-    flush_pair()
-    current_user_query = text
+    if assistant_chunks:
+        flush_pair()
+        current_user_query = text
+    else:
+        current_user_query = (current_user_query + "\n\n" + text) if current_user_query else text
 elif current_user_query:
     assistant_chunks.append(text)
```
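The suggested pairing logic can be sketched as a self-contained function (the role set and helper structure here are illustrative stand-ins for the PR's `_USER_ROLES` and `flush_pair`, not its actual code):

```python
USER_ROLES = {"user", "human"}  # illustrative stand-in for _USER_ROLES

def pair_turns(messages):
    """Fold a flat list of (role, text) tuples into (user_query, assistant_reply) pairs."""
    pairs = []
    current_user_query = ""
    assistant_chunks = []

    def flush_pair():
        nonlocal current_user_query, assistant_chunks
        if current_user_query and assistant_chunks:
            pairs.append((current_user_query, "\n\n".join(assistant_chunks)))
        current_user_query = ""
        assistant_chunks = []

    for role, text in messages:
        if role in USER_ROLES:
            if assistant_chunks:
                # Previous pair is complete; flush it and start a new query.
                flush_pair()
                current_user_query = text
            else:
                # Consecutive user turns: concatenate instead of overwrite.
                current_user_query = (
                    current_user_query + "\n\n" + text if current_user_query else text
                )
        elif current_user_query:
            assistant_chunks.append(text)

    flush_pair()
    return pairs
```

With this shape, `[("user", "a"), ("user", "b"), ("assistant", "c")]` yields one pair whose query is `"a\n\nb"`, so the first fragment is no longer lost.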
```python
if role in _USER_ROLES:
    flush_pair()
    current_user_query = content
elif role in assistant_roles and current_user_query:
    assistant_chunks.append(content)
```
Similar to the JSON parser, this role-heading parser also overwrites the current_user_query if multiple user headings are encountered before an assistant response. Concatenating them would prevent data loss.
Suggested change:

```diff
 if role in _USER_ROLES:
-    flush_pair()
-    current_user_query = content
+    if assistant_chunks:
+        flush_pair()
+        current_user_query = content
+    else:
+        current_user_query = (current_user_query + "\n\n" + content) if current_user_query else content
 elif role in assistant_roles and current_user_query:
     assistant_chunks.append(content)
```
```python
if section.startswith("**User**"):
    current_user_query = section.replace("**User**", "", 1).strip()
```
In the Cursor transcript parser, consecutive user sections will result in the earlier sections being lost. Consider concatenating them to preserve all user input.
```python
if section.startswith("**User**"):
    content = section.replace("**User**", "", 1).strip()
    if current_user_query:
        current_user_query += "\n\n" + content
    else:
        current_user_query = content
```
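The same accumulation pattern can be exercised end to end with a minimal section parser (a sketch under assumptions: the `parse_cursor_sections` name, the `**Assistant**` heading, and the flat list-of-sections input are illustrative, not the PR's actual interface):

```python
def parse_cursor_sections(sections):
    """Pair **User** sections with the following **Assistant** section,
    accumulating consecutive user sections instead of overwriting them."""
    pairs = []
    current_user_query = ""
    for section in sections:
        if section.startswith("**User**"):
            content = section.replace("**User**", "", 1).strip()
            # Concatenate consecutive user sections to avoid data loss.
            current_user_query = (
                current_user_query + "\n\n" + content if current_user_query else content
            )
        elif section.startswith("**Assistant**") and current_user_query:
            reply = section.replace("**Assistant**", "", 1).strip()
            pairs.append((current_user_query, reply))
            current_user_query = ""
    return pairs
```

Two back-to-back user sections followed by one assistant section now produce a single pair containing both user fragments.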
Summary
Fixes #155.
Adds deterministic transcript parsing support for additional `/context upload` formats:
- `/chat share` JSON exports
- `/chat share` Markdown exports

Also keeps existing Cursor and Antigravity behavior by moving transcript parsing into a shared helper used by both the production memory route and the legacy server entrypoint.
Changes
- Added `src/utils/transcripts.py` as the shared transcript parser module.
- Updated `/v1/memory/parse_transcript` to use the shared parser.
- Updated the `server.py` parsing wrapper to use the same shared parser.