diff --git a/multi-representation-search/multi-representation-search.ipynb b/multi-representation-search/multi-representation-search.ipynb new file mode 100644 index 0000000..6421baa --- /dev/null +++ b/multi-representation-search/multi-representation-search.ipynb @@ -0,0 +1,561 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "2153bba9", + "metadata": {}, + "source": [ + "# Multi-Representation Search: Step-by-Step Build-Up\n", + "\n", + "A document is rarely well-represented by a single embedding. A research paper has a title, an abstract, body chunks, and category tags, each carrying a different signal. Treat all four as one dense vector and the title gets averaged out; chunk-level grounding for downstream reasoning disappears.\n", + "\n", + "This notebook builds a Qdrant retrieval pipeline that uses each representation deliberately. Over six steps you'll go from a naive dense-only baseline to a fully fused pipeline with four named-vector prefetches, Reciprocal Rank Fusion, document-level grouping, and optional formula-based score boosting. After each step you'll run the same query and see the top retrieved papers change.\n", + "\n", + "The design rationale (why each component is there, when to use it, when not to) lives in the accompanying [tutorial](https://qdrant.tech/documentation/tutorials-search-engineering/multi-representation-search/). This notebook focuses on running the code and watching the result list shift.\n", + "\n", + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/qdrant/examples/blob/master/multi-representation-search/multi-representation-search.ipynb)\n" + ] + }, + { + "cell_type": "markdown", + "id": "4b597568", + "metadata": {}, + "source": [ + "## Requirements\n", + "\n", + "This notebook uses [Qdrant Cloud Inference](https://qdrant.tech/documentation/inference/#qdrant-cloud-inference) to generate embeddings server-side, so no client-side embedding library is required. The free tier covers this notebook's footprint. Core BM25 runs on any Qdrant instance, but dense Cloud Inference is Cloud-only. To self-host, generate dense vectors on the client with a library like [FastEmbed](https://qdrant.tech/documentation/fastembed/) and pass them as raw vectors instead of `models.Document`.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "59028f90", + "metadata": {}, + "outputs": [], + "source": [ + "!pip install qdrant-client datasets" + ] + }, + { + "cell_type": "markdown", + "id": "c1e8c733", + "metadata": {}, + "source": [ + "## Dataset\n", + "\n", + "20 000 ML/CS arXiv papers (2018 and later) from the [`gfissore/arxiv-abstracts-2021`](https://huggingface.co/datasets/gfissore/arxiv-abstracts-2021) dataset. 
Each paper has a `title`, `abstract`, and `categories` (which this dataset returns as space-joined strings, so we split them before filtering).\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ed2823ba", + "metadata": {}, + "outputs": [], + "source": [ + "from datasets import load_dataset\n", + "\n", + "ML_CATEGORIES = {\"cs.LG\", \"cs.CV\", \"cs.CL\", \"cs.AI\", \"stat.ML\"}\n", + "\n", + "# Non-streaming so HF caches the parquet locally; first run downloads ~2.5 GB, re-runs are instant.\n", + "dataset = load_dataset(\"gfissore/arxiv-abstracts-2021\", split=\"train\")\n", + "\n", + "papers = []\n", + "# IDs are roughly chronological; iterate from the end to land on 2021/2020/2019 papers first.\n", + "for i in range(len(dataset) - 1, -1, -1):\n", + " if len(papers) >= 20000:\n", + " break\n", + " row = dataset[i]\n", + " if not row[\"abstract\"] or not row[\"title\"]:\n", + " continue\n", + " # categories arrive as space-joined strings (e.g. [\"cs.LG cs.CV\"]); split each entry.\n", + " cats = [tok for entry in row[\"categories\"] for tok in entry.split()]\n", + " if not any(c in ML_CATEGORIES for c in cats):\n", + " continue\n", + " # Year lives in the YYMM prefix of new-format arXiv IDs (\"2104.01234\" -> 2021).\n", + " arxiv_id = row[\"id\"]\n", + " if \"/\" in arxiv_id or \".\" not in arxiv_id:\n", + " continue # skip pre-2007 IDs like \"math/0506001\"\n", + " if 2000 + int(arxiv_id[:2]) < 2018:\n", + " continue\n", + " papers.append({\n", + " \"arxiv_id\": arxiv_id,\n", + " \"title\": row[\"title\"].strip(),\n", + " \"abstract\": row[\"abstract\"].strip(),\n", + " \"tags\": cats,\n", + " })\n", + "print(f\"Loaded {len(papers)} papers\")" + ] + }, + { + "cell_type": "markdown", + "id": "26339a5a", + "metadata": {}, + "source": [ + "## Schema\n", + "\n", + "One Qdrant collection. Each point is a chunk. Each chunk holds four named vectors that we'll fuse at query time:\n", + "\n", + "- `dense_chunk`: the chunk's own embedding (body content).\n", + "- `dense_title`: the paper title embedding (topical naming).\n", + "- `dense_abstract`: the paper abstract embedding (paper-level view).\n", + "- `sparse_title`: BM25 over the title (lexical matches on rare entity names, jargon, specific model or paper names).\n", + "\n", + "Categories live in the `tags` payload with a keyword index, so queries can pre-filter by category.\n", + "\n", + "`dense_title`, `dense_abstract`, and `sparse_title` are duplicated across every chunk of the same paper. That trades a bit of storage for one-shot query fusion (one collection, one Query API call, every representation reachable from any point). 
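Back-of-envelope on that storage cost: a 384-dim float32 vector is 384 × 4 bytes ≈ 1.5 KB, so duplicating `dense_title` and `dense_abstract` across, say, four chunks of the same paper adds three extra copies × ~3 KB ≈ 9 KB per paper (the duplicated sparse titles are comparatively tiny), or roughly 180 MB over 20,000 papers before any quantization. 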
For the typical case (a few dozen chunks per paper, embeddings under a kilobyte each) it's the simpler choice.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "788e1d18", + "metadata": {}, + "outputs": [], + "source": [ + "from qdrant_client import QdrantClient, models\n", + "\n", + "# Replace url and api_key with your own from https://cloud.qdrant.io\n", + "client = QdrantClient(\n", + " url=\"https://xyz-example.qdrant.io:6333\",\n", + " api_key=\"\",\n", + " cloud_inference=True,\n", + ")\n", + "\n", + "# 384 is the output dimension of sentence-transformers/all-minilm-l6-v2, used below for every dense vector.\n", + "client.create_collection(\n", + " collection_name=\"arxiv_multi_repr\",\n", + " vectors_config={\n", + " \"dense_chunk\": models.VectorParams(size=384, distance=models.Distance.COSINE),\n", + " \"dense_title\": models.VectorParams(size=384, distance=models.Distance.COSINE),\n", + " \"dense_abstract\": models.VectorParams(size=384, distance=models.Distance.COSINE),\n", + " },\n", + " sparse_vectors_config={\n", + " \"sparse_title\": models.SparseVectorParams(modifier=models.Modifier.IDF),\n", + " },\n", + ")\n", + "\n", + "# Index 'document_id' so the Query API can group by it; index 'tags' so we can filter on category.\n", + "client.create_payload_index(\n", + " collection_name=\"arxiv_multi_repr\",\n", + " field_name=\"document_id\",\n", + " field_schema=models.PayloadSchemaType.KEYWORD,\n", + ")\n", + "client.create_payload_index(\n", + " collection_name=\"arxiv_multi_repr\",\n", + " field_name=\"tags\",\n", + " field_schema=models.PayloadSchemaType.KEYWORD,\n", + ")\n" + ] + }, + { + "cell_type": "markdown", + "id": "295e1a01", + "metadata": {}, + "source": [ + "## Ingestion\n", + "\n", + "Embeddings are generated server-side via Qdrant Cloud Inference:\n", + "\n", + "- `sentence-transformers/all-minilm-l6-v2` (384-dim) for the three dense vectors.\n", + "- `qdrant/bm25` (core BM25 since Qdrant 1.15) for the sparse vector, with `avg_len=10.0` calibrated for the title-only field (default is 256, calibrated for document-length text).\n", + "\n", + "Chunking uses a fixed two-sentence window for simplicity; the right chunking strategy depends on your document structure. One point per chunk, with the title and abstract Documents reused across every chunk of the same paper.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "725afca6", + "metadata": {}, + "outputs": [], + "source": [ + "DENSE_MODEL = \"sentence-transformers/all-minilm-l6-v2\"\n", + "BM25_MODEL = \"qdrant/bm25\"\n", + "\n", + "def chunk_sentences(text, target_len=2):\n", + " \"\"\"Split text into ~2-sentence chunks; fall back to the full text if it doesn't split cleanly.\"\"\"\n", + " sentences = [s.strip() for s in text.split(\". \") if s.strip()]\n", + " return [\". 
\".join(sentences[i:i + target_len])\n", + " for i in range(0, len(sentences), target_len)] or [text]\n", + "\n", + "\n", + "points = []\n", + "for paper in papers:\n", + " chunks = chunk_sentences(paper[\"abstract\"])\n", + "\n", + " # Title, abstract, and sparse docs are reused across every chunk of this paper; only the chunk text varies.\n", + " # Cloud Inference embeds each Document on the server, so you don't need a client-side embedding library.\n", + " title_doc = models.Document(text=paper[\"title\"], model=DENSE_MODEL)\n", + " abstract_doc = models.Document(text=paper[\"abstract\"], model=DENSE_MODEL)\n", + " # avg_len is the average word count of the indexed text.\n", + " # Default is 256 (document-length); setting it to the actual field length (~10 here) improves BM25 scoring accuracy.\n", + " sparse_doc = models.Document(\n", + " text=paper[\"title\"],\n", + " model=BM25_MODEL,\n", + " options={\"avg_len\": 10.0},\n", + " )\n", + "\n", + " for i, chunk in enumerate(chunks):\n", + " points.append(models.PointStruct(\n", + " id=len(points),\n", + " vector={\n", + " \"dense_chunk\": models.Document(text=chunk, model=DENSE_MODEL),\n", + " \"dense_title\": title_doc,\n", + " \"dense_abstract\": abstract_doc,\n", + " \"sparse_title\": sparse_doc,\n", + " },\n", + " payload={\n", + " \"document_id\": paper[\"arxiv_id\"],\n", + " \"title\": paper[\"title\"],\n", + " \"tags\": paper[\"tags\"],\n", + " \"chunk_index\": i,\n", + " \"chunk_text\": chunk,\n", + " },\n", + " ))\n", + "\n", + "client.upload_points(collection_name=\"arxiv_multi_repr\", points=points, batch_size=256)\n", + "print(f\"Uploaded {len(points)} chunks across {len(papers)} papers\")" + ] + }, + { + "cell_type": "markdown", + "id": "61b1aa7b", + "metadata": {}, + "source": [ + "## Query Helpers\n", + "\n", + "Two pieces used by every step below:\n", + "\n", + "- `SAMPLE_QUERY` is the single query we run through every step so we can watch the same query produce different results as capabilities are added.\n", + "- `show_results(retrieve_fn)` runs the retrieve function and prints the top 5 results: title, category tags, and an excerpt from the matching chunk. Accepts both chunk-level results (Steps 1-4) and grouped results (Steps 5-6, where each result is a paper with several chunks).\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f70b01f8", + "metadata": {}, + "outputs": [], + "source": [ + "import textwrap\n", + "\n", + "SAMPLE_QUERY = \"diffusion models for image synthesis\"\n", + "\n", + "def show_results(retrieve_fn, query=SAMPLE_QUERY, k=5):\n", + " \"\"\"Print top-k results as: title, category tags, and a matching-chunk excerpt.\"\"\"\n", + " print(f\"Query: {query!r}\\n\")\n", + " for i, item in enumerate(retrieve_fn(query, limit=k), 1):\n", + " # item is a Point (Steps 1-4) or a Group (Steps 5-6).\n", + " # For groups, hits[0] is the top chunk for that paper.\n", + " point = item.hits[0] if hasattr(item, \"hits\") else item\n", + " payload = point.payload\n", + " title = payload[\"title\"]\n", + " tags = payload.get(\"tags\", [])\n", + " # Collapse whitespace (including embedded newlines) so the excerpt prints cleanly.\n", + " chunk = \" \".join(payload[\"chunk_text\"].split())\n", + " excerpt = chunk[:250].rstrip() + (\"...\" if len(chunk) > 250 else \"\")\n", + " print(textwrap.fill(f\"{i}. 
{title}\", width=140, initial_indent=\" \", subsequent_indent=\" \"))\n", + " if tags:\n", + " print(f\" [{', '.join(str(t) for t in tags[:3])}]\")\n", + " print(textwrap.fill(excerpt, width=140, initial_indent=\" \", subsequent_indent=\" \"))\n", + " print()\n" + ] + }, + { + "cell_type": "markdown", + "id": "4b9065fe", + "metadata": {}, + "source": [ + "## Step 1: Dense Over Chunks (Baseline)\n", + "\n", + "The naive baseline: encode the query with the dense model, search against `dense_chunk` only, return the chunk-level results' parent papers. No fusion, no title or sparse signal.\n", + "\n", + "This is what most \"vector search\" tutorials stop at. It's a reasonable default for short, homogeneous corpora where the chunk text already carries the full signal. It systematically underperforms when the signal lives outside the chunk: in the title (topical naming), or in keyword overlap that the embedding model has averaged out into a generic neighborhood.\n", + "\n", + "Each subsequent step closes one of those gaps.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "566dbbbd", + "metadata": {}, + "outputs": [], + "source": [ + "def retrieve_baseline(query, limit=10):\n", + " return client.query_points(\n", + " collection_name=\"arxiv_multi_repr\",\n", + " query=models.Document(text=query, model=DENSE_MODEL),\n", + " using=\"dense_chunk\",\n", + " limit=limit,\n", + " ).points\n", + "\n", + "show_results(retrieve_baseline)\n" + ] + }, + { + "cell_type": "markdown", + "id": "f710ce2f", + "metadata": {}, + "source": [ + "## Step 2: Add Sparse Title With RRF\n", + "\n", + "Add a second prefetch: BM25 over the title. Then fuse the two ranked lists with **Reciprocal Rank Fusion (RRF)**.\n", + "\n", + "Why RRF instead of weighted averages of raw scores? RRF works on rank, not score. Dense scores live in [0, 1], sparse BM25 scores don't, and RRF doesn't have to reconcile the two. Linear weights are fragile: a weight that helps one query class hurts another, and the right weight depends on query length, model, and corpus.\n", + "\n", + "What does sparse add? Queries with rare entity names, jargon, or specific model/paper names often produce dense embeddings near generic neighborhoods. The sparse path catches those exact-token matches on the title. RRF promotes documents both paths agree on.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "44b0f157", + "metadata": {}, + "outputs": [], + "source": [ + "def retrieve_hybrid(query, limit=10):\n", + " dense_query = models.Document(text=query, model=DENSE_MODEL)\n", + " sparse_query = models.Document(text=query, model=BM25_MODEL)\n", + " return client.query_points(\n", + " collection_name=\"arxiv_multi_repr\",\n", + " prefetch=[\n", + " models.Prefetch(query=dense_query, using=\"dense_chunk\", limit=50),\n", + " models.Prefetch(query=sparse_query, using=\"sparse_title\", limit=50),\n", + " ],\n", + " query=models.FusionQuery(fusion=models.Fusion.RRF),\n", + " limit=limit,\n", + " ).points\n", + "\n", + "show_results(retrieve_hybrid)\n" + ] + }, + { + "cell_type": "markdown", + "id": "4bdf38f7", + "metadata": {}, + "source": [ + "## Step 3: Add Title Prefetch\n", + "\n", + "Add a third prefetch: the same dense query vector, but searched against `dense_title` instead of `dense_chunk`. We're now fusing across three representations: chunk content, title (lexical), and title (semantic).\n", + "\n", + "The title prefetch saves queries where the topic is named explicitly but not echoed in any single chunk. 
For example: \"diffusion models for high-resolution image synthesis\" surfaces a paper titled \"High-Resolution Image Synthesis with Latent Diffusion Models\" via the title path even when its chunks phrase the contribution differently. The chunk prefetch alone misses it; the title path catches it; RRF promotes it because both paths agree.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b62d81a9", + "metadata": {}, + "outputs": [], + "source": [ + "def retrieve_three_repr(query, limit=10):\n", + " dense_query = models.Document(text=query, model=DENSE_MODEL)\n", + " sparse_query = models.Document(text=query, model=BM25_MODEL)\n", + " return client.query_points(\n", + " collection_name=\"arxiv_multi_repr\",\n", + " prefetch=[\n", + " models.Prefetch(query=dense_query, using=\"dense_chunk\", limit=50),\n", + " models.Prefetch(query=dense_query, using=\"dense_title\", limit=50),\n", + " models.Prefetch(query=sparse_query, using=\"sparse_title\", limit=50),\n", + " ],\n", + " query=models.FusionQuery(fusion=models.Fusion.RRF),\n", + " limit=limit,\n", + " ).points\n", + "\n", + "show_results(retrieve_three_repr)\n" + ] + }, + { + "cell_type": "markdown", + "id": "e59ce67e", + "metadata": {}, + "source": [ + "## Step 4: Add Abstract Prefetch\n", + "\n", + "Add a fourth prefetch on `dense_abstract`. The abstract gives a paper-level view that sits between the title (very short) and individual chunks (very local). It catches queries that match the paper's overall framing rather than a single passage or the title's topical naming.\n", + "\n", + "In a production setup where chunks are full paper bodies, the abstract is a meaningfully different representation. In this notebook's arXiv dataset (where chunks are 2-sentence slices of the abstract itself), the lift over Step 3 will be smaller because the abstract and the chunks share text. The prefetch is still worth wiring up; the pipeline shape is what generalizes to longer corpora.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e9c0dd1d", + "metadata": {}, + "outputs": [], + "source": [ + "def retrieve_four_repr(query, limit=10):\n", + " dense_query = models.Document(text=query, model=DENSE_MODEL)\n", + " sparse_query = models.Document(text=query, model=BM25_MODEL)\n", + " return client.query_points(\n", + " collection_name=\"arxiv_multi_repr\",\n", + " prefetch=[\n", + " models.Prefetch(query=dense_query, using=\"dense_chunk\", limit=50),\n", + " models.Prefetch(query=dense_query, using=\"dense_title\", limit=50),\n", + " models.Prefetch(query=dense_query, using=\"dense_abstract\", limit=50),\n", + " models.Prefetch(query=sparse_query, using=\"sparse_title\", limit=50),\n", + " ],\n", + " query=models.FusionQuery(fusion=models.Fusion.RRF),\n", + " limit=limit,\n", + " ).points\n", + "\n", + "show_results(retrieve_four_repr)\n" + ] + }, + { + "cell_type": "markdown", + "id": "1fed2f91", + "metadata": {}, + "source": [ + "## Step 5: Group by Document\n", + "\n", + "So far results are chunks, and the same paper can appear multiple times in the top 10. Most consumers want one entry per document with the top chunks attached: a results UI, a citation list, an LLM that needs document-level attribution.\n", + "\n", + "`query_points_groups` collapses chunks back to documents using `group_by=\"document_id\"`. 
Each group's `hits` field carries the top-`group_size` chunks for that paper.\n", + "\n", + "This step also wires in an optional `tags` parameter that filters candidates to specific arXiv categories before retrieval runs. Qdrant pre-filters on the payload index we added in the schema, so filtering happens before the fusion math, not after.\n", + "\n", + "A few things worth knowing:\n", + "\n", + "- Grouping is a *presentation* choice, not a relevance technique. The candidates and their fused scores don't change; only the result shape does.\n", + "- You may need to adjust the per-prefetch `limit` based on the number of chunks per document; grouping only sees what the prefetch returns.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1694ce42", + "metadata": {}, + "outputs": [], + "source": [ + "def retrieve_grouped(query, limit=10, group_size=3, tags=None):\n", + "    dense_query = models.Document(text=query, model=DENSE_MODEL)\n", + "    sparse_query = models.Document(text=query, model=BM25_MODEL)\n", + "    # Optional category filter. When tags is provided, Qdrant pre-filters candidates\n", + "    # to points whose 'tags' payload includes any of the given values.\n", + "    query_filter = (\n", + "        models.Filter(must=[models.FieldCondition(key=\"tags\", match=models.MatchAny(any=tags))])\n", + "        if tags else None\n", + "    )\n", + "    # query_points_groups applies the filter inside each prefetch (before fusion), then fuses with RRF and groups results by document_id.\n", + "    return client.query_points_groups(\n", + "        collection_name=\"arxiv_multi_repr\",\n", + "        prefetch=[\n", + "            models.Prefetch(query=dense_query, using=\"dense_chunk\", limit=100),\n", + "            models.Prefetch(query=dense_query, using=\"dense_title\", limit=100),\n", + "            models.Prefetch(query=dense_query, using=\"dense_abstract\", limit=100),\n", + "            models.Prefetch(query=sparse_query, using=\"sparse_title\", limit=100),\n", + "        ],\n", + "        query=models.FusionQuery(fusion=models.Fusion.RRF),\n", + "        query_filter=query_filter,\n", + "        group_by=\"document_id\",\n", + "        group_size=group_size,\n", + "        limit=limit,\n", + "    ).groups\n", + "\n", + "show_results(retrieve_grouped)\n" + ] + }, + { + "cell_type": "markdown", + "id": "83c7905e", + "metadata": {}, + "source": [ + "## Step 6: Score Boosting With a Formula\n", + "\n", + "When you have ranking preferences that aren't captured by similarity alone (recency, source authority, geographic proximity, structured boosts), swap RRF for a `FormulaQuery`. Formulas operate on the prefetch scores and payload fields:\n", + "\n", + "- `$score[i]` references the score from prefetch `i`. Prefetch order is load-bearing.\n", + "- The `defaults` map provides fallback values for candidates that didn't appear in every prefetch, so the formula still evaluates.\n", + "\n", + "The formula below sums the chunk score with weighted contributions from the title, abstract, and sparse prefetches. This is a linear combination of raw scores, which breaks down when prefetches use different scoring scales. RRF avoids this by discarding scores; DBSF normalizes per prefetch; a custom formula has to align distributions itself, typically with [decay functions](https://qdrant.tech/documentation/search/search-relevance/#decay-functions). The full FormulaQuery syntax lives in the [Score Boosting](https://qdrant.tech/documentation/search/search-relevance/#score-boosting) reference.\n", + "\n", + "For time-based decay on a `published_at` payload field, swap a term for an `exp_decay` expression.\n", + "\n", + "
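A sketch of what that swap could look like; `published_at` is a hypothetical payload field (this notebook's payload doesn't store one), and the parameters follow the decay-functions reference linked above:

```python
# Hypothetical recency term: decays from 1.0 at the target date toward
# `midpoint` once the point's published_at is `scale` seconds away.
recency_term = models.ExpDecayExpression(
    exp_decay=models.DecayParamsExpression(
        x=models.DatetimeKeyExpression(datetime_key="published_at"),
        target=models.DatetimeExpression(datetime="2021-12-31T00:00:00Z"),
        scale=86400 * 365,  # one year, in seconds
        midpoint=0.5,
    )
)
# Then use recency_term in place of (or alongside) one weighted term
# in the SumExpression below.
```
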
For RRF vs. DBSF guidance, see the [hybrid-search FAQ](https://qdrant.tech/documentation/faq/qdrant-fundamentals/#when-should-i-use-reciprocal-rank-fusion-rrf-vs-distribution-based-score-fusion-dbsf-for-hybrid-search).\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d25beee5", + "metadata": {}, + "outputs": [], + "source": [ + "def retrieve_boosted(query, limit=10, group_size=3):\n", + "    dense_query = models.Document(text=query, model=DENSE_MODEL)\n", + "    sparse_query = models.Document(text=query, model=BM25_MODEL)\n", + "    return client.query_points_groups(\n", + "        collection_name=\"arxiv_multi_repr\",\n", + "        prefetch=[\n", + "            # $score[0] = chunk, $score[1] = title, $score[2] = abstract, $score[3] = sparse\n", + "            models.Prefetch(query=dense_query, using=\"dense_chunk\", limit=100),\n", + "            models.Prefetch(query=dense_query, using=\"dense_title\", limit=100),\n", + "            models.Prefetch(query=dense_query, using=\"dense_abstract\", limit=100),\n", + "            models.Prefetch(query=sparse_query, using=\"sparse_title\", limit=100),\n", + "        ],\n", + "        query=models.FormulaQuery(\n", + "            formula=models.SumExpression(sum=[\n", + "                models.MultExpression(mult=[1.0, \"$score[0]\"]),\n", + "                models.MultExpression(mult=[0.5, \"$score[1]\"]),\n", + "                models.MultExpression(mult=[0.4, \"$score[2]\"]),\n", + "                models.MultExpression(mult=[0.3, \"$score[3]\"]),\n", + "            ]),\n", + "            # A candidate can surface from any prefetch, so every $score[i] needs a fallback.\n", + "            defaults={\"$score[0]\": 0.0, \"$score[1]\": 0.0, \"$score[2]\": 0.0, \"$score[3]\": 0.0},\n", + "        ),\n", + "        group_by=\"document_id\",\n", + "        group_size=group_size,\n", + "        limit=limit,\n", + "    ).groups\n", + "\n", + "show_results(retrieve_boosted)\n" + ] + }, + { + "cell_type": "markdown", + "id": "ca1e7741", + "metadata": {}, + "source": [ + "## Wrap-up\n", + "\n", + "That's the recommended multi-representation pipeline end to end. The same schema works for any corpus with title-like, abstract-like, and body-like representations.\n", + "\n", + "If you ran this notebook with the same `SAMPLE_QUERY` (\"diffusion models for image synthesis\") and the same 20,000-paper arXiv slice, here's roughly what each step's top 5 should produce:\n", + "\n", + "- **Step 1 (`dense_chunk` only):** chunk-level results with the same paper appearing in multiple slots. SegDiff, LDM, GLIDE in the top 5.\n", + "- **Step 2 (+ `sparse_title`):** title-exact matches surface. Vector Quantized Diffusion Model jumps in.\n", + "- **Step 3 (+ `dense_title`):** LDM dominates with three of its own chunks. Semantic title match takes over.\n", + "- **Step 4 (+ `dense_abstract`):** modest shift. GLIDE returns thanks to abstract-level signal. Adding a prefetch isn't always dramatic.\n", + "- **Step 5 (grouping):** one entry per paper. The collapsed LDM chunks free up slots for Palette, Global Context, and Implicit Image Segmentation.\n", + "- **Step 6 (formula):** custom weighting reorders results. 
Vector Quantized Diffusion climbs back; ImageBART and Manifold-aware Synthesis enter as the formula amplifies raw scores differently from RRF's rank-based fusion.\n", + "\n", + "Swap the dataset, retune which representations earn their prefetch slots for your data, and wire in formula-based ranking preferences as needed.\n", + "\n", + "For the design rationale and references, see the [tutorial](https://qdrant.tech/documentation/tutorials-search-engineering/multi-representation-search/).\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.13" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}