This repository builds the FunML course website from a lecture-notes zip
(produced from Overleaf) plus a small set of tracked website source files.
Everything else — the rendered lectures, the search index, the per-lecture
exercise pages, etc. — is generated by buildsite.py.
Tracked website source files (must exist in the repo root before building):
.gitignore
README.md
buildsite.py
index.html ← landing-page template (top nav, sidebar, viewer)
styles.css ← landing-page styles
script.js ← landing-page JS (sidebar, search, demo back-button)
assets/media_resources.json ← slides + video URLs per lecture
assets/slides/ ← local slide PDFs (named LectureN.pdf)
assets/notebooks/ ← interactive demo HTML/ipynb files + embed_map.json
assets/demos.html ← global demos page (filterable per-lecture)
assets/disclaimer.html
Local-only build inputs (gitignored):
source/FunML_Sp_26_LectNotes.zip ← the Overleaf zip
source/raw/ ← extracted tex sources
pip install beautifulsoup4
pip install "jupyterlite-core[lab]" jupyterlite-pyodide-kernel
# Also install: python3, pandoc
mkdir -p source/raw
unzip -q source/FunML_Sp_26_LectNotes.zip -d source/raw
find source/raw -type f \( -iname "*in-class exercise*.tex" -o -iname "*in class exercise*.tex" \) -delete
python3 buildsite.py --src source/raw --out .
python3 -m http.server 8000 # → http://localhost:8000
git add -A
git commit -m "Update FunML website"
git push origin main

| Path | Purpose | Edit by hand? |
|---|---|---|
| index.html | Landing-page template. The build modifies the .lecture-list and .sidebar-caption in place; everything else (top nav, title, logo, search bar) is hand-edited. | Yes |
| styles.css | Styles for the landing page only. | Yes |
| script.js | Landing-page JS: sidebar clicks, video dropdown, demos back-button, full-text search. | Yes |
| buildsite.py | The build script. Houses constants like EXCLUDED_TEX_PATH_PARTIALS and MEDIA_KEY_TITLE_OVERRIDES, plus all rendering logic. | Yes |
| assets/media_resources.json | Maps each lecture key (e.g. Lecture3) → slide URL and YouTube recording URLs. | Yes |
| assets/notebooks/embed_map.json | Rules that tell the build where to embed interactive notebook iframes inside compiled lecture pages. | Yes |
| assets/notebooks/*.html, *.ipynb | The actual interactive demo files referenced by embed_map.json. | Yes (drop in new ones) |
| assets/slides/LectureN.pdf | Local slide decks. Looked up by lecture key. | Yes |
| assets/demos.html | The global demos page, with per-lecture filtering via ?lecture=N. | Yes |
| source/raw/FunML_LN_*/L{N}_webpage_*.tex | Lecture sources extracted from the zip. | Edit only if you intend to override the upstream; they get overwritten on the next zip extract. |
| lectures/, assets/exercises/, assets/style.css, assets/search-index.json, lectures/img/ | Generated. Do not hand-edit; they get clobbered by buildsite.py. | No |
- Drop the new zip at source/FunML_Sp_26_LectNotes.zip (overwrite the old one).
- Re-extract, protecting any locally modified tex files first if you want to keep them:

      # Optional: back up local edits before extract clobbers them
      # cp source/raw/FunML_L9_Regularization/L9_webpage_Regularization.tex /tmp/keep.tex
      cd source/raw && unzip -o ../FunML_Sp_26_LectNotes.zip \
        "FunML_L*" "img/*" "AIFirst_FunML_notations.tex" \
        -x "Lecture*" "LectureXX*" "Supp*"
      cd ../..  # back to the repo root, so the paths below resolve
      # Restore protected files if needed
      # cp /tmp/keep.tex source/raw/FunML_L9_Regularization/L9_webpage_Regularization.tex

  The -x "Lecture*" exclusions skip duplicate Lecture19/, Lecture20/, etc. directories in the zip that conflict with the canonical FunML_L*/ versions.
- Rebuild:

      python3 buildsite.py --src source/raw --out .

- Inspect lectures/, the rendered titles in index.html (sidebar), and the displayed lecture numbering. Filenames will reshuffle if any titles changed.
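
The include and exclude patterns passed to unzip in the re-extract step can be previewed before anything in source/raw/ is overwritten. The following is a minimal sketch, assuming that fnmatch-style matching is a close enough stand-in for unzip's wildcards (in both, * also matches /). select_members is a hypothetical helper and is not part of buildsite.py.

```python
from fnmatch import fnmatchcase

# Same patterns as the unzip command in the re-extract step.
INCLUDE = ["FunML_L*", "img/*", "AIFirst_FunML_notations.tex"]
EXCLUDE = ["Lecture*", "LectureXX*", "Supp*"]


def select_members(names, include=INCLUDE, exclude=EXCLUDE):
    """Return the archive member names that would be extracted."""
    selected = []
    for name in names:
        wanted = any(fnmatchcase(name, pat) for pat in include)
        skipped = any(fnmatchcase(name, pat) for pat in exclude)
        if wanted and not skipped:
            selected.append(name)
    return selected
```

Feeding it zipfile.ZipFile("source/FunML_Sp_26_LectNotes.zip").namelist() lists the files an extract would touch, which makes it easier to decide what to back up first.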
Heads-up: Files in source/raw/ are gitignored. The site only ships the built
HTML in lectures/. So you must re-extract the zip on any clean clone before building.
No code changes required — the build is convention-driven.
- Create a new directory source/raw/FunML_L<N>_<topic>/.
- Drop in a L<N>_webpage_<topic>.tex file with a \lecture{<N>}{<Title>}{...}{...} line. (The build also accepts non-webpage tex files as a fallback.)
- Rebuild. The lecture is auto-assigned a sequential display number based on its sort position (see How the Build Works).
Display order within the same source L<N> is alphabetical by directory name.
Example for splitting old "L9: Regularization & Performance Metrics" into two:
source/raw/FunML_L9_Regularization/ ← keeps the original L-number
source/raw/FunML_L9_ext_PerformanceMetrics/ ← extension; sorts before the main directory
Note that 'e' < 'r', so ext_PerformanceMetrics comes first alphabetically, ahead of Regularization.
If you need a specific order, name the directories so they sort that way (e.g.,
prefix the extension with z to push it last, or use a_ and b_ prefixes).
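
The tiebreak is easy to check with a few directory names. This is a minimal sketch of the ordering described here; the function body is an assumption based on the documented key (int(N), dir_name.lower()), and FunML_L10_Trees and the z_ directory are made-up names used only for illustration.

```python
import re

DIR_RE = re.compile(r"^FunML_L(\d+)_")


def lecture_dir_sort_key(dir_name):
    """Sort by numeric L-number first, then by lowercased directory name."""
    match = DIR_RE.match(dir_name)
    if match is None:
        raise ValueError(f"not a lecture directory: {dir_name!r}")
    return (int(match.group(1)), dir_name.lower())


dirs = [
    "FunML_L10_Trees",                  # hypothetical directory
    "FunML_L9_Regularization",
    "FunML_L9_ext_PerformanceMetrics",
    "FunML_L9_z_PerformanceMetrics",    # hypothetical rename with a z prefix
]
ordered = sorted(dirs, key=lecture_dir_sort_key)
# ordered[0] is the ext_ directory, the z_ directory sorts after Regularization,
# and L10 comes last because the L-number is compared as an integer.
```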
Edit EXCLUDED_TEX_PATH_PARTIALS in buildsite.py:
EXCLUDED_TEX_PATH_PARTIALS = {
    "lect1213_extra",
    "funml_l12_l13_ext",
    "funml_l12_gmmclustering", # ← add a substring (lowercased) of the directory you want skipped
}

Any source file whose lowercased path contains one of these substrings is skipped.
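
A minimal sketch of that check, assuming a plain substring test against the lowercased path; is_excluded is a hypothetical name, and the real helper in buildsite.py may be structured differently.

```python
EXCLUDED_TEX_PATH_PARTIALS = {
    "lect1213_extra",
    "funml_l12_l13_ext",
    "funml_l12_gmmclustering",
}


def is_excluded(path, partials=EXCLUDED_TEX_PATH_PARTIALS):
    """True if the lowercased path contains any excluded substring."""
    lowered = str(path).lower()
    return any(partial in lowered for partial in partials)
```

Because only the path is lowercased, an entry written with capital letters would never match under this sketch, which is why new entries should be added in lowercase.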
index.html is a template — buildsite.py only swaps in the lecture-list and the
sidebar caption. Everything else is yours to edit directly:
| What you can edit | Where in index.html |
|---|---|
| Page title (browser tab) | <title>...</title> line ~7 |
| Brand block (logo, course title, subtitle) | <header class="topbar"><div class="brand">…</div> ~14–21 |
| Top nav buttons & dropdowns (Slides, Lecture Notes, Videos, Exercises, Handouts, Demos, Disclaimer) | <nav class="nav">…</nav> ~22–58. Dropdown menus are populated by script.js from assets/media_resources.json. |
| Global search bar | <div class="global-search">…</div> — visual style in styles.css, behavior in script.js (runSearch). |
| Sidebar structure (e.g., re-add module grouping) | Currently the build emits a flat list of <a class="lecture-item">…</a> items. To group by module, modify sync_portal_index in buildsite.py. There's a previous implementation in the git history (git log --grep=module) you can cherry-pick. |
| Initial lecture loaded in iframe | <iframe id="lecture-frame" src="lectures/Lecture1_FunML-…"> |
- Landing page styles: edit styles.css directly; it's served as-is.
- Per-lecture styles (the iframed lecture HTML): the small CSS bundle baked into each lecture page is the CSS = """...""" constant near the top of buildsite.py. Edit and rebuild.
Demos are wired by rules in assets/notebooks/embed_map.json.
Each rule embeds a notebook after the matched section heading or in place of the matched figure.
Two ways to anchor a demo to a lecture section:
// (b) Anchor by section id: insert the demo right after the matching <h1 id="..."> heading.
{
"lecture_match": "Lecture31_Large-Language-Models", // substring match against the lecture filename
"section_id": "parameter-efficient-fine-tuning-with-lora",
"notebook_html": "Lecture28_LoRA_Interactive_interactive.html",
"section_title": "Interactive Demo — LoRA",
"description": "...",
"iframe_height": 760
}

Two content sources for the iframe:
| Field | Use when |
|---|---|
| notebook_html | The demo is a local file in assets/notebooks/. Iframes always work. Optional notebook_ipynb adds a "View notebook" link to JupyterLite. |
| external_url | The demo lives on another site. If the site allows iframing (e.g. Cloudflare R2), the iframe loads inline. |
| external_url + "link_only": true | The site refuses framing (e.g., Substack sends frame-ancestors 'self'). Renders a clean card with a title, a description, and an "Open in new tab" button, so there is no broken iframe. |
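
The three rows above amount to a small decision on each rule. The sketch below is illustrative only: render_mode and its return labels are hypothetical, and giving notebook_html priority when both fields are present is an assumption, not documented behavior.

```python
def render_mode(rule):
    """Classify an embed rule by its content source.

    Returns "local_iframe", "external_iframe", or "link_card".
    """
    if rule.get("notebook_html"):
        # Local file in assets/notebooks/: always safe to iframe.
        return "local_iframe"
    if rule.get("external_url"):
        # link_only avoids a blank iframe when the remote site refuses framing.
        return "link_card" if rule.get("link_only") else "external_iframe"
    raise ValueError("rule needs notebook_html or external_url")
```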
To find a section's id for option (b):
python3 -c "
from bs4 import BeautifulSoup
soup = BeautifulSoup(open('lectures/LectureN_<title>.html').read(), 'html.parser')
for h in soup.find_all(['h1','h2','h3']):
    if h.get('id'):
        n = h.find('span', class_='header-section-number')
        print(f' id=\"{h.get(\"id\")}\" — §{n.get_text() if n else \"\"} {h.get_text()[:60]}')
"

assets/media_resources.json is keyed by lecture key (the source L<N> prefix-style key,
e.g. "Lecture3" for Naive Bayes). Each entry can have:
{
"Lecture3": {
"slide": "https://docs.google.com/presentation/d/...", // remote slide URL (optional)
"slide_local": "assets/slides/Lecture3.pdf", // local PDF (optional, takes precedence)
"recordings": [
{ "label": "Spring 2026 — full recording", "url": "https://www.youtube.com/watch?v=..." }
]
}
}

If a lecture key isn't present, the build falls back to assets/slides/<LectureKey>.pdf if
that file exists, otherwise the Slides button shows "No slides posted".
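
The lookup order can be sketched as follows. resolve_slides is a hypothetical helper, and treating an entry that has neither slide field the same as a missing key is an assumption; only the precedence of slide_local and the PDF fallback come from the description above.

```python
from pathlib import Path


def resolve_slides(lecture_key, media, repo_root="."):
    """Return a slide path or URL, or None when nothing is posted."""
    entry = media.get(lecture_key)
    if entry:
        # slide_local takes precedence over the remote slide URL.
        if entry.get("slide_local"):
            return entry["slide_local"]
        if entry.get("slide"):
            return entry["slide"]
    # No usable entry: fall back to a PDF named after the lecture key.
    fallback = Path("assets/slides") / f"{lecture_key}.pdf"
    if (Path(repo_root) / fallback).is_file():
        return fallback.as_posix()
    return None  # caller shows "No slides posted"
```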
source/raw/FunML_LN_*/           ──┐
assets/notebooks/                  ├──► buildsite.py ──► lectures/, assets/exercises/,
assets/media_resources.json        │                     assets/style.css,
assets/notebooks/embed_map.json    │                     assets/search-index.json,
index.html (template)              │                     index.html (sidebar updated in place),
                                   │                     assets/jupyterlite/
Lecture discovery and ordering (lecture_dir_sort_key in buildsite.py):
- Scan source/raw/ for FunML_L<N>_*/ directories.
- Skip anything matching EXCLUDED_TEX_PATH_PARTIALS.
- Sort by (int(N), dir_name.lower()): within the same N, the order is alphabetical.
- For each directory, pick_preferred_lecture_tex picks the *_webpage_*.tex file (or the first non-trivial .tex if none has "webpage" in its name).
- Pandoc compiles tex → HTML.
- The \lecture{N}{Title}{...}{...} macro provides the title; the sequential position in the sorted list is the display index used in filenames (LectureK_<slug>.html) and the sidebar number.
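
The last step, turning titles and sort positions into filenames, can be sketched like this. The regex, the slug rule, and all three function names are assumptions made for illustration; the hyphenated slug is inferred from filenames such as Lecture31_Large-Language-Models, and titles containing nested braces are not handled.

```python
import re

LECTURE_RE = re.compile(r"\\lecture\{(\d+)\}\{([^{}]*)\}")


def parse_lecture_macro(tex):
    """Return (source_number, title) from the first \\lecture{N}{Title} macro."""
    match = LECTURE_RE.search(tex)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()


def slugify(title):
    """Hypothetical slug rule: runs of non-alphanumerics become one hyphen."""
    return re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-")


def output_filenames(titles_in_sorted_order):
    """Number lectures by sorted position (1-based), not by source L<N>."""
    return [
        f"Lecture{k}_{slugify(title)}.html"
        for k, title in enumerate(titles_in_sorted_order, start=1)
    ]
```

Since the display index comes from position rather than from the source number, adding or excluding one directory shifts the filenames of every lecture after it.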
Notebook embeds (inject_interactive_notebooks):
- Pass 1: section-id rules → insert a demo block after each matching <h1 id="X">.
- Pass 2: image-match rules → replace each matching <img> (or its enclosing <figure>) with a demo block.
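
The two passes can be illustrated on a plain HTML string. This is a rough sketch using regular expressions so that it needs only the standard library; it is not how inject_interactive_notebooks is known to work (an HTML parser such as BeautifulSoup is the more robust choice), and both function names are hypothetical.

```python
import re


def insert_after_heading(html, section_id, block):
    """Pass 1: insert block right after the <h1> whose id equals section_id."""
    pattern = re.compile(
        r'(<h1\b[^>]*\bid="%s"[^>]*>.*?</h1>)' % re.escape(section_id),
        re.DOTALL,
    )
    return pattern.sub(lambda m: m.group(1) + block, html, count=1)


def replace_matching_images(html, image_match, block):
    """Pass 2: replace each <img> whose src contains image_match
    (together with its enclosing <figure>, if any) with block."""
    src = r'<img\b[^>]*\bsrc="[^"]*%s[^"]*"[^>]*>' % re.escape(image_match)
    # Prefer the whole <figure> when the image sits inside one.
    figure = r"<figure\b[^>]*>(?:(?!</figure>).)*?%s.*?</figure>" % src
    pattern = re.compile("%s|%s" % (figure, src), re.DOTALL)
    return pattern.sub(lambda m: block, html)
```

Using a function as the replacement keeps backslashes in the demo block from being interpreted as regex escapes.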
Search index (build_search_index):
- After all lectures are built, walk each lecture HTML body.
- Group every <p> and <li> under its preceding <h1>–<h4> (with id).
- Truncate each section's text to 1500 chars.
- Write assets/search-index.json (~1 MB, covering 32 lectures and 1094 sections in total).
- The landing-page search input (#global-search) loads this JSON once and does client-side substring matching, with results sorted by lecture number.
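
The grouping and truncation steps can be sketched with the standard-library HTML parser. This is an assumption-laden illustration, not the code in build_search_index: the class name, the function name, and the entry fields (id, heading, text) are all hypothetical.

```python
from html.parser import HTMLParser

MAX_SECTION_CHARS = 1500
HEADING_TAGS = {"h1", "h2", "h3", "h4"}
TEXT_TAGS = {"p", "li"}


class SectionCollector(HTMLParser):
    """Group <p>/<li> text under the most recent heading that has an id."""

    def __init__(self):
        super().__init__()
        self.sections = []
        self._in_heading = False
        self._text_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            heading_id = dict(attrs).get("id")
            if heading_id:
                self.sections.append({"id": heading_id, "heading": "", "text": ""})
                self._in_heading = True
        elif tag in TEXT_TAGS:
            self._text_depth += 1

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False
        elif tag in TEXT_TAGS and self._text_depth:
            self._text_depth -= 1
            if self.sections:
                self.sections[-1]["text"] += " "  # keep paragraphs apart

    def handle_data(self, data):
        if not self.sections:
            return  # text before the first id-bearing heading is ignored
        if self._in_heading:
            self.sections[-1]["heading"] += data
        elif self._text_depth:
            self.sections[-1]["text"] += data


def index_sections(html):
    """Return one entry per id-bearing heading, text capped at 1500 chars."""
    collector = SectionCollector()
    collector.feed(html)
    collector.close()
    for section in collector.sections:
        section["heading"] = " ".join(section["heading"].split())
        section["text"] = " ".join(section["text"].split())[:MAX_SECTION_CHARS]
    return collector.sections
```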
Sidebar (sync_portal_index):
The build empties <div class="lecture-list"> in index.html and re-fills it
with one <a class="lecture-item"> per lecture. The sidebar caption ("X sessions")
is also rewritten. Everything else in index.html is preserved.
| Problem | Likely cause | Fix |
|---|---|---|
| Two lectures with the same display title and Lecture(K)_X.html filename | The same \lecture{N}{Title} is used in two source directories. | Rename one in the tex source, or make the title unique. |
| "No notes" / blank lecture page | Source dir has no webpage tex AND no fallback tex matching the L-number. | Add a L<N>_webpage_<topic>.tex or rename the existing tex to include webpage in its filename. |
| Lecture appears in unexpected order | Two directories share the same L<N>; alphabetical tiebreak applies. | Rename one of them (e.g., add a leading a_ or z_) to control the order. |
| Lecture appears that you don't want | Not in EXCLUDED_TEX_PATH_PARTIALS. | Add a substring of its directory name (lowercased) to the set in buildsite.py. |
| Demo doesn't show up | (a) notebook_html typo or file missing in assets/notebooks/, (b) image_match doesn't substring-match any <img src> in the HTML, (c) section_id doesn't match any <h1 id="...">. | Inspect the rendered HTML; verify the rule fields. |
| Demo iframe is blank but link button works | The external site sends X-Frame-Options or restrictive Content-Security-Policy: frame-ancestors. | Add "link_only": true to the rule so we render a card with just the link. |
| Search bar finds nothing | assets/search-index.json missing or stale. | Re-run buildsite.py. |
| Search results in unexpected order | Sort is lecture-number ascending, then heading-hits over body-hits within a lecture. | Edit runSearch in script.js if you want different ordering. |
| Sidebar lecture-list looks wrong (old lectures) | index.html was edited but the build wasn't re-run. | Re-run buildsite.py; the sidebar is generated, not authored. |
| pandoc errors about missing .bib | Lecture's \cite{...} calls have no references.bib next to the tex. | Add a references.bib to the source dir, or remove the citations. |
- Don't commit source/raw/ or the zip. They're gitignored on purpose; the source of truth for content is upstream Overleaf.
- Don't hand-edit anything in lectures/. It gets overwritten on every build. If you need to fix a lecture, fix the tex source (in Overleaf, then re-zip) or, for site-only fixes, add the change to a post-build hook in buildsite.py so it survives subsequent builds.
- MODULE_LAYOUT for sidebar grouping is not currently active in buildsite.py, but a working version exists in the git history (git log --all --grep="module"). Cherry-pick if you want it back.
- Back up before destructive operations: source/raw/ is local-only, so a wrong unzip -o can clobber edits. Copy any locally modified tex files to /tmp/ before running zip extraction.