Scientific Memory

Buildable, machine-checkable scientific knowledge.

Quick start · Documentation · Repository status · Contributing

Overview

Scientific Memory turns mathematically structured science into machine-checkable, executable, composable artifacts with full provenance, organized as a knowledge-upgrading pipeline that supports durable scientific inheritance through traceable claims, formal layers, and executable witnesses.

You get	How
Traceable claims	Every claim anchored with `source_span` and schema-valid JSON
Formal layer	Lean 4 + mathlib, linked from the corpus via mapping and theorem cards
Executable witnesses	Kernels with explicit verification boundaries
Inspectable output	Portal renders exclusively from canonical manifests and exports
Reproducible gates	Unified validation, CI, benchmarks, and signed releases

Mission

The project optimizes for:

Pillar	Meaning
Explicit claims & assumptions	Every link between source text and formal code stays explicit in the corpus
Formal declarations	Machine-checked where the project commits to it
Executable kernels	Where numerical or computational alignment matters
Versioned provenance	Artifacts you can audit and rebuild
Reproducible builds	Lean, Python, and portal all part of one bar

What lives here

Area	Role
`corpus/`	Schema-first papers with metadata, claims, assumptions, symbols, and manifests
`formal/`	Lean 4 library (`ScientificMemory`) linked to the corpus
`schemas/`	Canonical JSON Schema for all public artifacts
`pipeline/`	`sm_pipeline` ingest, extract, validate (gate engine), publish, and portal export
`kernels/`	Executable kernels + shared `kernels/conformance/` test helpers
`portal/`	Next.js UI from `corpus-export.json` and corpus data
`benchmarks/`	Regression tasks, gold labels, thresholds, proof-success trends
PCS integration	Import proof-carrying releases, portal evidence UI, pcs-bench producer (docs/pcs/)

Quick start

git clone https://github.com/fraware/scientific-memory.git
cd scientific-memory

just bootstrap    # toolchains and dependencies
just build        # Lean + portal + Python tests
just validate     # full corpus / schema / graph gates
just portal       # local dev server (see terminal for URL)

Situation	Command
Something failed early	`just doctor` (uv, pnpm, Lean, Lake)
Lean only	`just lake-build` or `just lake-build-verbose LOG=lake-build.log`
Full pre-PR sweep	`just check`
Without `just` (or Git Bash missing)	Contributor playbook – Local CI

Canonical Local Workflow

Use this as the single local path before opening a PR:

just bootstrap
just check
just benchmark

If you cannot use just, run the equivalent uv/lake/pnpm commands from Contributor playbook – Local CI. On Windows, just uses Git for Windows Bash (see that section). That playbook section is also the canonical non-just path.

Repository status

Current tree (corpus, pipeline, CI, metrics) — click to expand

Corpus — Eight indexed papers in corpus/index.json (six formalized slices plus two hard-dimension stress scaffolds). Live counts and manifest hashes live in docs/status/repo-snapshot.md (just repo-snapshot).
Pipeline — Ingest, validate, publish, and portal export via gate_engine. Trust boundary details appear in docs/reference/trust-boundary-and-extraction.md.
PCS — Proof-carrying release import, portal claim pages, and pcs-bench producer (docs/pcs/README.md).
CI and releases — docs/infra/README.md and docs/maintainers.md; verify tagged releases with scripts/verify_release_checksums.sh.
Metrics and benchmarks — just metrics and just benchmark (docs/metrics.md, benchmarks/README.md).
Optional tooling — LLM proposals (suggest-only), MCP, Pandoc/LaTeX, Verso (docs/tooling/README.md); role playbooks (docs/playbooks/README.md).

Artifact flow

flowchart TD
  subgraph intake [Intake]
    Add[Add paper]
    Extract[Extract claims and context]
  end

  subgraph optional [Optional LLM assistance]
    LLM[LLM proposals]
    Review[Human review]
    Apply[Apply after review]
  end

  subgraph canonical [Canonical work]
    Norm[Normalize and link]
    Map[Map to Lean]
    Formal[Formalize in Lean]
  end

  subgraph validation [Validation publish]
    Gates[Gate engine validate all]
    Publish[Publish manifests and theorem cards]
    Export[Portal export]
  end

  subgraph outputs [Outputs]
    Portal[Portal pages]
    Bench[Benchmarks and regression]
    Manifests[Published artifacts]
  end

  Add --> Extract
  Extract --> Norm
  Norm --> Map
  Map --> Formal
  Norm --> Gates
  Formal --> Gates
  Gates --> Publish
  Publish --> Manifests
  Publish --> Export
  Export --> Portal
  Publish --> Bench
  Formal --> Bench

  Extract -.-> LLM
  LLM --> Review
  Review --> Apply
  Apply --> Norm
  Apply --> Map
  Apply --> Formal
  Apply --> Gates

Documentation

Topic	Link
Index	docs/README.md
Proof-Carrying Science (PCS)	docs/pcs/README.md
Role playbooks (formalizer, reviewer, domain expander, release manager)	docs/playbooks/README.md
Contributor playbook (setup, paper workflow, local CI, reuse, review, releases)	docs/contributor-playbook.md
Architecture	docs/architecture.md
Roadmap	ROADMAP.md
Paper intake (SPEC 8.1)	docs/paper-intake.md
Metrics (SPEC 12)	docs/metrics.md
ADRs	docs/adr/README.md
Infra / CI policy	docs/infra/README.md
Repo snapshot	docs/status/repo-snapshot.md (`just repo-snapshot`)
Maintainers (public push, CI, triage, launch)	docs/maintainers.md
MCP tooling (optional)	docs/tooling/mcp-lean-tooling.md
Prime Intellect LLM (optional, suggest-only)	docs/tooling/prime-intellect-llm.md
Trust boundary and manual E2E scenarios	docs/reference/trust-boundary-and-extraction.md · docs/testing/trust-hardening-e2e-scenarios.md · LLM Lean live test matrix
Pandoc / LaTeX (optional)	docs/tooling/pandoc-latex-integration.md

Contributing

Resource	Link
How to contribute	CONTRIBUTING.md
Step-by-step playbook	docs/contributor-playbook.md
Pipeline extension points	docs/pipeline-extension-points.md

Design principles

Artifact-first, model-second — durable JSON and Lean artifacts anchor every contribution.
Provenance is mandatory — claims and cards stay tied to sources.
Verification boundaries are explicit — proof, witness, and heuristic roles remain visible in manifests and portal views.
Claim bundles are the core unit — each formal result ships with claims, assumptions, and linked context.
Full buildability is the minimum bar — merges require the agreed validation and benchmark gates to pass.

License

Licensed under Apache-2.0 — see LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 52 Commits
.devcontainer		.devcontainer
.github		.github
benchmarks		benchmarks
blueprint		blueprint
corpus		corpus
docs		docs
examples		examples
formal		formal
kernels		kernels
pipeline		pipeline
portal		portal
release-run		release-run
schemas		schemas
scripts		scripts
tests		tests
.env.example		.env.example
.gitattributes		.gitattributes
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CITATION.cff		CITATION.cff
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
JUSTFILE		JUSTFILE
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
ROADMAP.md		ROADMAP.md
SECURITY.md		SECURITY.md
lake-manifest.json		lake-manifest.json
lakefile.toml		lakefile.toml
lean-toolchain		lean-toolchain
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
pnpm-workspace.yaml		pnpm-workspace.yaml
pyproject.toml		pyproject.toml
turbo.json		turbo.json
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Scientific Memory

Overview

Mission

What lives here

Quick start

Canonical Local Workflow

Repository status

Artifact flow

Documentation

Contributing

Design principles

License

About

Uh oh!

Releases 2

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Scientific Memory

Overview

Mission

What lives here

Quick start

Canonical Local Workflow

Repository status

Artifact flow

Documentation

Contributing

Design principles

License

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 2

Contributors

Uh oh!

Languages