"Why was this built this way?" Every developer asks this. GraphEngine answers it.
Point GraphEngine at any GitHub repository. It reads through commits, pull requests, and discussions — and builds a queryable knowledge graph of every engineering decision ever made.
No cloning. Just answers.
Most codebases are graveyards of undocumented decisions. Why was this architecture chosen? Why did they switch libraries? Why does this weird workaround exist?
GraphEngine reconstructs the reasoning behind your codebase by mining the conversations that happened around it — and lets you query that reasoning in plain English.

    # Analyze any public GitHub repo
    npm run dev ingest facebook react
    npm run dev normalize facebook react
    npm run dev extract facebook react

    # Ask questions about specific components
    npm run dev why facebook react fiber
    npm run dev why facebook react classComponent
    npm run dev why facebook react concurrentMode

| Query | Answer |
|---|---|
| why was fiber introduced? | Extracted from 3 PRs, 12 commits, with citations |
| what changed in v18? | Timeline of decisions leading to React 18 |
| why is reconciliation designed this way? | Grounded answer from actual PR discussions |
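
To make "grounded answer with citations" concrete, here is a minimal sketch of a why-lookup that refuses to answer when no cited decision exists. The function name `answerWhy`, the decision fields (`component`, `rationale`, `sources`), and the exact-match lookup are assumptions for illustration; GraphEngine's real query engine is not shown in this README and likely works differently.

```javascript
// Hypothetical sketch: answer a "why" query only from decisions that carry sources.
// The decision shape { component, rationale, sources } is an assumption.
function answerWhy(decisions, component) {
  const needle = component.toLowerCase();

  // Keep only decisions about this component that cite at least one source.
  const grounded = decisions.filter(
    (d) => d.component.toLowerCase() === needle && d.sources.length > 0
  );

  // No grounded decision: return an explicit "not found" instead of guessing.
  if (grounded.length === 0) {
    return { component, found: false, reasons: [], citations: [] };
  }

  return {
    component,
    found: true,
    reasons: grounded.map((d) => d.rationale),
    // De-duplicate citations while preserving first-seen order.
    citations: [...new Set(grounded.flatMap((d) => d.sources))],
  };
}
```

Returning `found: false` rather than a best guess reflects the idea that an uncited explanation is worse than none.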

    GitHub Repo URL
          ↓
    Pull Requests + Commits + Issues (via GitHub API, no cloning needed)
          ↓
    Normalization → Decision Extraction (LLM-powered)
          ↓
    Knowledge Graph (relationships, timelines, decay modeling)
          ↓
    Why-Query Engine → Natural language answers with citations

No raw source code is ever downloaded. GraphEngine works entirely over the GitHub API using just a token.
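
As a rough illustration of token-only access, the sketch below lists a repository's pull requests through the GitHub REST API. The helper names `buildPullsRequest` and `fetchPullRequests` are hypothetical and not taken from GraphEngine's code; the endpoint, query parameters, and headers are the publicly documented GitHub ones. It assumes Node.js 18+ for the global `fetch`.

```javascript
// Hypothetical helpers, not GraphEngine's actual ingestion code.

// Build the URL and headers for one page of pull requests (open and closed).
function buildPullsRequest(owner, repo, token, page = 1) {
  const url = new URL(`https://api.github.com/repos/${owner}/${repo}/pulls`);
  url.searchParams.set("state", "all");
  url.searchParams.set("per_page", "100");
  url.searchParams.set("page", String(page));
  return {
    url: url.toString(),
    headers: {
      Accept: "application/vnd.github+json",
      Authorization: `Bearer ${token}`,
      "X-GitHub-Api-Version": "2022-11-28",
    },
  };
}

// Fetch one page and keep only the fields a decision extractor might need.
async function fetchPullRequests(owner, repo, token, page = 1) {
  const { url, headers } = buildPullsRequest(owner, repo, token, page);
  const response = await fetch(url, { headers });
  if (!response.ok) {
    throw new Error(`GitHub API request failed with status ${response.status}`);
  }
  const pulls = await response.json();
  return pulls.map((pr) => ({
    number: pr.number,
    title: pr.title,
    body: pr.body ?? "",
    url: pr.html_url,
  }));
}

// Usage (performs a network call, so it is left commented out):
// fetchPullRequests("facebook", "react", process.env.GITHUB_TOKEN).then(console.log);
```

Only metadata and discussion text cross the wire here; no repository contents are downloaded.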
| Layer | What it does |
|---|---|
| Ingestion | Captures GitHub events as immutable records |
| Normalization | Unified event format across PRs, commits, issues |
| Decision Extraction | LLM identifies decisions from raw discussions |
| Knowledge Graph | Maps relationships between decisions and outcomes |
| Decision Timeline | Tracks how decisions evolved over time |
| Knowledge Freshness | Decay + reinforcement modeling for stale decisions |
| Why-Query Engine | Natural language queries over the decision graph |
| Explainability | Every answer is grounded with source citations |
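
The normalization layer can be pictured as a function that maps each GitHub payload onto one shared shape. The sketch below is an assumption about what such a unified event might look like: the output field names (`kind`, `id`, `author`, `title`, `text`, `timestamp`, `url`) and the function name `normalizeEvent` are hypothetical, while the input fields (`number`, `user.login`, `commit.message`, `sha`, `html_url`, and so on) follow the GitHub REST API payloads.

```javascript
// Hypothetical unified event format; GraphEngine's real schema is not documented here.
function normalizeEvent(kind, raw) {
  switch (kind) {
    case "pull_request":
    case "issue":
      // PRs and issues share the same basic fields in the GitHub API.
      return {
        kind,
        id: `${kind}#${raw.number}`,
        author: raw.user?.login ?? null,
        title: raw.title,
        text: raw.body ?? "",
        timestamp: raw.created_at,
        url: raw.html_url,
      };
    case "commit": {
      // Treat the first line of the commit message as the title, the rest as body text.
      const [title, ...rest] = raw.commit.message.split("\n");
      return {
        kind,
        id: `commit#${raw.sha}`,
        // `author` is null when the commit email maps to no GitHub account.
        author: raw.author?.login ?? raw.commit.author?.name ?? null,
        title,
        text: rest.join("\n").trim(),
        timestamp: raw.commit.author?.date ?? null,
        url: raw.html_url,
      };
    }
    default:
      throw new Error(`Unknown event kind: ${kind}`);
  }
}
```

With one shape for all three sources, later layers such as decision extraction can process events without caring where they came from.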
Web interface (no install needed): 👉 https://why-graph-node.onrender.com
CLI:

    # Clone and install
    git clone https://github.com/rivv0/WhyGraph.git
    cd WhyGraph
    npm install

    # Add your GitHub token and LLM provider key (Groq or OpenAI)
    echo "GITHUB_TOKEN=your_token_here" > .env
    echo "GROQ_API_KEY=your_key_here" >> .env

    # Run
    npm run dev                             # Show all CLI commands
    npm run web                             # Start the web interface
    npm run dev why <owner> <repo> <comp>   # Query from the terminal

- GitHub discussions are unstructured and noisy; extracting actual decisions requires understanding context, not just keywords
- Decisions aren't atomic events — they evolve across dozens of PRs over months
- Answers need to be grounded — hallucinated explanations are worse than no explanation
- Knowledge goes stale — a decision made in 2019 may have been reversed in 2022
GraphEngine handles all of this.
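
One way to picture "decay + reinforcement" for stale knowledge is an exponential half-life score, sketched below. This is an assumed model, not GraphEngine's documented formula: the function name `freshnessScore`, the decision fields (`decidedAt`, `reinforcedAt`, `reversedAt`), and the one-year default half-life are all hypothetical.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical freshness model: every supporting event contributes a weight
// that halves each `halfLifeDays`; a reversal drops the score to zero.
function freshnessScore(decision, nowIso, halfLifeDays = 365) {
  const now = Date.parse(nowIso);

  // A decision that has already been reversed is no longer current.
  if (decision.reversedAt && Date.parse(decision.reversedAt) <= now) {
    return 0;
  }

  // The original decision plus any later events that reaffirmed it.
  const timestamps = [decision.decidedAt, ...(decision.reinforcedAt ?? [])]
    .map((iso) => Date.parse(iso))
    .filter((t) => t <= now); // ignore events that have not happened yet

  let total = 0;
  for (const t of timestamps) {
    const ageDays = (now - t) / DAY_MS;
    total += Math.pow(0.5, ageDays / halfLifeDays);
  }

  // Cap at 1 so heavily reinforced decisions do not exceed "fully fresh".
  return Math.min(1, total);
}
```

Under this sketch, a decision made in 2019 and reversed in 2022 scores 0 for any query date after the reversal, while one that keeps being reaffirmed stays close to 1.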
Node.js · SQLite · JavaScript · GitHub API · LLM (decision extraction) · Vector embeddings · Canvas API
GraphEngine is what happens when you treat a GitHub repository not as code — but as a record of human decisions.