Updated equations of gradients and y_pred #3845

Open
im-shriram wants to merge 1 commit into pytorch:main from im-shriram:update_beginner_source

Conversation

@im-shriram

Refactor Backpropagation Equations for Readability and Mathematical Clarity

Summary

This PR updates the manual backpropagation logic in polynomial_tensor.py to use dot products (torch.matmul) for gradient calculations, making the implementation more mathematically intuitive and readable.

Changes

  • Backpropagation: Replaced element-wise multiplication with torch.matmul for grad_b, grad_c, and grad_d. This better represents the dot product operation in the gradient equations.
  • Forward Pass: Minor refactoring of the y_pred equation for improved clarity.

Rationale

Using matmul for these gradients aligns the code closer to standard linear algebra notation, making the tutorial easier to interpret for users learning manual gradient computation.
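A minimal, self-contained sketch of the before/after for one of the affected gradient lines (the tensors below are placeholders with the tutorial's shapes, not its actual data):

import torch

x = torch.linspace(-3.14, 3.14, 2000)  # placeholder input, same shape as the tutorial's
grad_y_pred = torch.randn(2000)        # placeholder upstream gradient

# Before: element-wise multiply, then sum
grad_b_old = (grad_y_pred * x).sum()

# After (this PR): dot product via torch.matmul
grad_b_new = torch.matmul(grad_y_pred, x).sum()

print(torch.allclose(grad_b_old, grad_b_new))  # True: both compute the same value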

Commit message (truncated): "… to actual dot product and updated the y_pred equation with parentheses for readability."

pytorch-bot Bot commented Apr 29, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3845

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla Bot commented Apr 29, 2026

Hi @im-shriram!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

meta-cla Bot commented Apr 29, 2026

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@meta-cla meta-cla Bot added the cla signed label Apr 29, 2026
@sekyondaMeta (Contributor) commented:

@claude

claude Bot commented Apr 29, 2026

Claude finished @sekyondaMeta's task in 1m 5s.


PR Review: Updated equations of gradients and y_pred

  • Gather context and read the changed file
  • Analyze the mathematical correctness of the changes
  • Provide review feedback

Review Summary

This PR has correctness concerns and pedagogical trade-offs that should be addressed before merging.


Issues Found

1. Redundant .sum() after torch.matmul on 1D tensors (Bug/Misleading)

File: beginner_source/examples_tensor/polynomial_tensor.py:52-54

grad_b = torch.matmul(input=grad_y_pred, other=x).sum()
grad_c = torch.matmul(input=grad_y_pred, other=x.pow(2)).sum()
grad_d = torch.matmul(input=grad_y_pred, other=x.pow(3)).sum()

Both grad_y_pred and x are 1D tensors of shape (2000,). When torch.matmul is called with two 1D tensors, it computes the dot product and returns a scalar tensor. The subsequent .sum() is therefore a no-op — it sums a scalar, returning the same scalar.

This isn't technically wrong (it produces the correct result), but it's misleading. It suggests the author believed matmul returns a vector here, which it does not. For a beginner tutorial, this is confusing.
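A minimal check of this behavior, using placeholder tensors that stand in for the tutorial's grad_y_pred and x (shapes match, values don't):

import torch

grad_y_pred = torch.randn(2000)
x = torch.randn(2000)

dot = torch.matmul(grad_y_pred, x)   # 1D @ 1D -> 0-dimensional (scalar) tensor
print(dot.shape)                     # torch.Size([])
print(bool(dot.sum() == dot))        # True: .sum() on a scalar changes nothing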

2. Pedagogical regression for a beginner tutorial

The original code:

grad_b = (grad_y_pred * x).sum()

This explicitly shows the two-step process: element-wise multiply, then sum — which is how you manually compute a dot product. For learners encountering gradient computation for the first time, this transparency is valuable. Replacing it with torch.matmul hides the mechanics behind an API call, which works against the tutorial's purpose of teaching manual backpropagation.
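A small worked example of that two-step process, with a short vector so the intermediate result stays visible (values are illustrative only):

import torch

grad_y_pred = torch.tensor([1.0, 2.0, 3.0])
x = torch.tensor([4.0, 5.0, 6.0])

elementwise = grad_y_pred * x      # tensor([ 4., 10., 18.])  -- step 1: multiply
grad_b = elementwise.sum()         # tensor(32.)              -- step 2: sum, i.e. the dot product
print(bool(grad_b == torch.dot(grad_y_pred, x)))  # True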

3. grad_y_pred rewrite is a no-op change

# Before:
grad_y_pred = 2.0 * (y_pred - y)
# After:
grad_y_pred = -2.0 * (y - y_pred)

These are algebraically identical: -2(y - ŷ) = 2(ŷ - y). This doesn't improve clarity or correctness — it just introduces a double negative that's arguably harder to read.
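The equivalence is easy to confirm numerically (placeholder tensors, not the tutorial's data):

import torch

y = torch.randn(2000)
y_pred = torch.randn(2000)

before = 2.0 * (y_pred - y)
after = -2.0 * (y - y_pred)
print(torch.allclose(before, after))  # True: the two forms are identical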

4. Unnecessary parentheses in y_pred

# Before:
y_pred = a + b * x + c * x ** 2 + d * x ** 3
# After:
y_pred = a + (b * x) + (c * x.pow(2)) + (d * x.pow(3))

Python's operator precedence already guarantees the correct evaluation order (** > * > +). The added parentheses are redundant. Replacing ** with .pow() is a style preference, not a clarity improvement.
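A quick check that the two y_pred formulations evaluate identically (the coefficients here are placeholder scalars, not the tutorial's learned values):

import torch

x = torch.linspace(-3.14, 3.14, 2000)
a, b, c, d = 0.1, 0.2, 0.3, 0.4   # hypothetical coefficients

before = a + b * x + c * x ** 2 + d * x ** 3
after = a + (b * x) + (c * x.pow(2)) + (d * x.pow(3))
print(torch.allclose(before, after))  # True: precedence already gives the right order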

5. Import reorder (positive)

Moving import math before import torch correctly follows PEP 8 (stdlib before third-party). This is the one unambiguously good change. ✓


Recommendation

I would not recommend merging this PR as-is. The changes don't fix a bug, and they make a beginner tutorial less transparent by hiding computation behind torch.matmul while introducing a misleading redundant .sum(). The tutorial intentionally uses manual, explicit operations to teach how gradient computation works under the hood.

If the goal is to use matmul for dot products, the .sum() calls should be removed. But even then, the pedagogical trade-off should be considered — explicit element-wise operations are arguably better for a beginner tutorial about manual backpropagation.
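For concreteness, a sketch of that matmul-only variant, assuming the tutorial's grad_y_pred and x as defined earlier in polynomial_tensor.py (illustrative only, not part of this PR):

# matmul on two 1D tensors already returns the scalar dot product,
# so the trailing .sum() calls are dropped.
grad_b = torch.matmul(grad_y_pred, x)
grad_c = torch.matmul(grad_y_pred, x.pow(2))
grad_d = torch.matmul(grad_y_pred, x.pow(3))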

