Updated equations of gradients and y_pred #3845

Open
im-shriram wants to merge 1 commit into pytorch:main from im-shriram:update_beginner_source

Conversation

@im-shriram

Refactor Backpropagation Equations for Readability and Mathematical Clarity

Summary

This PR updates the manual backpropagation logic in polynomial_tensor.py to use dot products (torch.matmul) for gradient calculations, making the implementation more mathematically intuitive and readable.

Changes

  • Backpropagation: Replaced element-wise multiplication with torch.matmul for grad_b, grad_c, and grad_d. This better represents the dot product operation in the gradient equations.
  • Forward Pass: Minor refactoring of the y_pred equation for improved clarity.

Rationale

Using matmul for these gradients aligns the code closer to standard linear algebra notation, making the tutorial easier to interpret for users learning manual gradient computation.
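A minimal, self-contained sketch of the before/after for one of the affected gradient lines (the tensors below are placeholders with the tutorial's shapes, not its actual data):

import torch

x = torch.linspace(-3.14, 3.14, 2000)  # placeholder input, same shape as the tutorial's
grad_y_pred = torch.randn(2000)        # placeholder upstream gradient

# Before: element-wise multiply, then sum
grad_b_old = (grad_y_pred * x).sum()

# After (this PR): dot product via torch.matmul
grad_b_new = torch.matmul(grad_y_pred, x).sum()

print(torch.allclose(grad_b_old, grad_b_new))  # True: both compute the same value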

Commit message (truncated): "… to actual dot product and updated the y_pred equation with parentheses for readability."

pytorch-bot Bot commented Apr 29, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3845

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla Bot commented Apr 29, 2026

Hi @im-shriram!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

meta-cla Bot commented Apr 29, 2026

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@meta-cla meta-cla Bot added the cla signed label Apr 29, 2026
@sekyondaMeta (Contributor) commented:

@claude

claude Bot commented Apr 29, 2026

Claude finished @sekyondaMeta's task in 1m 5s.


PR Review: Updated equations of gradients and y_pred

  • Gather context and read the changed file
  • Analyze the mathematical correctness of the changes
  • Provide review feedback

Review Summary

This PR has correctness concerns and pedagogical trade-offs that should be addressed before merging.


Issues Found

1. Redundant .sum() after torch.matmul on 1D tensors (Bug/Misleading)

File: beginner_source/examples_tensor/polynomial_tensor.py:52-54

grad_b = torch.matmul(input=grad_y_pred, other=x).sum()
grad_c = torch.matmul(input=grad_y_pred, other=x.pow(2)).sum()
grad_d = torch.matmul(input=grad_y_pred, other=x.pow(3)).sum()

Both grad_y_pred and x are 1D tensors of shape (2000,). When torch.matmul is called with two 1D tensors, it computes the dot product and returns a scalar tensor. The subsequent .sum() is therefore a no-op — it sums a scalar, returning the same scalar.

This isn't technically wrong (it produces the correct result), but it's misleading. It suggests the author believed matmul returns a vector here, which it does not. For a beginner tutorial, this is confusing.
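A minimal check of this behavior, using placeholder tensors that stand in for the tutorial's grad_y_pred and x (shapes match, values don't):

import torch

grad_y_pred = torch.randn(2000)
x = torch.randn(2000)

dot = torch.matmul(grad_y_pred, x)   # 1D @ 1D -> 0-dimensional (scalar) tensor
print(dot.shape)                     # torch.Size([])
print(bool(dot.sum() == dot))        # True: .sum() on a scalar changes nothing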

2. Pedagogical regression for a beginner tutorial

The original code:

grad_b = (grad_y_pred * x).sum()

This explicitly shows the two-step process: element-wise multiply, then sum — which is how you manually compute a dot product. For learners encountering gradient computation for the first time, this transparency is valuable. Replacing it with torch.matmul hides the mechanics behind an API call, which works against the tutorial's purpose of teaching manual backpropagation.
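A small worked example of that two-step process, with a short vector so the intermediate result stays visible (values are illustrative only):

import torch

grad_y_pred = torch.tensor([1.0, 2.0, 3.0])
x = torch.tensor([4.0, 5.0, 6.0])

elementwise = grad_y_pred * x      # tensor([ 4., 10., 18.])  -- step 1: multiply
grad_b = elementwise.sum()         # tensor(32.)              -- step 2: sum, i.e. the dot product
print(bool(grad_b == torch.dot(grad_y_pred, x)))  # True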

3. grad_y_pred rewrite is a no-op change

# Before:
grad_y_pred = 2.0 * (y_pred - y)
# After:
grad_y_pred = -2.0 * (y - y_pred)

These are algebraically identical: -2(y - ŷ) = 2(ŷ - y). This doesn't improve clarity or correctness — it just introduces a double negative that's arguably harder to read.
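The equivalence is easy to confirm numerically (placeholder tensors, not the tutorial's data):

import torch

y = torch.randn(2000)
y_pred = torch.randn(2000)

before = 2.0 * (y_pred - y)
after = -2.0 * (y - y_pred)
print(torch.allclose(before, after))  # True: the two forms are identical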

4. Unnecessary parentheses in y_pred

# Before:
y_pred = a + b * x + c * x ** 2 + d * x ** 3
# After:
y_pred = a + (b * x) + (c * x.pow(2)) + (d * x.pow(3))

Python's operator precedence already guarantees the correct evaluation order (** > * > +). The added parentheses are redundant. Replacing ** with .pow() is a style preference, not a clarity improvement.
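A quick check that the two y_pred formulations evaluate identically (the coefficients here are placeholder scalars, not the tutorial's learned values):

import torch

x = torch.linspace(-3.14, 3.14, 2000)
a, b, c, d = 0.1, 0.2, 0.3, 0.4   # hypothetical coefficients

before = a + b * x + c * x ** 2 + d * x ** 3
after = a + (b * x) + (c * x.pow(2)) + (d * x.pow(3))
print(torch.allclose(before, after))  # True: precedence already gives the right order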

5. Import reorder (positive)

Moving import math before import torch correctly follows PEP 8 (stdlib before third-party). This is the one unambiguously good change. ✓


Recommendation

I would not recommend merging this PR as-is. The changes don't fix a bug, and they make a beginner tutorial less transparent by hiding computation behind torch.matmul while introducing a misleading redundant .sum(). The tutorial intentionally uses manual, explicit operations to teach how gradient computation works under the hood.

If the goal is to use matmul for dot products, the .sum() calls should be removed. But even then, the pedagogical trade-off should be considered — explicit element-wise operations are arguably better for a beginner tutorial about manual backpropagation.
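For concreteness, a sketch of that matmul-only variant, assuming the tutorial's grad_y_pred and x as defined earlier in polynomial_tensor.py (illustrative only, not part of this PR):

# matmul on two 1D tensors already returns the scalar dot product,
# so the trailing .sum() calls are dropped.
grad_b = torch.matmul(grad_y_pred, x)
grad_c = torch.matmul(grad_y_pred, x.pow(2))
grad_d = torch.matmul(grad_y_pred, x.pow(3))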

