Autograd
LibTorch records the computation graph automatically once a tensor requires
gradients — and that works across separate torch invocations, because the
tensors (and the graph) live in the daemon, not in your shell.
w=$(torch randn '[3]' --requires_grad)
loss=$(torch mul $w $w | torch sum)
torch backward $loss # gradients ACCUMULATE across calls
torch grad $w | torch value # a snapshot: later backwards won't change it
torch zero_grad $w # reset; grad now reads as zeros
d=$(torch detach $w) # graph-free reference (stops tracking)

let w = (torch randn [3] --requires_grad)
let loss = (torch mul $w $w | torch sum)
torch backward $loss # gradients ACCUMULATE across calls
print (torch grad $w | torch value) # a snapshot
torch zero_grad $w # reset; grad now reads as zeros
let d = (torch detach $w) # graph-free (stops tracking)

Rules of the road
These are PyTorch’s rules, surfaced with shell-friendly errors:
- backward needs a scalar loss on a tensor that requires gradients: reduce first (sum, mean, or a loss op).
- grad before any backward is an error, not a silent zero.
- Rebuild the graph before each backward. Re-run the pipeline that produces your loss each iteration; a second backward through the SAME graph errors, exactly as in PyTorch.
- Gradients accumulate across backward calls until you zero_grad, again exactly as in PyTorch.
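
The accumulation rule is easier to see with numbers. The following is a toy scalar reverse-mode sketch in plain Python, not the torch CLI and not LibTorch's implementation; the `Value` class and `loss_of` helper are hypothetical names made up for illustration. It rebuilds the graph for each backward pass, as the rules above require, and shows that gradients add up until they are reset. Unlike the real thing, this toy does not raise an error on a second backward through the same graph.

```python
class Value:
    """Toy scalar autograd node (hypothetical; for illustration only)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents          # nodes this value was computed from
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def backward(self, upstream=1.0):
        # Gradients are ADDED to .grad, never overwritten.
        self.grad += upstream
        for parent, local in zip(self._parents, self._local_grads):
            parent.backward(upstream * local)


def loss_of(w):
    """Build a fresh graph for sum(w * w), like `torch mul $w $w | torch sum`."""
    total = Value(0.0)
    for wi in w:
        total = total + wi * wi
    return total


w = [Value(1.0), Value(-2.0), Value(3.0)]

loss_of(w).backward()
first = [wi.grad for wi in w]    # 2 * w after one backward

loss_of(w).backward()            # new graph, same leaves
second = [wi.grad for wi in w]   # 4 * w: the second pass was added on top

for wi in w:                     # the toy equivalent of zero_grad
    wi.grad = 0.0
```

Since the loss is the sum of squares, each backward contributes `2 * w`, so the second reading is double the first.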
Graph lifetime and handles
Freeing an intermediate’s handle is safe: the graph holds its tensors internally
until the graph itself dies. But keep your leaf handles — they are the only
key to their gradients. The torch tensors command counts only registry handles;
storage held by the graph is invisible to it.
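
As an analogy only, the same ownership pattern can be sketched in plain Python: a graph node keeps strong references to its inputs, so dropping your own name for an intermediate does not destroy it while the graph is alive. The `Node` class is hypothetical and says nothing about how the daemon actually stores tensors.

```python
import gc
import weakref


class Node:
    """Toy graph node (hypothetical) that holds strong references to its inputs."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = parents


leaf = Node("w")
mid = Node("w*w", (leaf, leaf))   # intermediate result
loss = Node("sum", (mid,))        # the graph root references the intermediate

probe = weakref.ref(mid)          # lets us observe whether mid is still alive

del mid                           # "free the handle" for the intermediate
gc.collect()
alive_while_graph_exists = probe() is not None

del loss                          # the graph itself dies
gc.collect()
alive_after_graph_dies = probe() is not None
```

The leaf is different: if you drop `leaf`, nothing else gives you a way to read its gradient, which is why the text above says to keep leaf handles.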
Losses are ordinary ops
torch mse_loss $pred $target | torch backward

cross_entropy, l1_loss, binary_cross_entropy_with_logits, and friends are
all in the table — see torch ops under loss. For full training loops with
optimizers, see neural networks.
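
A loss op can feed backward directly because it already reduces to a scalar. As a rough illustration in plain Python (not the CLI), this is what a mean squared error computes, assuming the "mean" reduction that PyTorch uses by default; whether the CLI's mse_loss uses the same default is an assumption here, and the function names are illustrative.

```python
def mse_loss(pred, target):
    """Mean of squared differences (assumes 'mean' reduction)."""
    n = len(pred)
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / n


def mse_grad(pred, target):
    """Gradient w.r.t. each prediction: 2 * (pred_i - target_i) / n."""
    n = len(pred)
    return [2.0 * (p - t) / n for p, t in zip(pred, target)]


pred = [1.0, 2.0, 4.0, 0.0]
target = [1.0, 0.0, 2.0, 0.0]

loss = mse_loss(pred, target)   # a single scalar, so backward is well defined
grad = mse_grad(pred, target)   # what backward would deposit on pred
```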