In the control flow example, what happens if we change the variable a from a scalar to a random vector or matrix?
Here we change a to a 3×1 matrix of random numbers, as sketched below.
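A minimal sketch of the modified run. The body of f and the print statements are assumptions reconstructed to match the trace that follows (the norm is printed on each pass through the loop, and 'C==b' marks which branch was taken):

```python
import torch

def f(a):
    b = a * 2
    while b.norm() < 1000:   # keep doubling b until its norm reaches 1000
        print(b.norm())      # trace each pass, matching the log below
        b = b * 2
    if b.sum() > 0:
        print('C==b')        # marker showing the first branch was taken
        c = b
    else:
        c = 100 * b
    return c

a = torch.randn(size=(3, 1), requires_grad=True)  # 3x1 random column vector
print(a.shape)
print(a)
d = f(a)
print(d)
d.backward()   # fails: d is a vector, not a scalar
```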
The program output is as follows:
torch.Size([3, 1])
tensor([[0.2728],
        [1.0503],
        [0.5629]], requires_grad=True)
tensor(2.4448, grad_fn=<CopyBackwards>)
tensor(4.8897, grad_fn=<CopyBackwards>)
tensor(9.7794, grad_fn=<CopyBackwards>)
tensor(19.5587, grad_fn=<CopyBackwards>)
tensor(39.1175, grad_fn=<CopyBackwards>)
tensor(78.2349, grad_fn=<CopyBackwards>)
tensor(156.4698, grad_fn=<CopyBackwards>)
tensor(312.9396, grad_fn=<CopyBackwards>)
tensor(625.8792, grad_fn=<CopyBackwards>)
C==b
tensor([[ 279.3541],
        [1075.4806],
        [ 576.3696]], grad_fn=<MulBackward0>)
RuntimeError: grad can be implicitly created only for scalar outputs
A runtime error is raised: gradients can be created implicitly only for scalar outputs.
But here a is a matrix, so d = f(a) is a vector rather than a scalar. When we call backward on a vector, we are usually trying to compute the derivatives of the loss function for each constituent of a batch of training examples. The intent is not to compute the full differentiation matrix (the Jacobian), but the sum of the partial derivatives computed individually for each example in the batch. So we reduce d to a scalar with d.sum() before calling backward.
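A sketch of the fix under that sum-reduction approach (calling d.backward(torch.ones_like(d)) would be equivalent; the elementwise d / a check is an added sanity check, not part of the original run):

```python
a = torch.randn(size=(3, 1), requires_grad=True)
d = f(a)
d.sum().backward()     # reduce to a scalar so autograd can seed the backward pass
print(a.grad)          # with the draw logged below, f(a) == 1024 * a
                       # (ten doublings in total), so every entry is 1024
print(a.grad == d / a) # f is elementwise linear in a, so the gradient equals d / a
```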
The output is then:
torch.Size([3, 1])
tensor([[-3.9814e-04],
        [ 1.5525e+00],
        [-8.2370e-01]], requires_grad=True)
tensor(3.5150, grad_fn=<CopyBackwards>)
tensor(7.0300, grad_fn=<CopyBackwards>)
tensor(14.0601, grad_fn=<CopyBackwards>)
tensor(28.1202, grad_fn=<CopyBackwards>)
tensor(56.2403, grad_fn=<CopyBackwards>)
tensor(112.4806, grad_fn=<CopyBackwards>)
tensor(224.9612, grad_fn=<CopyBackwards>)
tensor(449.9225, grad_fn=<CopyBackwards>)
tensor(899.8450, grad_fn=<CopyBackwards>)
C==b
tensor([[-4.0770e-01],
        [ 1.5898e+03],
        [-8.4347e+02]], grad_fn=<MulBackward0>)
tensor([[1024.],
        [1024.],
        [1024.]])
----end----