Several Important Concepts in autograd


1. Leaf tensors

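In PyTorch, by default, the gradients of non-leaf tensors are freed as soon as backpropagation has used them and are not retained; only the gradients of leaf tensors are kept. You can check whether any tensor is a leaf tensor with tensor.is_leaf. In other words, after backward() only leaf tensors with requires_grad=True have their .grad attribute populated. Gradients still flow through non-leaf tensors during backpropagation; they are simply not stored unless you call retain_grad() on the tensor.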

Example:

import torch

x = torch.tensor(2.0, requires_grad=True)
print(x)
y = x ** 2
z = y + 1
z.backward()
print(x.grad)  # dz/dx = 2 * x
print(y.grad)  # None: y is a non-leaf tensor (PyTorch also emits a UserWarning here)
print(x.is_leaf)
print(y.is_leaf)
print(x.grad_fn)
print(y.grad_fn)

Output:

tensor(2., requires_grad=True)
tensor(4.)
None
True
False
None
<PowBackward0 object at 0x000002260759FE80>
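
To inspect the gradient of a non-leaf tensor, one option is to call retain_grad() on it before backward(). The following is a minimal sketch that reuses the x, y and z from the example above; the tensor c is an extra name introduced here only for illustration.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 2
y.retain_grad()  # ask autograd to keep dz/dy on this non-leaf tensor
z = y + 1
z.backward()

print(x.grad)     # dz/dx = 2 * x -> tensor(4.)
print(y.grad)     # dz/dy = 1     -> tensor(1.)
print(y.is_leaf)  # still False: retain_grad() does not make y a leaf

# A tensor created directly by the user is a leaf even without requires_grad,
# but no gradient is ever computed for it.
c = torch.tensor(3.0)
print(c.is_leaf, c.grad)  # True None
```

Note that retain_grad() only changes what is stored after the backward pass; it does not change which tensors count as leaves.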

2. torch.autograd.Function

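PyTorch differentiates automatically, but sometimes its built-in differentiation rules are not what we want and we need to extend them, and some operations are not differentiable at all. In those cases you have to define the gradient yourself. This is what the PyTorch documentation calls "Extending torch.autograd".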
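
As a small illustration of why a custom gradient can be needed, consider rounding. The sketch below assumes torch.round as the example operation (it is not taken from the original text): its built-in derivative is zero everywhere, so no useful gradient reaches x.

```python
import torch

x = torch.tensor([0.3, 1.7], requires_grad=True)
y = torch.round(x).sum()  # piecewise-constant operation
y.backward()

# autograd treats the derivative of round() as zero everywhere
print(x.grad)  # tensor([0., 0.])
```

If such an operation sits in the middle of a model, every parameter before it receives a zero gradient, which is one situation where a hand-written backward pass may be preferable.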

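Suppose you want to define a custom operation. Do the following, in order:

First, make your class inherit from torch.autograd.Function.
Then implement its two static methods, forward() and backward().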

Example:

import torch

class line(torch.autograd.Function):
    # Forward pass
    @staticmethod
    def forward(ctx, w, x, b):
        # y = w * x + b
        ctx.save_for_backward(w, x, b)
        return w * x + b

    # Backward pass: return one gradient per input of forward(), in the same order
    @staticmethod
    def backward(ctx, grad_out):
        w, x, b = ctx.saved_tensors

        grad_w = grad_out * x
        grad_x = grad_out * w
        grad_b = grad_out

        return grad_w, grad_x, grad_b


w = torch.rand(2, 2, requires_grad=True)
x = torch.rand(2, 2, requires_grad=True)
b = torch.rand(2, 2, requires_grad=True)

out = line.apply(w, x, b)
out.backward(torch.ones(2, 2))

print(w, x, b)
# The gradient with respect to w is x, with respect to x is w, and with respect to b is 1
print(w.grad, x.grad, b.grad)

Output:

tensor([[0.1902, 0.1831],
        [0.5634, 0.4566]], requires_grad=True) tensor([[0.3844, 0.0606],
        [0.0250, 0.3822]], requires_grad=True) tensor([[0.7814, 0.3440],
        [0.6387, 0.8746]], requires_grad=True)
tensor([[0.3844, 0.0606],
        [0.0250, 0.3822]]) tensor([[0.1902, 0.1831],
        [0.5634, 0.4566]]) tensor([[1., 1.],
        [1., 1.]])
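
One way to sanity-check a hand-written backward() is torch.autograd.gradcheck, which compares the analytical gradients against finite-difference estimates. The sketch below repeats the line class from above (saving only w and x, since b is not needed in backward()) and uses double-precision inputs, which gradcheck's default tolerances assume; the eps and atol values are illustrative choices.

```python
import torch

class line(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w, x, b):
        # Only w and x are needed to compute the gradients
        ctx.save_for_backward(w, x)
        return w * x + b

    @staticmethod
    def backward(ctx, grad_out):
        w, x = ctx.saved_tensors
        # Gradients for w, x and b, in the order forward() received them
        return grad_out * x, grad_out * w, grad_out

# Double precision keeps the finite-difference error small
w = torch.rand(2, 2, dtype=torch.double, requires_grad=True)
x = torch.rand(2, 2, dtype=torch.double, requires_grad=True)
b = torch.rand(2, 2, dtype=torch.double, requires_grad=True)

# Returns True if analytical and numerical gradients agree, raises otherwise
ok = torch.autograd.gradcheck(line.apply, (w, x, b), eps=1e-6, atol=1e-4)
print(ok)
```

If backward() returned a wrong expression, for example grad_out * w for grad_w, gradcheck would raise an error describing the mismatch instead of returning True.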