.. _sphx_glr_beginner_examples_autograd_two_layer_net_custom_function.py:


PyTorch: Defining new autograd functions
----------------------------------------

A fully-connected ReLU network with one hidden layer and no biases, trained to
predict y from x by minimizing squared Euclidean distance.
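
As a worked form of the description above, the network and its loss can be
written out explicitly. Here :math:`W_1` and :math:`W_2` stand for the two
weight matrices (``w1`` and ``w2`` in the code), the maximum is taken
elementwise, and :math:`\hat{y}` is a symbol introduced here for the
prediction:

.. math::

    \hat{y} = \max(0,\, x W_1)\, W_2,
    \qquad
    L = \sum_{i,j} \left( \hat{y}_{ij} - y_{ij} \right)^2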

This implementation computes the forward pass using PyTorch operations that
autograd records, and uses autograd to compute the gradients.

Here we define our own custom autograd function to compute the ReLU
nonlinearity.
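
For reference, the rule such a function has to implement can be stated
directly. Writing :math:`z` for the input of the ReLU, :math:`a` for its
output, and :math:`L` for the loss (all three symbols are introduced here for
illustration), the forward and backward passes are:

.. math::

    a_{ij} = \max(0,\, z_{ij}),
    \qquad
    \frac{\partial L}{\partial z_{ij}} =
    \begin{cases}
        \dfrac{\partial L}{\partial a_{ij}} & \text{if } z_{ij} \ge 0 \\
        0 & \text{otherwise}
    \end{cases}

ReLU is not differentiable at zero, so the value used there is a convention;
the code below passes the gradient through unchanged at :math:`z_{ij} = 0`.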


.. code-block:: python

    import torch


    class MyReLU(torch.autograd.Function):
        """
        We can implement our own custom autograd Functions by subclassing
        torch.autograd.Function and implementing the forward and backward passes
        as static methods that operate on Tensors.
        """

        @staticmethod
        def forward(ctx, input):
            """
            In the forward pass we receive a Tensor containing the input and return a
            Tensor containing the output. ctx is a context object used to stash
            information for the backward pass; Tensors are cached with the
            ctx.save_for_backward method.
            """
            ctx.save_for_backward(input)
            return input.clamp(min=0)

        @staticmethod
        def backward(ctx, grad_output):
            """
            In the backward pass we receive a Tensor containing the gradient of the loss
            with respect to the output, and we need to compute the gradient of the loss
            with respect to the input.
            """
            input, = ctx.saved_tensors
            grad_input = grad_output.clone()
            # ReLU passes the gradient through only where the input was not negative.
            grad_input[input < 0] = 0
            return grad_input


    dtype = torch.float
    device = torch.device("cpu")
    # device = torch.device("cuda")  # Uncomment this to run on GPU

    # N is batch size; D_in is input dimension;
    # H is hidden dimension; D_out is output dimension.
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random Tensors to hold input and outputs. They need no gradients,
    # so requires_grad keeps its default value of False.
    x = torch.randn(N, D_in, device=device, dtype=dtype)
    y = torch.randn(N, D_out, device=device, dtype=dtype)

    # Create random Tensors for weights. requires_grad=True tells autograd to
    # compute gradients with respect to them during the backward pass.
    w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=True)
    w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=True)

    learning_rate = 1e-6
    for t in range(500):
        # A custom Function is applied through its apply method.
        relu = MyReLU.apply

        # Forward pass: compute predicted y using operations on Tensors; we compute
        # ReLU using our custom autograd operation.
        y_pred = relu(x.mm(w1)).mm(w2)

        # Compute and print loss
        loss = (y_pred - y).pow(2).sum()
        print(t, loss.item())

        # Use autograd to compute the backward pass.
        loss.backward()

        # Update weights using gradient descent. The update runs under
        # torch.no_grad() so that autograd does not track it.
        with torch.no_grad():
            w1 -= learning_rate * w1.grad
            w2 -= learning_rate * w2.grad

            # Manually zero the gradients after updating weights
            w1.grad.zero_()
            w2.grad.zero_()
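
A hand-written backward pass is easy to get subtly wrong, so it can be worth
comparing it against numerical gradients. The sketch below shows one way to do
that, assuming a PyTorch version that provides ``torch.autograd.gradcheck`` and
the static-method ``Function`` interface. The class name ``CheckedReLU`` and
the test values are made up for illustration and are not part of the script
above.

.. code-block:: python

    import torch


    class CheckedReLU(torch.autograd.Function):
        """Hypothetical stand-in for a custom ReLU, written with static methods."""

        @staticmethod
        def forward(ctx, z):
            # Cache the input; the backward pass needs its sign.
            ctx.save_for_backward(z)
            return z.clamp(min=0)

        @staticmethod
        def backward(ctx, grad_output):
            z, = ctx.saved_tensors
            grad_input = grad_output.clone()
            # Zero the gradient wherever the input was negative.
            grad_input[z < 0] = 0
            return grad_input


    # gradcheck compares analytical and finite-difference gradients. It expects
    # double precision, and the inputs are kept away from 0, where ReLU has a
    # kink and a finite-difference estimate would be unreliable.
    z = torch.tensor([[-2.0, -0.5, 0.5, 2.0]], dtype=torch.double,
                     requires_grad=True)
    ok = torch.autograd.gradcheck(CheckedReLU.apply, (z,), eps=1e-6, atol=1e-4)
    print("gradcheck passed:", ok)

If the backward pass disagreed with the numerical estimate, ``gradcheck`` would
raise an error by default rather than return ``False``, which makes it
convenient to drop into a test.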



.. container:: sphx-glr-footer


  .. container:: sphx-glr-download

     :download:`Download Python source code: two_layer_net_custom_function.py <two_layer_net_custom_function.py>`


  .. container:: sphx-glr-download

     :download:`Download Jupyter notebook: two_layer_net_custom_function.ipynb <two_layer_net_custom_function.ipynb>`

.. rst-class:: sphx-glr-signature

    `Generated by Sphinx-Gallery <https://sphinx-gallery.readthedocs.io>`_