*Memos:
- My post explains Linear Regression in PyTorch.
- My post explains Batch, Mini-Batch and Stochastic Gradient Descent with DataLoader() in PyTorch.
- My post explains Batch Gradient Descent without DataLoader() in PyTorch.
- My post explains how to save a model in PyTorch.
- My post explains how to load the saved model which I show in this post in PyTorch.
- My post explains Deep Learning Workflow in PyTorch.
- My post explains how to clone a private repository with FGPAT (Fine-Grained Personal Access Token) from GitHub.
- My post explains how to clone a private repository with PAT (Personal Access Token) from GitHub.
- My post explains useful IPython magic commands.
- My repo has models.
Module() is the base class for a model; you create a model by subclassing it as shown below:
*Memos:
- forward() must be overridden in the subclass of Module().
- state_dict() can return the dictionary containing parameters and buffers. *It cannot get num3 and num4 because they are defined without Parameter() (see the buffer sketch after the example below).
- load_state_dict() can load a state_dict() into a model. *Basically, it's used to load a saved model into the currently used model (see the save/load sketch after the example below).
- parameters() can return an iterator over module parameters. *It cannot get num3 and num4 because they are defined without Parameter().
- training can check if a model is in train mode or eval mode. *By default, it's train mode.
- train() can set a model to train mode.
- eval() can set a model to eval mode.
- cpu() can convert all model parameters and buffers to CPU.
- cuda() can convert all model parameters and buffers to CUDA (GPU).
- There are also save() and load(). *My post explains save() and load().
```python
import torch
from torch import nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.num1 = nn.Parameter(torch.tensor(9.))
        self.num2 = nn.Parameter(torch.tensor(7.))
        self.num3 = torch.tensor(-2.) # Defined without `Parameter()`
        self.num4 = torch.tensor(6.)  # Defined without `Parameter()`
        self.layer1 = nn.Linear(in_features=4, out_features=5)
        self.layer2 = nn.Linear(in_features=5, out_features=2)
        self.layer3 = nn.Linear(in_features=2, out_features=3)
        self.relu = nn.ReLU()

    def forward(self, x): # Must be overridden
        x1 = self.layer1(input=x)
        x2 = self.relu(input=x1)
        x3 = self.layer2(input=x2)
        x4 = self.relu(input=x3)
        x5 = self.layer3(input=x4)
        return x5

my_tensor = torch.tensor([8., -3., 0., 1.])

torch.manual_seed(42)

mymodel = MyModel()
mymodel(x=my_tensor)
# tensor([0.8092, 0.8460, 0.3758], grad_fn=<ViewBackward0>)

mymodel
# MyModel(
#   (layer1): Linear(in_features=4, out_features=5, bias=True)
#   (layer2): Linear(in_features=5, out_features=2, bias=True)
#   (layer3): Linear(in_features=2, out_features=3, bias=True)
#   (relu): ReLU()
# )

mymodel.layer2
# Linear(in_features=5, out_features=2, bias=True)

mymodel.state_dict()
# OrderedDict([('num1', tensor(9.)),
#              ('num2', tensor(7.)),
#              ('layer1.weight',
#               tensor([[0.3823, 0.4150, -0.1171, 0.4593],
#                       [-0.1096, 0.1009, -0.2434, 0.2936],
#                       [0.4408, -0.3668, 0.4346, 0.0936],
#                       [0.3694, 0.0677, 0.2411, -0.0706],
#                       [0.3854, 0.0739, -0.2334, 0.1274]])),
#              ('layer1.bias',
#               tensor([-0.2304, -0.0586, -0.2031, 0.3317, -0.3947])),
#              ('layer2.weight',
#               tensor([[-0.2062, -0.1263, -0.2689, 0.0422, -0.4417],
#                       [0.4039, -0.3799, 0.3453, 0.0744, -0.1452]])),
#              ('layer2.bias', tensor([0.2764, 0.0697])),
#              ('layer3.weight',
#               tensor([[0.5713, 0.0773],
#                       [-0.2230, 0.1900],
#                       [-0.1918, 0.2976]])),
#              ('layer3.bias', tensor([0.6313, 0.4087, -0.3091]))])

mymodel.load_state_dict(state_dict=mymodel.state_dict())
# <All keys matched successfully>

params = mymodel.parameters()

next(params)
# Parameter containing:
# tensor(9., requires_grad=True)

next(params)
# Parameter containing:
# tensor(7., requires_grad=True)

next(params)
# Parameter containing:
# tensor([[0.3823, 0.4150, -0.1171, 0.4593],
#         [-0.1096, 0.1009, -0.2434, 0.2936],
#         [0.4408, -0.3668, 0.4346, 0.0936],
#         [0.3694, 0.0677, 0.2411, -0.0706],
#         [0.3854, 0.0739, -0.2334, 0.1274]], requires_grad=True)

next(params)
# Parameter containing:
# tensor([-0.2304, -0.0586, -0.2031, 0.3317, -0.3947],
#        requires_grad=True)

next(params)
# Parameter containing:
# tensor([[-0.2062, -0.1263, -0.2689, 0.0422, -0.4417],
#         [0.4039, -0.3799, 0.3453, 0.0744, -0.1452]],
#        requires_grad=True)

next(params)
# Parameter containing:
# tensor([0.2764, 0.0697], requires_grad=True)

next(params)
# Parameter containing:
# tensor([[0.5713, 0.0773],
#         [-0.2230, 0.1900],
#         [-0.1918, 0.2976]], requires_grad=True)

next(params)
# Parameter containing:
# tensor([0.6313, 0.4087, -0.3091], requires_grad=True)

mymodel.training
# True

mymodel.eval()
mymodel.training
# False

mymodel.train()
mymodel.training
# True

mymodel.cuda(device='cuda:0')
mymodel.layer2.weight.device
# device(type='cuda', index=0)

mymodel.cpu()
mymodel.layer2.weight.device
# device(type='cpu')
```
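As a supplement to the state_dict() and parameters() memos above: num3 and num4 are invisible to both because they are plain tensor attributes. If you want a non-trainable tensor to be tracked and saved, you can register it as a buffer with register_buffer(). A minimal sketch; the class name MyModelWithBuffer is made up for illustration:

```python
import torch
from torch import nn

class MyModelWithBuffer(nn.Module): # Hypothetical class for illustration
    def __init__(self):
        super().__init__()
        self.num1 = nn.Parameter(torch.tensor(9.)) # Trainable: in parameters() and state_dict()
        self.num3 = torch.tensor(-2.) # Plain tensor: in neither
        self.register_buffer('num4', torch.tensor(6.)) # Buffer: in state_dict() but not parameters()

    def forward(self, x): # Must be overridden
        return x * self.num1

m = MyModelWithBuffer()
list(m.state_dict().keys())
# ['num1', 'num4']
list(m.parameters())
# [Parameter containing:
# tensor(9., requires_grad=True)]
```

A buffer also follows the model through cpu() and cuda(), which is why the memos above say those methods convert "parameters and buffers".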
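And for the save() and load() memo: the example above doesn't save anything, so here is a minimal sketch of the usual round trip with torch.save() and torch.load(), continuing from the mymodel above; the filename mymodel.pth is a hypothetical placeholder:

```python
import torch

# Save only the parameters and buffers, not the whole model object.
torch.save(obj=mymodel.state_dict(), f='mymodel.pth') # 'mymodel.pth' is a hypothetical filename

# Recreate the same architecture, then load the saved state into it.
newmodel = MyModel()
newmodel.load_state_dict(state_dict=torch.load(f='mymodel.pth'))
# <All keys matched successfully>
```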