
Super Kai (Kazuya Ito)


Embedding in PyTorch


Embedding() can get the 1D or more D tensor of zero or more elements, computed by embedding lookup, from the 0D or more D tensor of one or more elements (indices), with or without the 2D tensor of zero or more elements (weights), as shown below:

*Memos:

  • The 1st argument for initialization is num_embeddings(Required-Type:int). *It must be 1 <= x.
  • The 2nd argument for initialization is embedding_dim(Required-Type:int). *It must be 0 <= x.
  • The 3rd argument for initialization is padding_idx(Optional-Default:None-Type:int).
  • The 4th argument for initialization is max_norm(Optional-Default:None-Type:float).
  • The 5th argument for initialization is norm_type(Optional-Default:2.0-Type:float). *It must be 0 <= x when returning an empty tensor.
  • The 6th argument for initialization is scale_grad_by_freq(Optional-Default:False-Type:bool).
  • The 7th argument for initialization is sparse(Optional-Default:False-Type:bool).
  • The 8th argument for initialization is _weight(Optional-Default:None-Type:tensor of float): *Memos:
    • If None, weight is randomly generated.
    • It must be the 2D tensor of zero or more elements.
    • Its size must be (num_embeddings, embedding_dim).
  • The 9th argument for initialization is _freeze(Optional-Default:False-Type:bool). *If it's True, requires_grad is False; if it's False, requires_grad is True.
  • The 10th argument for initialization is device(Optional-Default:None-Type:str, int or device()).
  • The 11th argument for initialization is dtype(Optional-Default:None-Type:dtype).
  • The 1st argument is input(Required-Type:tensor of int): *Memos:
    • It's indices.
    • Indices must be less than num_embeddings.
    • It must be the 0D or more D tensor of one or more elements.
    • Its device must be the same as Embedding()'s.
  • embedding.device and embedding.dtype don't work (an Embedding instance has no device or dtype attribute).
import torch
from torch import nn

tensor1 = torch.tensor([6, 0, 2, 5]) # Indices

tensor1.requires_grad
# False

torch.manual_seed(42)

embedding = nn.Embedding(num_embeddings=7, embedding_dim=3)
tensor2 = embedding(input=tensor1)
tensor2
# tensor([[ 0.8034, -0.6216, -0.5920],
#         [ 1.9269,  1.4873,  0.9007],
#         [ 0.8008,  1.6806,  0.3559],
#         [ 0.8599, -0.3097, -0.3957]], grad_fn=<EmbeddingBackward0>)

tensor2.requires_grad
# True

embedding
# Embedding(7, 3)

embedding.num_embeddings
# 7

embedding.embedding_dim
# 3

embedding.padding_idx
# None

embedding.max_norm
# None

embedding.norm_type
# 2.0

embedding.scale_grad_by_freq
# False

embedding.sparse
# False

embedding.weight
# Parameter containing:
# tensor([[ 1.9269,  1.4873,  0.9007],
#         [-2.1055,  0.6784,  1.0783],
#         [ 0.8008,  1.6806,  0.3559],
#         [-0.6866, -0.4934,  0.2415],
#         [-1.1109,  0.0418, -0.2516],
#         [ 0.8599, -0.3097, -0.3957],
#         [ 0.8034, -0.6216, -0.5920]], requires_grad=True)

torch.manual_seed(42)

embedding = nn.Embedding(num_embeddings=7, embedding_dim=3, padding_idx=None,
                         max_norm=None, norm_type=2.0, scale_grad_by_freq=False,
                         sparse=False, _weight=None, _freeze=False,
                         device=None, dtype=None)
embedding(input=tensor1)
# tensor([[ 0.8034, -0.6216, -0.5920],
#         [ 1.9269,  1.4873,  0.9007],
#         [ 0.8008,  1.6806,  0.3559],
#         [ 0.8599, -0.3097, -0.3957]], grad_fn=<EmbeddingBackward0>)

weight = torch.tensor([[4., 9., 1.],
                       [-2., 0., 3.],
                       [0., 5., 7.],
                       [8., -6., 0.],
                       [1., 3., -9.],
                       [-3., 1., 2.],
                       [-5., 7., -4.]])
embedding = nn.Embedding(num_embeddings=7, embedding_dim=3, _weight=weight)
embedding(input=tensor1)
# tensor([[-5.,  7., -4.],
#         [ 4.,  9.,  1.],
#         [ 0.,  5.,  7.],
#         [-3.,  1.,  2.]], grad_fn=<EmbeddingBackward0>)

my_tensor = torch.tensor([[6, 0], # Indices
                          [2, 5]])
torch.manual_seed(42)

embedding = nn.Embedding(num_embeddings=7, embedding_dim=3)
embedding(input=my_tensor)
# tensor([[[ 0.8034, -0.6216, -0.5920],
#          [ 1.9269,  1.4873,  0.9007]],
#         [[ 0.8008,  1.6806,  0.3559],
#          [ 0.8599, -0.3097, -0.3957]]], grad_fn=<EmbeddingBackward0>)

my_tensor = torch.tensor([[[6], [0]], # Indices
                          [[2], [5]]])
torch.manual_seed(42)

embedding = nn.Embedding(num_embeddings=7, embedding_dim=3) # num_embeddings and embedding_dim are required
embedding(input=my_tensor)
# tensor([[[[ 0.8034, -0.6216, -0.5920]],
#          [[ 1.9269,  1.4873,  0.9007]]],
#         [[[ 0.8008,  1.6806,  0.3559]],
#          [[ 0.8599, -0.3097, -0.3957]]]], grad_fn=<EmbeddingBackward0>)
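The examples above leave padding_idx and max_norm at their defaults. As a minimal supplementary sketch (the sizes, weights and indices below are my own choices, not from the examples above): padding_idx zeroes the embedding row at that index and keeps it from being updated during training, while max_norm renormalizes any looked-up row whose norm_type-norm exceeds it, modifying weight in place:

import torch
from torch import nn

# padding_idx=0 initializes row 0 to zeros, and its gradient stays zero.
embedding = nn.Embedding(num_embeddings=3, embedding_dim=3, padding_idx=0)
embedding.weight[0]
# tensor([0., 0., 0.], grad_fn=<SelectBackward0>)

# max_norm=1.0 renormalizes each looked-up row whose 2-norm exceeds 1.0.
# Row 0 below has norm 5.0, so it's rescaled to [0.6, 0.8]; row 1 already
# has norm 1.0, so it's returned unchanged.
weight = torch.tensor([[3., 4.],
                       [0.6, 0.8]])
embedding = nn.Embedding(num_embeddings=2, embedding_dim=2, max_norm=1.0, _weight=weight)
embedding(input=torch.tensor([0, 1]))
# tensor([[0.6000, 0.8000],
#         [0.6000, 0.8000]], grad_fn=<EmbeddingBackward0>)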

Embedding.from_pretrained() can get the 1D or more D tensor of zero or more elements, computed by embedding lookup, from the 0D or more D tensor of one or more elements (indices), with the 2D tensor of zero or more elements (weights), as shown below:

*Memos:

  • The 1st argument for initialization is embeddings(Required-Type:tensor of int, float, complex or bool): *Memos:
    • It's weight.
    • It must be the 2D tensor of zero or more elements.
  • The 2nd argument for initialization is freeze(Optional-Default:True-Type:bool). *If it's True, requires_grad is False; if it's False, requires_grad is True.
  • The 3rd argument for initialization is padding_idx(Optional-Default:None-Type:int).
  • The 4th argument for initialization is max_norm(Optional-Default:None-Type:float). *It must be None if embeddings is an empty tensor and norm_type is negative.
  • The 5th argument for initialization is norm_type(Optional-Default:2.0-Type:float).
  • The 6th argument for initialization is scale_grad_by_freq(Optional-Default:False-Type:bool).
  • The 7th argument for initialization is sparse(Optional-Default:False-Type:bool).
  • The 1st argument is input(Required-Type:tensor of int): *Memos:
    • It's indices.
    • Indices must be less than the size of embeddings' 1st dimension.
    • It must be the 0D or more D tensor of one or more elements.
import torch
from torch import nn

weight = torch.tensor([[4., 9., 1.],
                       [-2., 0., 3.],
                       [0., 5., 7.],
                       [8., -6., 0.],
                       [1., 3., -9.],
                       [-3., 1., 2.],
                       [-5., 7., -4.]])
tensor1 = torch.tensor([6, 0, 2, 5]) # Indices

tensor1.requires_grad
# False

embedding = nn.Embedding.from_pretrained(embeddings=weight)
tensor2 = embedding(input=tensor1)
tensor2
# tensor([[-5.,  7., -4.],
#         [ 4.,  9.,  1.],
#         [ 0.,  5.,  7.],
#         [-3.,  1.,  2.]])

tensor2.requires_grad
# False

embedding
# Embedding(7, 3)

embedding.num_embeddings
# 7

embedding.embedding_dim
# 3

embedding.padding_idx
# None

embedding.max_norm
# None

embedding.norm_type
# 2.0

embedding.scale_grad_by_freq
# False

embedding.sparse
# False

embedding.weight
# Parameter containing:
# tensor([[ 4.,  9.,  1.],
#         [-2.,  0.,  3.],
#         [ 0.,  5.,  7.],
#         [ 8., -6.,  0.],
#         [ 1.,  3., -9.],
#         [-3.,  1.,  2.],
#         [-5.,  7., -4.]])

embedding = nn.Embedding.from_pretrained(embeddings=weight, freeze=True,
                                         padding_idx=None, max_norm=None,
                                         norm_type=2.0, scale_grad_by_freq=False,
                                         sparse=False)
embedding(input=tensor1)
# tensor([[-5.,  7., -4.],
#         [ 4.,  9.,  1.],
#         [ 0.,  5.,  7.],
#         [-3.,  1.,  2.]])

my_tensor = torch.tensor([[6, 0], # Indices
                          [2, 5]])
embedding = nn.Embedding.from_pretrained(embeddings=weight)
embedding(input=my_tensor)
# tensor([[[-5.,  7., -4.],
#          [ 4.,  9.,  1.]],
#         [[ 0.,  5.,  7.],
#          [-3.,  1.,  2.]]])

my_tensor = torch.tensor([[[6], [0]], # Indices
                          [[2], [5]]])
embedding = nn.Embedding.from_pretrained(embeddings=weight)
embedding(input=my_tensor)
# tensor([[[[-5.,  7., -4.]],
#          [[ 4.,  9.,  1.]]],
#         [[[ 0.,  5.,  7.]],
#          [[-3.,  1.,  2.]]]])

weight = torch.tensor([[4, 9, 1],
                       [-2, 0, 3],
                       [0, 5, 7],
                       [8, -6, 0],
                       [1, 3, -9],
                       [-3, 1, 2],
                       [-5, 7, -4]])
embedding = nn.Embedding.from_pretrained(embeddings=weight)
embedding(input=my_tensor)
# tensor([[[[-5,  7, -4]],
#          [[ 4,  9,  1]]],
#         [[[ 0,  5,  7]],
#          [[-3,  1,  2]]]])

weight = torch.tensor([[4.+0.j, 9.+0.j, 1.+0.j],
                       [-2.+0.j, 0.+0.j, 3.+0.j],
                       [0.+0.j, 5.+0.j, 7.+0.j],
                       [8.+0.j, -6.+0.j, 0.+0.j],
                       [1.+0.j, 3.+0.j, -9.+0.j],
                       [-3.+0.j, 1.+0.j, 2.+0.j],
                       [-5.+0.j, 7.+0.j, -4.+0.j]])
embedding = nn.Embedding.from_pretrained(embeddings=weight)
embedding(input=my_tensor)
# tensor([[[[-5.+0.j,  7.+0.j, -4.+0.j]],
#          [[ 4.+0.j,  9.+0.j,  1.+0.j]]],
#         [[[ 0.+0.j,  5.+0.j,  7.+0.j]],
#          [[-3.+0.j,  1.+0.j,  2.+0.j]]]])

weight = torch.tensor([[True, False, True],
                       [False, True, False],
                       [True, False, True],
                       [False, True, False],
                       [True, False, True],
                       [False, True, False],
                       [True, False, True]])
embedding = nn.Embedding.from_pretrained(embeddings=weight)
embedding(input=my_tensor)
# tensor([[[[ True, False,  True]],
#          [[ True, False,  True]]],
#         [[[ True, False,  True]],
#          [[False,  True, False]]]])
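The examples above keep the default freeze=True. As a minimal sketch (the weight below is my own choice), passing freeze=False makes the returned Embedding trainable, so its output carries a grad_fn:

import torch
from torch import nn

weight = torch.tensor([[4., 9., 1.],
                       [-2., 0., 3.]])

# freeze=True (default): the weight is not trainable.
embedding = nn.Embedding.from_pretrained(embeddings=weight)
embedding.weight.requires_grad
# False

# freeze=False: the weight is trainable, so the output tracks gradients.
embedding = nn.Embedding.from_pretrained(embeddings=weight, freeze=False)
embedding.weight.requires_grad
# True

embedding(input=torch.tensor([1]))
# tensor([[-2., 0., 3.]], grad_fn=<EmbeddingBackward0>)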
