Closed
Description
🐛 Describe the bug
    import torch
    from torchvision.transforms.functional import _get_perspective_coeffs

    A = torch.tensor(((1, 1), (0, 1)), dtype=torch.float32)
    offsets = torch.tensor([[1, 1], [-1, -1], [-1, 1], [1, -1]], dtype=torch.float32)
    start = 512 + offsets
    end = 512 + (A @ offsets.mT).mT
    _get_perspective_coeffs(start, end)

Outputs something like:
tensor([ 9.6265e-01, -9.9404e-01, 5.1198e+02, -5.9690e-03, 9.4310e-01,
1.6095e+01, -1.1649e-05, -4.9741e-05])
This is wrong; the correct answer is:
tensor([ 1, -1, 512, 0, 1, 0, 0, 0])
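To make the discrepancy concrete, here is a small dependency-free sketch that applies both coefficient sets with the usual perspective formula, (x, y) -> ((ax + by + c) / (gx + hy + 1), (dx + ey + f) / (gx + hy + 1)), mapping end points back to start points. The helper names `apply_coeffs` and `control_point_error` are made up for this illustration and are not part of torchvision. It checks the four control points and also the image origin, where the denominator is 1 and the mapped point is simply (c, f).

```python
def apply_coeffs(coeffs, x, y):
    """Map an output pixel (x, y) to input coordinates with 8 perspective coefficients."""
    a, b, c, d, e, f, g, h = coeffs
    denom = g * x + h * y + 1.0
    return ((a * x + b * y + c) / denom, (d * x + e * y + f) / denom)


# Control points from the report, written out as plain Python numbers.
offsets = [(1, 1), (-1, -1), (-1, 1), (1, -1)]
start = [(512 + ox, 512 + oy) for ox, oy in offsets]
# end = 512 + A @ offset with A = [[1, 1], [0, 1]]
end = [(512 + ox + oy, 512 + oy) for ox, oy in offsets]

expected = [1, -1, 512, 0, 1, 0, 0, 0]
reported = [9.6265e-01, -9.9404e-01, 5.1198e+02, -5.9690e-03,
            9.4310e-01, 1.6095e+01, -1.1649e-05, -4.9741e-05]


def control_point_error(coeffs):
    """Largest coordinate error when mapping each end point back to its start point."""
    worst = 0.0
    for (ex, ey), (sx, sy) in zip(end, start):
        px, py = apply_coeffs(coeffs, ex, ey)
        worst = max(worst, abs(px - sx), abs(py - sy))
    return worst


for name, coeffs in (("expected", expected), ("reported", reported)):
    print(name,
          "| max control-point error:", control_point_error(coeffs),
          "| origin maps to:", apply_coeffs(coeffs, 0, 0))
```

At the origin the expected coefficients give (512, 0), while the reported ones give (511.98, 16.095), an offset of about 16 pixels in y that is consistent with artifacts appearing away from the control points.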
Downstream issues, e.g. when using perspective:
    from torchvision.transforms.functional import perspective
    import torch
    from PIL import Image

    H, W = 1024, 1024
    img = torch.ones(3, H, W)
    A = torch.tensor(((1, 1), (0, 1)), dtype=torch.float32)
    center = torch.tensor((H // 2, W // 2), dtype=torch.float32)[None]
    offsets = torch.tensor([[1, 1], [-1, -1], [-1, 1], [1, -1]], dtype=torch.float32)
    start = center + offsets
    end = center + (A @ offsets.mT).mT
    img_p = perspective(img, start, end, fill=(1, 0, 0))
    Image.fromarray((255 * img_p.permute(1, 2, 0)).to(torch.uint8).numpy()).save("img.jpg")
    # artifacts can be seen here

Issue comes from this line:
vision/torchvision/transforms/functional.py, line 686 in 806dba6:

    res = torch.linalg.lstsq(a_matrix, b_matrix, driver="gels").solution
Replacing the float32 precision there with float64 should solve most problems. Note that this function should probably be rewritten; for example, it currently raises no error when given 3 points instead of 4.
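As a rough illustration of both points (double precision and input validation), here is a plain-Python reference solver. It is only a sketch, not torchvision's implementation: the function name `perspective_coeffs_reference` is hypothetical, the system layout (two rows per point pair, end points as inputs and start points as targets) is assumed to mirror what `_get_perspective_coeffs` builds, and Gaussian elimination with partial pivoting is swapped in for `torch.linalg.lstsq`. Python floats are float64, so no dtype handling is needed.

```python
def perspective_coeffs_reference(startpoints, endpoints):
    """Solve for the 8 perspective coefficients in double precision (illustrative only)."""
    if len(startpoints) != 4 or len(endpoints) != 4:
        raise ValueError("exactly 4 start points and 4 end points are required")

    # Augmented 8x9 system [A | b]: two rows per (end point, start point) pair.
    rows = []
    for (x, y), (u, v) in zip(endpoints, startpoints):
        x, y, u, v = float(x), float(y), float(u), float(v)
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y, u])
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y, v])

    n = 8
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        if rows[pivot][col] == 0.0:
            raise ValueError("degenerate point configuration")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]

    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        partial = sum(rows[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (rows[i][n] - partial) / rows[i][i]
    return coeffs


# Same control points as in the report.
offsets = [(1, 1), (-1, -1), (-1, 1), (1, -1)]
start = [(512 + ox, 512 + oy) for ox, oy in offsets]
end = [(512 + ox + oy, 512 + oy) for ox, oy in offsets]

print(perspective_coeffs_reference(start, end))
```

Passing three points to this sketch raises a `ValueError` instead of silently returning coefficients.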
Versions
torchvision 0.15.2
georg-bn