Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed
Yifan Wang*, Xingyi He*, Sida Peng, Dongli Tan, Xiaowei Zhou
CVPR 2024 Highlight

[Real-time matching demo video: realtime_demo.mp4]

🌟News🌟

[2025-02] To enhance multi-modality matching with EfficientLoFTR and improve its applicability to UAV localization, autonomous driving, and beyond, check out our latest work, MatchAnything! Try our demo and see it in action!

[2025-07] EfficientLoFTR is now part of 🤗 Hugging Face Transformers (credit to sbucaille!). You can run inference with a few lines of code using pip install transformers. [model card]

TODO List

  • Inference code and pretrained models
  • Code for reproducing the test-set results
  • Add a flash-attention option for better performance
  • Jupyter notebook demo for matching a pair of images
  • Training code

Installation

conda env create -f environment.yaml
conda activate eloftr
pip install torch==2.0.0+cu118 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt

The test and training datasets can be downloaded via the download link provided by LoFTR.

We provide our pretrained model at this download link.

Match image pairs with EfficientLoFTR

[Basic Usage]
import torch
import cv2
import numpy as np
from copy import deepcopy

from src.loftr import LoFTR, full_default_cfg, reparameter

# Initialize the matcher with default settings
_default_cfg = deepcopy(full_default_cfg)
matcher = LoFTR(config=_default_cfg)

# Load pretrained weights
matcher.load_state_dict(torch.load("weights/eloftr_outdoor.ckpt")['state_dict'])
matcher = reparameter(matcher)  # Essential for good performance
matcher = matcher.eval().cuda()

# Load and preprocess images
img0_raw = cv2.imread("path/to/image0.jpg", cv2.IMREAD_GRAYSCALE)
img1_raw = cv2.imread("path/to/image1.jpg", cv2.IMREAD_GRAYSCALE)

# Resize images to be divisible by 32
img0_raw = cv2.resize(img0_raw, (img0_raw.shape[1]//32*32, img0_raw.shape[0]//32*32))
img1_raw = cv2.resize(img1_raw, (img1_raw.shape[1]//32*32, img1_raw.shape[0]//32*32))

# Convert to tensors
img0 = torch.from_numpy(img0_raw)[None][None].cuda() / 255.
img1 = torch.from_numpy(img1_raw)[None][None].cuda() / 255.
batch = {'image0': img0, 'image1': img1}

# Inference
with torch.no_grad():
    matcher(batch)
    mkpts0 = batch['mkpts0_f'].cpu().numpy()  # Matched keypoints in image0
    mkpts1 = batch['mkpts1_f'].cpu().numpy()  # Matched keypoints in image1
    mconf = batch['mconf'].cpu().numpy()      # Matching confidence scores
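The matched keypoints are ordinary (N, 2) arrays of pixel coordinates, so they can be fed straight into standard geometry estimation. Below is a minimal sketch, not part of the repository, that filters the matches with OpenCV's RANSAC-based fundamental matrix estimation; the threshold and confidence values are illustrative assumptions.

import cv2

# Minimal post-filtering sketch (not from the repo): keep only the RANSAC inliers
# of a fundamental matrix estimated from the EfficientLoFTR matches above.
if len(mkpts0) >= 8:
    F, inlier_mask = cv2.findFundamentalMat(
        mkpts0, mkpts1, cv2.FM_RANSAC,
        ransacReprojThreshold=1.0,  # pixel threshold, an illustrative value
        confidence=0.999,
    )
    if F is not None:
        inlier_mask = inlier_mask.ravel().astype(bool)
        mkpts0_in, mkpts1_in = mkpts0[inlier_mask], mkpts1[inlier_mask]
        print(f"RANSAC kept {inlier_mask.sum()} of {len(mkpts0)} matches")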
[Advanced Usage] (with Jupyter notebook)
import torch
import cv2
import numpy as np
import matplotlib.cm as cm
from copy import deepcopy

from src.loftr import LoFTR, full_default_cfg, opt_default_cfg, reparameter
from src.utils.plotting import make_matching_figure

# Model configuration options
model_type = 'full'  # Choose: 'full' for best quality, 'opt' for best efficiency
precision = 'fp32'   # Choose: 'fp32', 'mp' (mixed precision), 'fp16' for best efficiency

# Load appropriate config
if model_type == 'full':
    _default_cfg = deepcopy(full_default_cfg)
elif model_type == 'opt':
    _default_cfg = deepcopy(opt_default_cfg)

# Set precision options
if precision == 'mp':
    _default_cfg['mp'] = True
elif precision == 'fp16':
    _default_cfg['half'] = True

# Initialize matcher
matcher = LoFTR(config=_default_cfg)
matcher.load_state_dict(torch.load("weights/eloftr_outdoor.ckpt")['state_dict'])
matcher = reparameter(matcher)

# Apply precision settings
if precision == 'fp16':
    matcher = matcher.half()
matcher = matcher.eval().cuda()

# Load and preprocess images
img0_raw = cv2.imread("path/to/image0.jpg", cv2.IMREAD_GRAYSCALE)
img1_raw = cv2.imread("path/to/image1.jpg", cv2.IMREAD_GRAYSCALE)
img0_raw = cv2.resize(img0_raw, (img0_raw.shape[1]//32*32, img0_raw.shape[0]//32*32))
img1_raw = cv2.resize(img1_raw, (img1_raw.shape[1]//32*32, img1_raw.shape[0]//32*32))

# Convert to tensors with appropriate precision
if precision == 'fp16':
    img0 = torch.from_numpy(img0_raw)[None][None].half().cuda() / 255.
    img1 = torch.from_numpy(img1_raw)[None][None].half().cuda() / 255.
else:
    img0 = torch.from_numpy(img0_raw)[None][None].cuda() / 255.
    img1 = torch.from_numpy(img1_raw)[None][None].cuda() / 255.
batch = {'image0': img0, 'image1': img1}

# Inference with different precision modes
with torch.no_grad():
    if precision == 'mp':
        with torch.autocast(enabled=True, device_type='cuda'):
            matcher(batch)
    else:
        matcher(batch)
    mkpts0 = batch['mkpts0_f'].cpu().numpy()
    mkpts1 = batch['mkpts1_f'].cpu().numpy()
    mconf = batch['mconf'].cpu().numpy()

# Post-process confidence scores for the 'opt' model
if model_type == 'opt':
    mconf = (mconf - min(20.0, mconf.min())) / (max(30.0, mconf.max()) - min(20.0, mconf.min()))

# Visualize matches
color = cm.jet(mconf)
text = ['EfficientLoFTR', 'Matches: {}'.format(len(mkpts0))]
fig = make_matching_figure(img0_raw, img1_raw, mkpts0, mkpts1, color, text=text)

Configuration Options:

  • model_type:
    • 'full': Best matching quality
    • 'opt': Best efficiency with minimal quality loss (see the timing sketch after this list)
  • precision:
    • 'fp32': Full precision (default)
    • 'mp': Mixed precision for better efficiency
    • 'fp16': Half precision for maximum efficiency (requires modern GPU)
  • Note: Our model is trained on MegaDepth and works best for outdoor scenes. There may be a domain gap for indoor environments.
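To compare the speed of the 'full' and 'opt' configurations on your own hardware, here is a minimal timing sketch, not part of the repository (the reproduce_test scripts below remain the reference measurement). It assumes `matcher`, `img0`, and `img1` have been set up as in the advanced usage snippet above.

import time
import torch

# Rough wall-clock timing: warm up first, then average over repeated forward passes.
# Assumes `matcher`, `img0`, `img1` are set up as in the advanced usage above.
n_warmup, n_runs = 5, 20
with torch.no_grad():
    for _ in range(n_warmup):
        matcher({'image0': img0, 'image1': img1})
    torch.cuda.synchronize()
    t0 = time.time()
    for _ in range(n_runs):
        matcher({'image0': img0, 'image1': img1})
    torch.cuda.synchronize()
print(f"Average matching time: {(time.time() - t0) / n_runs * 1000:.1f} ms per pair")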
[Using Transformers]

Note: The default AutoImageProcessor resizes images to a resolution of 480x640 pixels. If you need high-resolution matching, modify the default processor config (a sketch follows the post-processing example below) or refer to the basic/advanced usage above.
from transformers import AutoImageProcessor, AutoModel
import torch
from PIL import Image
import requests

# Load example images (same as in the original paper)
url_image1 = "https://raw.githubusercontent.com/magicleap/SuperGluePretrainedNetwork/refs/heads/master/assets/phototourism_sample_images/united_states_capitol_98169888_3347710852.jpg"
image1 = Image.open(requests.get(url_image1, stream=True).raw)
url_image2 = "https://raw.githubusercontent.com/magicleap/SuperGluePretrainedNetwork/refs/heads/master/assets/phototourism_sample_images/united_states_capitol_26757027_6717084061.jpg"
image2 = Image.open(requests.get(url_image2, stream=True).raw)
images = [image1, image2]

# Load processor and model
processor = AutoImageProcessor.from_pretrained("zju-community/efficientloftr")
model = AutoModel.from_pretrained("zju-community/efficientloftr")

# Process images and run inference
inputs = processor(images, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# keypoints = outputs.keypoints              # Keypoints in both images
# matches = outputs.matches                  # Matching indices
# matching_scores = outputs.matching_scores  # Confidence scores

Post-process and visualize results:

# Post-process to get keypoints and matches in a readable format
image_sizes = [[(image.height, image.width) for image in images]]
outputs = processor.post_process_keypoint_matching(outputs, image_sizes, threshold=0.2)

# Print matching results
for i, output in enumerate(outputs):
    print(f"Image pair {i}:")
    print(f"Found {len(output['keypoints0'])} matches")
    for keypoint0, keypoint1, matching_score in zip(
        output["keypoints0"], output["keypoints1"], output["matching_scores"]
    ):
        print(
            f"Keypoint {keypoint0.numpy()} -> {keypoint1.numpy()} (score: {matching_score:.3f})"
        )

# Visualize matches
processor.visualize_keypoint_matching(images, outputs)

For more details, visit the Hugging Face model card.
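As mentioned in the note above, the default processor downsizes inputs to 480x640. The sketch below shows one way to request a higher matching resolution; it assumes the EfficientLoFTR image processor accepts the standard Transformers `size` override (the keys and the example resolution are assumptions; check the model card for the exact config), and it reuses the `images` list from the snippet above.

from transformers import AutoImageProcessor, AutoModel
import torch

# Sketch: override the processor's default 480x640 resize for higher-resolution matching.
# Assumes the standard `size` override of Transformers image processors applies here;
# the 1152x1536 value is only an example (kept as multiples of 32).
processor = AutoImageProcessor.from_pretrained(
    "zju-community/efficientloftr",
    size={"height": 1152, "width": 1536},
)
model = AutoModel.from_pretrained("zju-community/efficientloftr")

inputs = processor(images, return_tensors="pt")  # `images` as defined in the snippet above
with torch.no_grad():
    outputs = model(**inputs)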

Reproduce the testing results

First set up the testing subsets of ScanNet and MegaDepth by creating symlinks from the previously downloaded datasets to data/{dataset}/test.

# Set up symlinks
ln -s /path/to/scannet-1500-testset/* /path/to/EfficientLoFTR/data/scannet/test
ln -s /path/to/megadepth-1500-testset/* /path/to/EfficientLoFTR/data/megadepth/test

Inference time

conda activate eloftr
bash scripts/reproduce_test/indoor_full_time.sh
bash scripts/reproduce_test/indoor_opt_time.sh

Accuracy

conda activate eloftr
bash scripts/reproduce_test/outdoor_full_auc.sh
bash scripts/reproduce_test/outdoor_opt_auc.sh
bash scripts/reproduce_test/indoor_full_auc.sh
bash scripts/reproduce_test/indoor_opt_auc.sh

Training

conda env create -f environment_training.yaml  # the training environment uses a different PyTorch version and may differ slightly from the inference environment
pip install -r requirements.txt
conda activate eloftr_training
bash scripts/reproduce_train/eloftr_outdoor.sh eloftr_outdoor

Citation

If you find this code useful for your research, please use the following BibTeX entry.

@inproceedings{wang2024eloftr,
  title={{Efficient LoFTR}: Semi-Dense Local Feature Matching with Sparse-Like Speed},
  author={Wang, Yifan and He, Xingyi and Peng, Sida and Tan, Dongli and Zhou, Xiaowei},
  booktitle={CVPR},
  year={2024}
}

