Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed
Yifan Wang*, Xingyi He*, Sida Peng, Dongli Tan, Xiaowei Zhou
CVPR 2024 Highlight

[Real-time matching demo video: realtime_demo.mp4]

🌟News🌟

[2025-02] To enhance multi-modality matching with EfficientLoFTR and improve its applicability to UAV localization, autonomous driving, and beyond, check out our latest work, MatchAnything! Try our demo and see it in action!

[2025-07] EfficientLoFTR is now part of 🤗 Hugging Face Transformers (credit to sbucaille!). You can run inference with a few lines of code using pip install transformers. [model card]

TODO List

  • Inference code and pretrained models
  • Code for reproducing the test-set results
  • Add a flash-attention option for better performance
  • Jupyter notebook demo for matching a pair of images
  • Training code

Installation

conda env create -f environment.yaml
conda activate eloftr
pip install torch==2.0.0+cu118 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt

The test and training datasets can be downloaded via the download link provided by LoFTR.

We provide our pretrained model at this download link.

Match image pairs with EfficientLoFTR

[Basic Usage]
import torch
import cv2
import numpy as np
from copy import deepcopy

from src.loftr import LoFTR, full_default_cfg, reparameter

# Initialize the matcher with default settings
_default_cfg = deepcopy(full_default_cfg)
matcher = LoFTR(config=_default_cfg)

# Load pretrained weights
matcher.load_state_dict(torch.load("weights/eloftr_outdoor.ckpt")['state_dict'])
matcher = reparameter(matcher)  # Essential for good performance
matcher = matcher.eval().cuda()

# Load and preprocess images
img0_raw = cv2.imread("path/to/image0.jpg", cv2.IMREAD_GRAYSCALE)
img1_raw = cv2.imread("path/to/image1.jpg", cv2.IMREAD_GRAYSCALE)

# Resize images to be divisible by 32
img0_raw = cv2.resize(img0_raw, (img0_raw.shape[1]//32*32, img0_raw.shape[0]//32*32))
img1_raw = cv2.resize(img1_raw, (img1_raw.shape[1]//32*32, img1_raw.shape[0]//32*32))

# Convert to tensors
img0 = torch.from_numpy(img0_raw)[None][None].cuda() / 255.
img1 = torch.from_numpy(img1_raw)[None][None].cuda() / 255.
batch = {'image0': img0, 'image1': img1}

# Inference
with torch.no_grad():
    matcher(batch)
    mkpts0 = batch['mkpts0_f'].cpu().numpy()  # Matched keypoints in image0
    mkpts1 = batch['mkpts1_f'].cpu().numpy()  # Matched keypoints in image1
    mconf = batch['mconf'].cpu().numpy()      # Matching confidence scores
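The matched keypoints are ordinary (N, 2) arrays of pixel coordinates, so they can be fed straight into standard geometry estimation. Below is a minimal sketch, not part of the repository, that filters the matches with OpenCV's RANSAC-based fundamental matrix estimation; the threshold and confidence values are illustrative assumptions.

import cv2

# Minimal post-filtering sketch (not from the repo): keep only the RANSAC inliers
# of a fundamental matrix estimated from the EfficientLoFTR matches above.
if len(mkpts0) >= 8:
    F, inlier_mask = cv2.findFundamentalMat(
        mkpts0, mkpts1, cv2.FM_RANSAC,
        ransacReprojThreshold=1.0,  # pixel threshold, an illustrative value
        confidence=0.999,
    )
    if F is not None:
        inlier_mask = inlier_mask.ravel().astype(bool)
        mkpts0_in, mkpts1_in = mkpts0[inlier_mask], mkpts1[inlier_mask]
        print(f"RANSAC kept {inlier_mask.sum()} of {len(mkpts0)} matches")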
[Advanced Usage] (with Jupyter notebook)
import torch
import cv2
import numpy as np
import matplotlib.cm as cm
from copy import deepcopy

from src.loftr import LoFTR, full_default_cfg, opt_default_cfg, reparameter
from src.utils.plotting import make_matching_figure

# Model configuration options
model_type = 'full'  # Choose: 'full' for best quality, 'opt' for best efficiency
precision = 'fp32'   # Choose: 'fp32', 'mp' (mixed precision), 'fp16' for best efficiency

# Load appropriate config
if model_type == 'full':
    _default_cfg = deepcopy(full_default_cfg)
elif model_type == 'opt':
    _default_cfg = deepcopy(opt_default_cfg)

# Set precision options
if precision == 'mp':
    _default_cfg['mp'] = True
elif precision == 'fp16':
    _default_cfg['half'] = True

# Initialize matcher
matcher = LoFTR(config=_default_cfg)
matcher.load_state_dict(torch.load("weights/eloftr_outdoor.ckpt")['state_dict'])
matcher = reparameter(matcher)

# Apply precision settings
if precision == 'fp16':
    matcher = matcher.half()
matcher = matcher.eval().cuda()

# Load and preprocess images
img0_raw = cv2.imread("path/to/image0.jpg", cv2.IMREAD_GRAYSCALE)
img1_raw = cv2.imread("path/to/image1.jpg", cv2.IMREAD_GRAYSCALE)
img0_raw = cv2.resize(img0_raw, (img0_raw.shape[1]//32*32, img0_raw.shape[0]//32*32))
img1_raw = cv2.resize(img1_raw, (img1_raw.shape[1]//32*32, img1_raw.shape[0]//32*32))

# Convert to tensors with appropriate precision
if precision == 'fp16':
    img0 = torch.from_numpy(img0_raw)[None][None].half().cuda() / 255.
    img1 = torch.from_numpy(img1_raw)[None][None].half().cuda() / 255.
else:
    img0 = torch.from_numpy(img0_raw)[None][None].cuda() / 255.
    img1 = torch.from_numpy(img1_raw)[None][None].cuda() / 255.
batch = {'image0': img0, 'image1': img1}

# Inference with different precision modes
with torch.no_grad():
    if precision == 'mp':
        with torch.autocast(enabled=True, device_type='cuda'):
            matcher(batch)
    else:
        matcher(batch)
    mkpts0 = batch['mkpts0_f'].cpu().numpy()
    mkpts1 = batch['mkpts1_f'].cpu().numpy()
    mconf = batch['mconf'].cpu().numpy()

# Post-process confidence scores for the 'opt' model
if model_type == 'opt':
    mconf = (mconf - min(20.0, mconf.min())) / (max(30.0, mconf.max()) - min(20.0, mconf.min()))

# Visualize matches
color = cm.jet(mconf)
text = ['EfficientLoFTR', 'Matches: {}'.format(len(mkpts0))]
fig = make_matching_figure(img0_raw, img1_raw, mkpts0, mkpts1, color, text=text)

Configuration Options:

  • model_type:
    • 'full': Best matching quality
    • 'opt': Best efficiency with minimal quality loss (see the timing sketch after this list)
  • precision:
    • 'fp32': Full precision (default)
    • 'mp': Mixed precision for better efficiency
    • 'fp16': Half precision for maximum efficiency (requires modern GPU)
  • Note: Our model is trained on MegaDepth and works best for outdoor scenes. There may be a domain gap for indoor environments.
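To compare the speed of the 'full' and 'opt' configurations on your own hardware, here is a minimal timing sketch, not part of the repository (the reproduce_test scripts below remain the reference measurement). It assumes `matcher`, `img0`, and `img1` have been set up as in the advanced usage snippet above.

import time
import torch

# Rough wall-clock timing: warm up first, then average over repeated forward passes.
# Assumes `matcher`, `img0`, `img1` are set up as in the advanced usage above.
n_warmup, n_runs = 5, 20
with torch.no_grad():
    for _ in range(n_warmup):
        matcher({'image0': img0, 'image1': img1})
    torch.cuda.synchronize()
    t0 = time.time()
    for _ in range(n_runs):
        matcher({'image0': img0, 'image1': img1})
    torch.cuda.synchronize()
print(f"Average matching time: {(time.time() - t0) / n_runs * 1000:.1f} ms per pair")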
[Using Transformers]

Note: The default AutoImageProcessor resizes images to a resolution of 480x640 pixels. If you need high-resolution matching, modify the default processor config (a sketch follows the post-processing example below) or refer to the basic/advanced usage above.
from transformers import AutoImageProcessor, AutoModel
import torch
from PIL import Image
import requests

# Load example images (same as in the original paper)
url_image1 = "https://raw.githubusercontent.com/magicleap/SuperGluePretrainedNetwork/refs/heads/master/assets/phototourism_sample_images/united_states_capitol_98169888_3347710852.jpg"
image1 = Image.open(requests.get(url_image1, stream=True).raw)
url_image2 = "https://raw.githubusercontent.com/magicleap/SuperGluePretrainedNetwork/refs/heads/master/assets/phototourism_sample_images/united_states_capitol_26757027_6717084061.jpg"
image2 = Image.open(requests.get(url_image2, stream=True).raw)
images = [image1, image2]

# Load processor and model
processor = AutoImageProcessor.from_pretrained("zju-community/efficientloftr")
model = AutoModel.from_pretrained("zju-community/efficientloftr")

# Process images and run inference
inputs = processor(images, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# keypoints = outputs.keypoints              # Keypoints in both images
# matches = outputs.matches                  # Matching indices
# matching_scores = outputs.matching_scores  # Confidence scores

Post-process and visualize results:

# Post-process to get keypoints and matches in a readable format
image_sizes = [[(image.height, image.width) for image in images]]
outputs = processor.post_process_keypoint_matching(outputs, image_sizes, threshold=0.2)

# Print matching results
for i, output in enumerate(outputs):
    print(f"Image pair {i}:")
    print(f"Found {len(output['keypoints0'])} matches")
    for keypoint0, keypoint1, matching_score in zip(
        output["keypoints0"], output["keypoints1"], output["matching_scores"]
    ):
        print(
            f"Keypoint {keypoint0.numpy()} -> {keypoint1.numpy()} (score: {matching_score:.3f})"
        )

# Visualize matches
processor.visualize_keypoint_matching(images, outputs)

For more details, visit the Hugging Face model card.
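As mentioned in the note above, the default processor downsizes inputs to 480x640. The sketch below shows one way to request a higher matching resolution; it assumes the EfficientLoFTR image processor accepts the standard Transformers `size` override (the keys and the example resolution are assumptions; check the model card for the exact config), and it reuses the `images` list from the snippet above.

from transformers import AutoImageProcessor, AutoModel
import torch

# Sketch: override the processor's default 480x640 resize for higher-resolution matching.
# Assumes the standard `size` override of Transformers image processors applies here;
# the 1152x1536 value is only an example (kept as multiples of 32).
processor = AutoImageProcessor.from_pretrained(
    "zju-community/efficientloftr",
    size={"height": 1152, "width": 1536},
)
model = AutoModel.from_pretrained("zju-community/efficientloftr")

inputs = processor(images, return_tensors="pt")  # `images` as defined in the snippet above
with torch.no_grad():
    outputs = model(**inputs)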

Reproduce the testing results

First set up the testing subsets of ScanNet and MegaDepth by creating symlinks from the previously downloaded datasets to data/{dataset}/test.

# Set up symlinks
ln -s /path/to/scannet-1500-testset/* /path/to/EfficientLoFTR/data/scannet/test
ln -s /path/to/megadepth-1500-testset/* /path/to/EfficientLoFTR/data/megadepth/test

Inference time

conda activate eloftr
bash scripts/reproduce_test/indoor_full_time.sh
bash scripts/reproduce_test/indoor_opt_time.sh

Accuracy

conda activate eloftr
bash scripts/reproduce_test/outdoor_full_auc.sh
bash scripts/reproduce_test/outdoor_opt_auc.sh
bash scripts/reproduce_test/indoor_full_auc.sh
bash scripts/reproduce_test/indoor_opt_auc.sh

Training

conda env create -f environment_training.yaml  # the training environment uses a different PyTorch version and may differ slightly from the inference environment
pip install -r requirements.txt
conda activate eloftr_training
bash scripts/reproduce_train/eloftr_outdoor.sh eloftr_outdoor

Citation

If you find this code useful for your research, please use the following BibTeX entry.

@inproceedings{wang2024eloftr,
  title={{Efficient LoFTR}: Semi-Dense Local Feature Matching with Sparse-Like Speed},
  author={Wang, Yifan and He, Xingyi and Peng, Sida and Tan, Dongli and Zhou, Xiaowei},
  booktitle={CVPR},
  year={2024}
}

