CS498
Lab 5

This is an OLD version of Lab 5 for the Winter 2014-2015 offering of CS498 Computer Vision. The new one is available here.

Deliverables

  • A zip file of your source code for this lab: a zipped folder with the m-files directly inside it. Please do not submit deeply-nested source files.
  • findAndExtractDescriptors.m
  • findMatches.m
  • findInliers.m
  • ransac.m
  • sift.m - your top-level script.
  • The first and second images with the inlier points plotted on them.
  • A composite image created from at least two images combined together.
  • Answers to these questions
    • Why are at least four points required to compute a homography?

Overview

In this lab, we get to put those features from last week to work! To do this we will...

  • Compute the interest points and features for two images
  • For each feature point in the first image, find the closest and 2nd closest feature in the second image
  • If the ratio between closest & 2nd closest passes a test, we will consider the closest feature to be a match.
  • Some of these matches will still be false. To further refine the matches, we will apply RANSAC to all the matches to find a consistent subset
  • Finally, we will transform one image to line up with the other automatically. At the end of this lab, we will have a program that does Lab2 completely automatically!

Finding keypoints and descriptors in two images

To facilitate finding keypoints and descriptors in multiple images, wrap up all of the methods from the previous lab into a single method, following this template:

% results = findAndExtractDescriptors(I)
%
%  Finds SIFT interest points and extracts sift descriptors
%  for each point.
%
%  I - the input image
%
%  results
%     results.points -- the extracted image points
%     results.i -- i index of the point in the original image
%     results.j -- j index of the point in the original image
%     results.sigma -- scale of the feature point in coordinates of
%                      the original image
%     results.descriptor -- 128-element vector containing the SIFT
%                           descriptor

% Author: Phileas Fogg
% Term: Winter 2012-2013
% Course: CS498 - Computer Vision
% Assignment: Lab 5 - SIFT feature matching

function results = findAndExtractDescriptors(I)

Finding matches between two images

Run your extractDescriptors method from the previous lab on two images to get two sets of SIFT descriptors.

To compute the distance between two SIFT feature descriptors, simply use the Euclidean distance:

d(f1,f2) = sqrt( (f1_1 - f2_1)^2 + (f1_2 - f2_2)^2 + ... + (f1_128 - f2_128)^2 )

where f_i stands for feature descriptor i and f_i_j stands for the jth component of the ith descriptor. Of course, you can compute this without any loops in a single line!
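The same vectorized idea can be sketched in Python/NumPy for reference (the descriptor arrays here are made-up random data, not part of the MATLAB lab code):

```python
import numpy as np

# Made-up descriptor sets: M = 5 descriptors from image 1, N = 7 from image 2,
# each a 128-element SIFT vector.
rng = np.random.default_rng(0)
d1 = rng.random((5, 128))
d2 = rng.random((7, 128))

# All M x N pairwise Euclidean distances at once, via broadcasting -- no loops.
dists = np.sqrt(((d1[:, None, :] - d2[None, :, :]) ** 2).sum(axis=2))
```

In MATLAB the same thing is possible with implicit expansion or a single call to pdist2.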

To decide if a point has a match, find the two points closest to it in the other image. If both points are about the same distance from the original point, then it is likely neither one is a match. But if the second point is significantly farther than the first, the first may be a valid match. Suppose f1 is the feature, f2 is the closest feature, and f3 is the second-closest feature. Let d1 = d(f1,f2) be the distance to the nearest neighbor, and d2 = d(f1,f3) be the distance to the second-nearest neighbor. Then accept f2 as a match for f1 if d1/d2 = d(f1,f2) / d(f1,f3) < 0.8.
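The ratio test described above can be sketched as follows (a Python illustration operating on a precomputed distance matrix; the 0.8 threshold comes from the text, the example matrix is made up):

```python
import numpy as np

def ratio_test_matches(dists, threshold=0.8):
    """dists[i, k]: distance from feature i in image 1 to feature k in image 2.
    Returns (i, k) pairs where the nearest neighbor k passes the ratio test."""
    order = np.argsort(dists, axis=1)     # columns sorted by distance, per row
    rows = np.arange(dists.shape[0])
    d1 = dists[rows, order[:, 0]]         # distance to nearest neighbor
    d2 = dists[rows, order[:, 1]]         # distance to second-nearest neighbor
    keep = d1 / d2 < threshold
    return [(int(i), int(order[i, 0])) for i in np.flatnonzero(keep)]

# Row 0: 1.0 vs 10.0 -> ratio 0.1, accepted. Row 1: 5.0 vs 5.5 -> 0.91, rejected.
print(ratio_test_matches(np.array([[1.0, 10.0, 12.0],
                                   [5.0, 5.5, 6.0]])))  # -> [(0, 0)]
```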

Using the template below, write a method to find all the matches in an image.

% function matches = findMatches(results1,results2)
% 
% Find the feature matches between results1 and results2
% 
% matches - an Nx4 array where each row holds the indices of the points in the matches, and N is the number of matches found
%          [i1 j1 i2 j2] 
%             

% Author: Phileas Fogg
% Term: Winter 2012-2013
% Course: CS498 - Computer Vision
% Assignment: Lab 5 - SIFT feature matching 

function matches = findMatches(results1, results2)

You can draw the resulting matches using the function drawMatches. If you would like to display the matches as you find them, you can use drawMatches(results1,results2) to draw the stacked images with the feature points, and drawMatch(match, [size(results1.I,1), 0], 'g-') to draw each match as it is found.

RANSAC

Implement the RANSAC algorithm, as we discussed in class, to find a subset of matches that results in a high number of inliers. You do not need to compute the residuals/reprojection error. The RANSAC (RANdom SAmple Consensus) algorithm is described in Szeliski in Section 6.1.4 (pp. 281-282). Using the notation of that section, I determined the number of runs using P=0.99, p=0.3, and k=4.
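As a sanity check on the run count, Szeliski's formula S = log(1 - P) / log(1 - p^k) can be evaluated directly (a quick sketch, not a lab deliverable):

```python
import math

def ransac_trials(P=0.99, p=0.3, k=4):
    # S = log(1 - P) / log(1 - p^k): the number of random samples needed so
    # that, with probability P, at least one sample of k matches consists
    # entirely of inliers, given an inlier fraction of p.
    return math.ceil(math.log(1 - P) / math.log(1 - p ** k))

print(ransac_trials())  # -> 567
```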

Follow this template:

% [indChosen, inliers] = ransac(matches)
%
% Run the RANSAC algorithm, a robust technique for finding a consistent
% subset of matches among an often-faulty collection.
%
% Inputs
%   matches -- a matrix with [i1 j1 i2 j2] on each row representing a match
% Outputs
%   indChosen -- vector of indexes to the four chosen rows of matches
%   inliers -- boolean vector; true entries correspond to inlier rows
%              These are rows consistent with the transform found from the 
%              four indChosen rows

% Author: Phileas Fogg
% Term: Winter 2012-2013
% Course: CS498 - Computer Vision
% Assignment: Lab 5 - SIFT feature matching

function [indChosen, inliers] = ransac(matches)
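The loop inside ransac can be sketched like this (Python for illustration; fit_model and count_inliers are hypothetical stand-ins for your findTransform and findInliers, and the default trial count assumes the P=0.99, p=0.3, k=4 calculation above):

```python
import random

def ransac(matches, fit_model, count_inliers, trials=567):
    """Generic RANSAC loop. `matches` is a list of correspondences;
    `fit_model` fits a transform to 4 of them; `count_inliers` returns a
    boolean list marking the matches consistent with that transform.
    Both callables are hypothetical stand-ins for findTransform/findInliers."""
    best_chosen, best_inliers = None, []
    for _ in range(trials):
        chosen = random.sample(range(len(matches)), 4)   # 4 random matches
        model = fit_model([matches[i] for i in chosen])
        inliers = count_inliers(matches, model)
        if sum(inliers) > sum(best_inliers):             # keep the best consensus
            best_chosen, best_inliers = chosen, inliers
    return best_chosen, best_inliers
```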

Within RANSAC, create this method to find the inliers. You can use your findTransform method from Lab 2 to find the transform based on the point matches.

% inliers = findInliers(matches, indChosen)
%
% matches -- list of matches in rows of [i1 j1 i2 j2]
% indChosen -- indices for chosen matches -- either integer list or logical array matching
%   height of matches
%
% inliers -- logical list, true for inlier rows of matches

% Author: Phileas Fogg
% Term: Winter 2012-2013
% Course: CS498 - Computer Vision
% Assignment: Lab 5 - SIFT feature matching

function inliers = findInliers(matches, indChosen)
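The inlier test itself amounts to mapping each first-image point through the fitted transform and thresholding its distance to the claimed partner. A sketch, assuming a 3x3 homography H and an illustrative 3-pixel threshold (the lab does not specify a tolerance value):

```python
import numpy as np

def find_inliers(matches, H, tol=3.0):
    """matches: N x 4 array with [i1 j1 i2 j2] rows; H: 3 x 3 homography
    mapping image-1 points to image-2 points. tol is an assumed threshold."""
    pts1 = np.hstack([matches[:, 0:2], np.ones((len(matches), 1))])  # homogeneous
    mapped = pts1 @ H.T
    mapped = mapped[:, :2] / mapped[:, 2:3]      # back to inhomogeneous coords
    err = np.linalg.norm(mapped - matches[:, 2:4], axis=1)
    return err < tol

# With the identity homography, a match is an inlier iff its points coincide.
H = np.eye(3)
m = np.array([[5.0, 5.0, 5.0, 5.0],      # same point  -> inlier
              [5.0, 5.0, 50.0, 50.0]])   # far apart   -> outlier
print(find_inliers(m, H))  # -> [ True False]
```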

Once you have found the inliers with RANSAC, use all the inliers to compute a more accurate correspondence between the images. Then use transformImage to transform the second image back onto the first.

Create an image with the matches, and a stitched image like the ones below, and submit them with your results!


Fun Ideas

  • Test your transform under a variety of conditions... does it work if you turn one image upside-down? Does it work if you scale an image by a factor of 2? How about for a factor of 1.5?
  • Although we have developed this algorithm to work on two images using a homography mapping between them, it can find matches between images with some projective distortion -- as long as there are enough co-planar features in the image. Try running the algorithm on two images taken from slightly different perspectives.
  • Using epipolar geometry, you can find a mapping between any two images that constrains the images to be along a line. You could implement this mapping, and use the RANSAC approach to find an epipolar match between two images. This would allow you to find more corresponding points between any images of 3D scenes, but it might be harder to filter the false-positives, because any two points along the epipolar line would be allowed to match, even if they are false matches. It also may take longer to run RANSAC, because you will need to find more than 4 point correspondences to compute the initial estimate of the epipolar constraint.
  • The book mentions a variety of alternative features -- e.g. circular features that are said to slightly out-perform SIFT. You could also compute these... and test to see which is better.
  • Use adaptive non-maximal suppression (ANMS) to find interest points in lower-contrast regions of the image (e.g. the grass in the images above). See the text, Section 4.1.1, Figure 4.9, p. 191.
  • Can you reproduce a plot something like Figure 11 in Lowe 2004? It won't look as clean if it is based on just matching two images, but does it show the same general trend? (You could consider inliers from your final results to be "true positives" and the rest to be "false positives", or you could apply a known transformation to an image as Lowe (2004) did. Then you know how the points should transform from one image to another, and don't need to use RANSAC.)
  • Solving the over-constrained set of linear equations does not find the best transform. Once you have the transform, refine it using a gradient descent method.
  • Try creating a chain of images, by finding the homography between each image and the next, then mapping all of the images back into the plane of the original image.
  • Now is the time where you can REALLY go to town with your Google street-view approach that you may have developed in Lab 2! Or you could try to convince Google to map our campus! (I think you would need to contact HR first...)
  • Do you have your own fun ideas that you would like to try?

References

  • David Lowe's SIFT paper, available on his SIFT page
  • Richard Szeliski's Computer Vision: Algorithms and Applications, (available on his Book page, and from booksellers everywhere...)
    • Bilinear interpolation when computing histograms, pp. 96-97/110
    • 4.1 Points and Patches (includes high-level description of the algorithm), pp. 192-194/216-219, 197-199
    • RANSAC, Section 6.1.4, pp. 281-282