As suggested by Reviewer 3, we use SIFT to match feature points across images of the same scene captured from different viewpoints.
The images are taken from the office1, campus, and office lobby videos, without further compression. For the images within each dataset, SIFT keypoints and descriptors are first extracted. We then establish feature correspondences between each pair of images by matching their SIFT descriptors. The matching results are as follows.
SIFT matching for Dataset_Office1
SIFT matching for Dataset_campus
SIFT matching for Dataset_office lobby
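
For readers who wish to reproduce the pairwise matching step, the following is a minimal sketch of descriptor matching, not the exact code used to produce the results above. It assumes the descriptors have already been extracted (for example with OpenCV's `cv2.SIFT_create().detectAndCompute`) and are given as lists of numeric vectors. The function name `match_descriptors`, the nearest-neighbour search, and the ratio threshold of 0.75 are illustrative assumptions; the matching criterion is not specified in the text above.

```python
import math


def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Match descriptors of image A to image B (hypothetical helper).

    Uses brute-force nearest-neighbour search with a ratio test:
    a match is kept only if the closest descriptor in B is clearly
    closer than the second closest. The ratio value is an assumption.
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    matches = []
    if len(desc_b) < 2:
        # The ratio test needs at least two candidates in B.
        return matches
    for i, da in enumerate(desc_a):
        # Euclidean distance from da to every descriptor in B, sorted ascending.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        (best, j_best), (second, _) = dists[0], dists[1]
        if best < ratio * second:
            matches.append((i, j_best))
    return matches


def match_all_pairs(descriptors_per_image, ratio=0.75):
    """Run the matcher over every unordered pair of images in one dataset."""
    results = {}
    n = len(descriptors_per_image)
    for a in range(n):
        for b in range(a + 1, n):
            results[(a, b)] = match_descriptors(
                descriptors_per_image[a], descriptors_per_image[b], ratio
            )
    return results
```

Calling `match_all_pairs` with one descriptor list per image of a dataset returns a dictionary keyed by image-index pairs, each holding the retained correspondences. The brute-force search is quadratic in the number of descriptors, so an approximate nearest-neighbour index may be preferable for large images.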