Sunday 5 July 2015

Taking on Imatest


After having worked on MTF Mapper for almost five years now, I have decided that it is time to go head-to-head with Imatest. I downloaded a trial version of Imatest 4.1.12 to face off against MTF Mapper 0.4.18.

For the purpose of this comparison I decided to generate synthetic images using mtf_generate_rectangle. This allows me to use a set of images rendered using an accurately known PSF, meaning that we know exactly what the actual MTF50 value should be for those images. I decided to render a test chart conforming to the SFRPlus format, since that allows me to extract a fair number of edges for each test case. The approximately-sfrplus-chart looks like this:
Figure 1: SFRPlus style chart with an MTF50 value of 0.35 cycles/pixel
 SFRPlus was quite happy to automatically identify and extract regions of interest (ROIs) over all the relevant edges from this image. MTF Mapper can also extract edges from this image automatically. One notable difference is that SFRPlus includes the edges of the squares that overlap with the black bars at the top and bottom of the images, whereas MTF Mapper only considers edges that form part of a complete square. To keep the comparison fair, I discarded the results from the top and bottom rows of squares (as extracted by SFRPlus), leaving us with 19*4 edges per image (SFRPlus ignores the third square in the middle column).

Validating the test images

(This section can be skipped if you trust my methodology)

Although I have posted quite a few posts here on this blog regarding the algorithms used by mtf_generate_rectangle to render synthetic images, I will now show from first principles that the synthetic images truely have the claimed point spread functions (PSFs), and thus known MTFs.

I rendered the synthetic image using a command like this:

mtf_generate_rectangle.exe --b16 --pattern-noise 0.0085 --read-noise 2.5 --adc-gain 0.641 --adc-depth 12 -c 0.33 --target-poly sfrchart.txt -m 0.35 -p gaussian-sampled --airy-samples 100

This particular command renders the SFRPlus chart using a Gaussian PSF with an MTF50 value of 0.35. Reasonably realistic sensor noise is simulated, including photon shot noise, which implies that the noise standard deviation scales as the square root of the signal level; in plain English: we have more noise in bright parts of the image.

I ran a version of  mtf_mapper that dumped the raw samples extracted from the image (normally used to construct the binned ESF); I specified the edge angle as 5 degrees to remove all possible sources of error. NB: the "raw_esf_values.txt" file produced by MTF Mapper contains the binned ESF, and is not suitable for this particular experiment because of the smoothing inherent in the binning.

Given that I specified an MTF50 value of 0.35 cycles per pixel, we know that the standard deviation of the true PSF should be 0.5354018 pixels [ sqrt( log(0.5)/(-2*pi*pi*0.35*0.35) ]. From this we can calculate the expected analytical ESF, which is simply erf(x/sigma)*(upper-lower) + lower, where erf() is the standard "error function", defined as the integral of the unit Gaussian. The values upper and lower merely represent the mean white and black levels, which were defined as lower = 65536*0.33/2 and upper = 65536 - lower. With these values, I can now plot the expected analytical ESF along with the raw ESF samples dumped by MTF Mapper. 

Figure 2: Raw ESF samples along with analytical ESF
I should mention that I shifted the analytical ESF along the "d" axis to compensate for any residual bias in MTF Mapper's edge position estimate. We can see that the overall shape of the analytical ESF appears to line up quite well with the ESF samples extracted from the synthetic image. Next we look at the difference between the two curves:
Figure 3: ESF difference
 
We see two things in Figure 3: The mean difference appears to be close to zero, and the noise magnitude appears to increase with increasing signal levels (to the right). The increase in noise was expected, since that follows from the photon shot noise model used to simulate sensor noise. We can normalize the noise by dividing the ESF difference (noise) by the square root of the analytical ESF, which gives us this plot:

Figure 4: Normalised ESF difference
This normalization appears to keep the noise standard deviation constant, which would be consistent with garden-variety additive Gaussian white noise. The density estimate of the normalized noise looks Gaussian:
Figure 5: Normalized ESF difference density
Running the normalized residuals through the Shapiro-Wilk normality test gives us a p-value of 0.03722 over our 3285 samples. That is bad news, because it means our data is non-Gaussian at a 5% significance level. It is, however, Gaussian at a 10% confidence level. Correction: The normalized residuals are Gaussian at a 3% (or 2.5%, or 1%) significance level. The qqnorm() plot is pretty straight too, which tells us it is more likely that the Shapiro-Wilk test is negatively affected by the large number of samples, than that the residuals are truely not Gaussian.

Now that we have confirmed that the distribution of the residuals are Gaussian, we can fit a line through them. This line comes out with a slope of -0.005765, which means that our normalized residuals are fairly flat. Lastly, we can perform some LOESS smoothing on the normalized residuals:
Figure 6: LOESS fit on normalized ESF difference
Again, we can see that the LOESS-smoothed values oscillate around 0, i.e., there is no trend in the difference between the analyical ESF and the ESF measured from our synthetic image.

The mean signal-to-noise ratio in the bright regions of the images comes out at around 15dB; because we compute the LSF (or PSF if you prefer)) from the derivative of the ESF, the bright parts of the image are representative of the worst-case noise. Alternatively, we can say that the noise is quite similar to that produced by a Nikon D7000 at ISO400, for an SRFplus test chart at a 5:1 contrast ratio.

I have shown that there is no systematic difference between the ESF extracted from a synthetic image and the expected analytical ESF. The simulated noise also behaves in the way that we would expect from properties of the simulated sensor. Based on these observations, we can safely assume that the synthetic images have the desired PSF, i.e., the simulated MTF50 values are spot-on. (In previous posts I examined the properties of the simulated ESF values in the absence of noise, but here I chose to demonstrate the PSF properties directly on the actual images used in the Imatest vs MTF Mapper comparison).


The results

The results presented here were obtained by running Imatest 4.1.12 and MTF Mapper 0.4.18 on these images (about 100MB). SFRPlus (from Imatest, of course) was configured to enable the LSF correction that was recently introduced. Other than that, all settings were left to defaults, including leaving the apodization option enabled. I turned off the "quick mtf" option, although I did not check to see whether this affected the results. After a run of SFRPlus, the "save data" option was used to store the results, after which the "MTF50" column values were extracted, discarding the top and bottom row edges as explained before.

MTF Mapper was run using the "-t 0.5 -r" settings; the "-t 0.5" option is required to allow MTF Mapper to work with the rather low  5:1 contrast ratio. The values output to "raw_mtf_values.txt" were used as the representative MTF50 values extracted by MTF Mapper.

Simulated images were produced over the MTF50 range 0.1 cycles/pixel to 0.7 cycles/pixel in increments of 0.05 cycles/pixel, with one extra data point at 0.08 cycles/pixel to represent the low end (which is quite blurry). For each MTF50 level a total of three images were simulated, each with a different seed to produce unique sensor noise. This gives us 19*3*4 = 228 samples at each MTF50 level.  

As in previous posts, the results will be evaluated in two ways: bias and variance. The first plots to consider illustrate both bias and variance simultaneously, although it is somewhat harder to compare the variance of the methods on these plots.
Figure 7: Imatest relative error boxplot
Figure 8: MTF Mapper relative error boxplot
In figures 7 and 8, the relative difference (or error) is calculated as 100*(measured_mtf50 - expected_mtf50)/expected_mtf50. It is clear that Imatest 4.1.12 underestimates MTF50 values sligthly for MTF50 values above 0.2 cycles/pixel; this pattern is typical of what one would expect if the MTF curve is not adequately corrected for the low-pass filtering effect of the ESF binning step (see this post; ). MTF Mapper corrects for this low-pass filtering effect, producing no clear trend in median MTF50 error over the range considered. We can plot the median measured MTF50 relative error for Imatest and MTF Mapper on the same plot:
Figure 9: Median relative MTF50 error comparison

Figure 9 shows us that the Imatest bias is not all that severe; it remains below 2% over the range of MTF50 values we are likely to encounter in actual photos. (NB: Up to July 30, 2015, this figure had Imatest and MTF Mapper swapped around).

So that illustrates bias. To measure variance we can plot the standard deviation at each MTF50 level:
Figure 10: Standard deviation of relative MTF50 error
Other than at very low MTF50 values (say, 0.08 cycles/pixel and lower), it would appear that MTF Mapper 0.4.18 produces more consistent MTF50 measurements than Imatest 4.1.12.

A final performance metric to consider is the 95th percentile of relative MTF50 error. By computing this value on the absolute value of the relative error, it combines both variance and bias into a single measurement that tells us how close our measurements will be to the true MTF50 value, in 95% of measurements. Here is the plot:
Figure 11: 95th percentile of MTF50 error
Of all the performance metrics presented here, I consider Figure 11 to be the most practical measure of accuracy.

Conclusion

It took quite a bit of effort on my part to improve MTF Mapper to the point where it produces more accurate results than Imatest. There are some other aspects I have not touched on here, such as how accuracy varies with edge orientation. For now, I will say that MTF Mapper produces accurate results at known critical angles, whereas Imatest appears to fail at an angle of 26.565 degrees. Given that Imatest never claimed to work well at angles other than 5 degrees, I will let that one slide.

I have also not included any comparisons to other freely available slanted edge implementations (sfrmat, Quick MTF, the slanted edge ImageJ plugin, mitreSFR). I can tell you from informal testing that most of them appear to perform significantly worse than Imatest, mostly because none of those implementations appear to include the finite-difference-derivative correction. Maybe I will back this opinion up with some more detailed results in future.

So where does that leave your typical Imatest user? Well, the difference in accuracy between Imatest and MTF Mapper is relatively small. What I mean by that is that these results do not imply that Imatest users have to switch over to using MTF Mapper, rather, these results show that MTF Mapper users can trust their measurements to be at least as good as those obtained by Imatest. And, of course, MTF Mapper is free, and the source code is available.

There are some fairly nifty features that I noticed in SFRPlus during this experiment. It appears that SFRPlus will perform lens correction automatically, meaning that radial distortion curvature can be corrected for on the fly. MTF Mapper currently limits the length of the edge it will include in the analysis as a means of avoiding the effects of strong radial distortion. But now that I am aware of this feature, I think it would be relatively straightforward to include lens distortion correction in MTF Mapper. So little time, so many neat ideas to play with ...


No comments:

Post a Comment