plenoptic.metric.ssim
- plenoptic.metric.ssim(img1, img2, weighted=False, pad=False)
Compute the structural similarity index.
As described in Wang et al., 2004 [1], the structural similarity index (SSIM) is a perceptual similarity metric that quantifies how alike two images are. SSIM is based on three comparison measurements between the two images: luminance, contrast, and structure. All of these are computed convolutionally across the images. See the references for more information.
This implementation follows the original one, available online [2], and also provides the option to use the weighted version from Wang and Simoncelli, 2008 [4], which was shown to consistently improve image quality prediction on the LIVE database. More information can be found online [3].
Note that this is a similarity metric (not a distance), and so 1 means the two images are identical and 0 means they’re very different. When the two images are negatively correlated, SSIM can be negative. SSIM is bounded between -1 and 1.
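To make the range of values concrete, here is a minimal sketch of the SSIM formula from Wang et al., 2004, evaluated over a single window that covers all pixels. This is a simplification for illustration, not plenoptic's implementation, which computes the statistics convolutionally over local windows. The function name global_ssim and its data_range argument are hypothetical; the constants K1 = 0.01 and K2 = 0.03 are the values proposed in the original paper.

```python
from statistics import fmean


def global_ssim(x, y, data_range=1.0):
    """SSIM of two equal-length pixel lists, using one window over all pixels.

    Illustrative only: real SSIM averages this quantity over local windows.
    """
    # Stabilizing constants from Wang et al., 2004 (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_x, mu_y = fmean(x), fmean(y)

    def cov(a, mu_a, b, mu_b):
        # Population covariance; cov(a, a) is the variance of a.
        return fmean([(p - mu_a) * (q - mu_b) for p, q in zip(a, b)])

    var_x = cov(x, mu_x, x, mu_x)
    var_y = cov(y, mu_y, y, mu_y)
    cov_xy = cov(x, mu_x, y, mu_y)

    # Luminance term times the combined contrast-and-structure term.
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


pixels = [0.1, 0.4, 0.7, 0.9]
inverted = [1 - v for v in pixels]  # perfectly anti-correlated with `pixels`

same = global_ssim(pixels, pixels)       # equals 1 up to floating-point rounding
opposite = global_ssim(pixels, inverted)  # negative: the covariance term is < 0
```

The sign of the result is driven by the covariance term, which is why negatively correlated images can yield a negative SSIM.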
This function returns the mean SSIM, a scalar-valued metric giving the average over the whole image. For the SSIM map (showing the computed value across the image), call ssim_map.
- Parameters:
img1 (Tensor) – The first image or batch of images, of shape (batch, channel, height, width).
img2 (Tensor) – The second image or batch of images, of shape (batch, channel, height, width). The heights and widths of img1 and img2 must be the same. The numbers of batches and channels of img1 and img2 need to be broadcastable: either they are the same or one of them is 1. The output will be computed separately for each channel (so channels are treated in the same way as batches). Both images should have values between 0 and 1. Otherwise, the result may be inaccurate, and we will raise a warning (but will still compute it).
weighted (bool (default: False)) – Whether to use the original, unweighted SSIM version (False) as used in [1] or the weighted version (True) as used in [4]. See Notes section for the weight.
pad (Literal[False, 'constant', 'reflect', 'replicate', 'circular'] (default: False)) – If not False, how to pad the image for the convolutions computing the local average of each image. See torch.nn.functional.pad for how these work.
- Return type:
Tensor
- Returns:
mssim – 2d tensor of shape (batch, channel) containing the mean SSIM for each image, averaged over the whole image.
- Raises:
ValueError – If either img1 or img2 is not 4d.
ValueError – If img1 and img2 have different height or width.
ValueError – If img1 and img2 have different batch or channel, unless one of them has a 1 there, so they can be broadcast.
ValueError – If img1 and img2 have different dtypes.
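The shape rules above can be sketched as a small pure-Python check. This is an illustration of the documented conditions, not plenoptic's own validation code: the helper name ssim_output_shape is hypothetical, it works on plain shape tuples rather than tensors, and it therefore leaves out the dtype check.

```python
def ssim_output_shape(shape1, shape2):
    """Return the (batch, channel) shape of the result, or raise ValueError.

    Hypothetical helper mirroring the documented shape requirements.
    """
    shape1, shape2 = tuple(shape1), tuple(shape2)
    if len(shape1) != 4 or len(shape2) != 4:
        raise ValueError("both images must be 4d: (batch, channel, height, width)")
    if shape1[2:] != shape2[2:]:
        raise ValueError("images must have the same height and width")

    out = []
    for name, d1, d2 in zip(("batch", "channel"), shape1[:2], shape2[:2]):
        # Broadcastable means equal, or one of the two sizes is 1.
        if d1 != d2 and 1 not in (d1, d2):
            raise ValueError(f"{name} sizes {d1} and {d2} cannot be broadcast")
        out.append(max(d1, d2))
    return tuple(out)


# A batch of 4 grayscale images compared against one reference image.
result_shape = ssim_output_shape((4, 1, 64, 64), (1, 1, 64, 64))
```

Under these rules, comparing a batch against a single reference image gives one SSIM value per batch element and channel.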
- Warns:
UserWarning – If either img1 or img2 has multiple channels, as SSIM was designed for grayscale images.
UserWarning – If at least one scale from either img1 or img2 has height or width of less than 11, since SSIM uses an 11x11 convolutional kernel.
Notes
The weight used when weighted=True is:
\[\log\left(\left(1+\frac{\sigma_1^2}{C_2}\right)\left(1+\frac{\sigma_2^2}{C_2}\right)\right)\]
where \(\sigma_1^2\) and \(\sigma_2^2\) are the variances of img1 and img2, respectively, and \(C_2\) is a constant. See [4] for more details.
References
Examples
>>> import plenoptic as po
>>> import torch
>>> po.set_seed(0)
>>> img = po.data.einstein()
>>> po.metric.ssim(img, img + torch.rand_like(img))
tensor([[0.0519]])
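The weight from the Notes section can also be evaluated by hand. The sketch below is plain Python and does not call plenoptic. The function name ssim_weight is hypothetical, and the default C_2 = 0.03^2 is an assumption: it is the value from Wang et al., 2004 for images scaled to [0, 1], and the constant used internally by plenoptic may differ.

```python
import math


def ssim_weight(var1, var2, c2=0.03 ** 2):
    """Weight log((1 + var1 / C2) * (1 + var2 / C2)) for one local window.

    var1 and var2 are the local variances of the two images.
    The default c2 is an assumed value, not read from plenoptic.
    """
    return math.log((1 + var1 / c2) * (1 + var2 / c2))


flat = ssim_weight(0.0, 0.0)        # zero weight where both images are flat
textured = ssim_weight(0.01, 0.01)  # larger weight where local variance is high
```

Windows with higher local variance receive more weight, so textured or high-contrast regions contribute more to the weighted average than flat ones.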