plenoptic.process.SteerablePyramidFreq

Note

This object is a torch.nn.Module. It therefore has all the methods and attributes from that class, even though they are not documented here (to avoid cluttering this page).

class plenoptic.process.SteerablePyramidFreq(image_shape, height='auto', order=3, twidth=1, is_complex=False, downsample=True, tight_frame=False)

Bases: Module

Steerable frequency pyramid in Torch.

Construct a steerable pyramid on two-dimensional signals, in the Fourier domain. Boundary handling is circular. Reconstruction is exact (to within floating-point error), with two exceptions. If either dimension of the image is odd, the reconstruction will not be exact, due to boundary-handling issues that have not been resolved. Similarly, a complex pyramid with order=0 has non-exact reconstruction and cannot be a tight frame.

The squared radial functions tile the Fourier plane with a raised-cosine falloff. Angular functions are

\[\cos\left(\theta-\frac{k\pi}{o+1}\right)^o\]

where \(o\) is the order parameter set at initialization and \(k\) runs from 0 to \(o\) for a total of \(o+1\) orientations.
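
As a rough numerical check of the angular functions, the sketch below evaluates them in plain Python. It assumes that band \(k\) is centered at \(k\pi/(o+1)\); the helper name angular_response is hypothetical and not part of plenoptic. Under that assumption, each band reaches 1 at its own center, and the squared responses summed over all \(o+1\) bands do not depend on \(\theta\).

```python
import math


def angular_response(theta, k, order):
    """k-th angular function, assuming band k is centered at k*pi/(order+1)."""
    return math.cos(theta - k * math.pi / (order + 1)) ** order


order = 3
num_bands = order + 1

# Each band takes the value 1 at its own center angle.
peaks = [
    angular_response(k * math.pi / num_bands, k, order) for k in range(num_bands)
]

# Sum of squared responses over all bands, sampled over a full turn.
thetas = [i * 2 * math.pi / 100 for i in range(100)]
energy = [
    sum(angular_response(t, k, order) ** 2 for k in range(num_bands))
    for t in thetas
]
print(max(energy) - min(energy))  # spread of the summed energy across theta
```

The flat sum is what lets the oriented bands at a given scale cover all angles evenly.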

Parameters:

image_shape (tuple[int, int]) – Shape of input image.
height (Literal['auto'] | int (default: 'auto')) – The height of the pyramid. If 'auto', the height is determined automatically from the size of the image. If an int, must be non-negative and less than log2(min(image_shape[0], image_shape[1]))-2. If height=0, this only returns the residuals.
order (int (default: 3)) – The Gaussian derivative order used for the steerable filters, in [0, 15]. Note that to achieve steerability the minimum number of orientations is order + 1, which is the number used here. To get more orientations at the same order, use the method steer_coeffs.
twidth (int (default: 1)) – The width of the transition region of the radial lowpass function, in octaves.
is_complex (bool (default: False)) – Whether the pyramid coefficients should be complex or not. If True, the real and imaginary parts correspond to a pair of odd and even symmetric filters. If False, the coefficients only include the real part. Regardless of the value of is_complex, the symmetry of the real part is determined by the order parameter: if order is even, then the real coefficients are even symmetric; if order is odd, then the real coefficients are odd symmetric. (If is_complex=True, then the imaginary coefficients will have the opposite symmetry of the real ones).
downsample (bool (default: True)) – Whether to downsample each scale in the pyramid or keep the output pyramid coefficients in fixed bands of size image_shape. When downsample is False, the coefficients returned by the forward method can be converted to a single tensor with convert_pyr_to_tensor.
tight_frame (bool (default: False)) – Whether the pyramid obeys the generalized Parseval theorem or not (i.e., is a tight frame). If True, the energy of the pyr_coeffs equals the energy of the image. In order to match the matlabPyrTools or pyrtools implementations, this must be set to False.
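
The arithmetic behind height and order can be sketched in a few lines. This is an illustration only: the helper names are hypothetical, and the floor rule for 'auto' is an assumption, chosen because it agrees with the 26-channel tensor shown for a 256 x 256 image in the convert_pyr_to_tensor example on this page.

```python
import math


def auto_height(image_shape):
    """Assumed 'auto' rule: floor(log2(smallest image dimension) - 2)."""
    return int(math.floor(math.log2(min(image_shape)) - 2))


def num_orientations(order):
    """Number of orientation bands, order + 1 (the minimum for steerability)."""
    return order + 1


print(auto_height((256, 256)))  # 6
print(auto_height((128, 256)))  # 5
print(num_orientations(3))      # 4
```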

image_shape

Shape of input image.

Type: tuple

pyr_size

Dictionary containing the height and width of the pyramid coefficients. Keys are the same as those in pyr_coeffs returned by forward, in order: "residual_highpass", the integers from 0 to num_scales-1, and "residual_lowpass". The values are 2-tuples of ints. While the dictionary is initialized with the object, the values are not set until the first time forward is called.

Type: OrderedDict

fft_norm

The way the FFTs are normalized, see torch.fft.fft2 for more details.

Type: str

is_complex

Whether the coefficients are complex- or real-valued.

Type: bool

scales

All the scales of the representation (including residuals) in coarse-to-fine order. A subset of this list can be passed to the forward method to restrict the output.

Type: list
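
To make the layout of scales concrete, here is a minimal sketch of a coarse-to-fine list with both residuals included, plus a check that a requested subset is valid. The function names are hypothetical, and the exact contents of the real attribute are an assumption based on the description above.

```python
def build_scales(num_scales):
    """Coarse-to-fine scale labels, residuals included (assumed layout)."""
    return ["residual_lowpass", *range(num_scales - 1, -1, -1), "residual_highpass"]


def is_valid_subset(requested, num_scales):
    """True if every requested entry is one of the pyramid's scale labels."""
    valid = build_scales(num_scales)
    return all(scale in valid for scale in requested)


scales = build_scales(3)
print(scales)  # ['residual_lowpass', 2, 1, 0, 'residual_highpass']
print(is_valid_subset([0, "residual_lowpass"], 3))  # True
print(is_valid_subset([3], 3))                      # False
```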

Raises:

ValueError – If image_shape contains non-integers.
ValueError – If len(image_shape) != 2.
ValueError – If height is not a non-negative integer or is larger than the biggest possible value (determined by image_shape).
ValueError – If order is not an integer in [0, 15].
ValueError – If order == 0 and is_complex is True. See plenoptic-org/plenoptic#326 for an explanation.
ValueError – If twidth is not positive.

Warns:

UserWarning – If image_shape has an odd value, because then reconstruction will be imperfect.

Notes

Transform described in Simoncelli and Freeman, 1995 [1], filter kernel design described in Karasaridis and Simoncelli, 1996 [2]. For further information see online [3].

References

Examples

>>> import plenoptic as po
>>> spyr = po.process.SteerablePyramidFreq((256, 256))

Methods

`convert_pyr_to_tensor`(pyr_coeffs[, ...])	Convert coefficient dictionary to a tensor.
`convert_tensor_to_pyr`(pyr_tensor, ...)	Convert pyramid coefficient tensor to dictionary format.
`forward`(image[, scales])	Generate the steerable pyramid coefficients for an image.
`recon_pyr`(pyr_coeffs[, levels, bands])	Reconstruct image from coefficients, optionally using a subset.
`steer_coeffs`(pyr_coeffs, angles[, even_phase])	Steer pyramid coefficients to the specified angles.

static convert_pyr_to_tensor(pyr_coeffs, split_complex=False)

Convert coefficient dictionary to a tensor.

The output tensor has shape (batch, channel, height, width) and is intended to be used in a torch.nn.Module downstream. In the multichannel case, all bands for each channel will be stacked together (i.e., if there are 2 channels and 18 bands per channel, pyr_tensor[:, 0:18, ...] will contain the pyramid responses for channel 1 and pyr_tensor[:, 18:36, ...] will contain the responses for channel 2). In the case of a complex, multichannel pyramid with split_complex=True, the real/imaginary bands will be interleaved so that they appear as pairs with neighboring indices in the channel dimension of the tensor. (Note: the residual bands are always real, so they will only ever have a single band, even when split_complex=True.)

This only works if pyr_coeffs was created by a pyramid with downsample=False.

Parameters:

pyr_coeffs (OrderedDict) – The pyramid coefficients.
split_complex (bool (default: False)) – Indicates whether the output should split complex bands into real/imag channels or keep them as a single channel. This should be True if you intend to use a convolutional layer on top of the output.

Return type:

tuple[Tensor, tuple[int, list[int | Literal['residual_lowpass', 'residual_highpass']], list[Size], list[Size] | bool]]

Returns:

pyr_tensor – Pyramid coefficients reshaped into a tensor with shape (batch, channel, height, width). The first channel is the residual highpass and the last is the residual lowpass. In between, each band is a separate channel, going from fine to coarse (i.e., starting with all of scale 0’s orientations, then scale 1’s, etc.).
pyr_info – Information required to recreate the dictionary, containing the number of channels, the list of pyramid keys for the dictionary, info on how to unpack the coefficients, and info on how split_complex was used.

Raises:

RuntimeError – If self.downsample is True. In this case, we can’t concatenate across scales, because each scale is a different size.

See also

convert_tensor_to_pyr: Convert pyramid coefficient tensor to dictionary format.

Examples

>>> import plenoptic as po
>>> import torch
>>> img = po.data.einstein()
>>> spyr = po.process.SteerablePyramidFreq(
...     img.shape[-2:], downsample=False, is_complex=True
... )
>>> coeffs = spyr(img)
>>> coeffs_tensor, pyr_info = spyr.convert_pyr_to_tensor(coeffs)
>>> coeffs_tensor.shape
torch.Size([1, 26, 256, 256])
>>> coeffs_tensor.dtype
torch.complex64
>>> new_coeffs = spyr.convert_tensor_to_pyr(coeffs_tensor, *pyr_info)
>>> all([torch.equal(v, new_coeffs[k]) for k, v in coeffs.items()])
True
>>> coeffs_tensor, pyr_info = spyr.convert_pyr_to_tensor(
...     coeffs, split_complex=True
... )
>>> coeffs_tensor.shape
torch.Size([1, 50, 256, 256])
>>> coeffs_tensor.dtype
torch.float32
>>> new_coeffs = spyr.convert_tensor_to_pyr(coeffs_tensor, *pyr_info)
>>> all([torch.equal(v, new_coeffs[k]) for k, v in coeffs.items()])
True
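
The channel counts printed above (26 and 50) follow from the layout described for pyr_tensor. A minimal sketch of that arithmetic, using a hypothetical helper and assuming one channel per oriented band plus two real-valued residual bands for each image channel:

```python
def num_tensor_channels(num_scales, order, image_channels=1, split_complex=False):
    """Channel count of the stacked tensor under the assumed layout."""
    oriented_bands = num_scales * (order + 1)
    if split_complex:
        # Real and imaginary parts of each oriented band become separate channels.
        oriented_bands *= 2
    # The highpass and lowpass residuals are always real: one channel each.
    return image_channels * (oriented_bands + 2)


# 256 x 256 image: 6 scales, order 3 (4 orientations), as in the example above.
print(num_tensor_channels(6, 3))                      # 26
print(num_tensor_channels(6, 3, split_complex=True))  # 50
```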

forward(image, scales=None)

Generate the steerable pyramid coefficients for an image.

The steerable pyramid coefficients run from fine to coarse and split the image into subbands corresponding to different orientations and scales (i.e., spatial frequencies).

Changed in version 1.4: The returned pyr_coeffs dictionary’s keys are now either strings specifying the residual or integers specifying the scale. The non-residual coefficients are now 5d tensors of shape (batch, channel, num_orientations, height, width).

Parameters:

image (Tensor) – A tensor containing the image to analyze. Following PyTorch conventions, it must be 4d, with shape (batch, channel, height, width).
scales (list[int | Literal['residual_lowpass', 'residual_highpass']] | None (default: None)) – Which scales to include in the returned representation. If None, we include all scales. Otherwise, can contain a subset of the values present in this model’s scales attribute (ints from 0 up to self.num_scales-1 and the strs ‘residual_highpass’ and ‘residual_lowpass’). Can contain a single value or multiple values. If a value is an int, we include all orientations from that scale. Order within the list does not matter.

Return type:

OrderedDict

Returns:

pyr_coeffs – Pyramid coefficients, stored in an ordered dictionary whose keys run from fine to coarse: "residual_highpass", 0, 1, ..., num_scales-1, "residual_lowpass". Coefficients at each integer scale have shape (*image.shape[:2], self.num_orientations, image.shape[2] / 2**scale, image.shape[3] / 2**scale). The "residual_highpass" height and width match those of image, and "residual_lowpass" has height and width (image.shape[2] / 2**self.num_scales, image.shape[3] / 2**self.num_scales).

Raises:

ValueError – If image is the wrong shape, i.e. image.shape[-2:] != self.image_shape.
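
The shapes listed under Returns can be tabulated with a short sketch. It is an illustration rather than plenoptic code: the helper is hypothetical, it assumes a downsampled pyramid whose image dimensions are divisible by 2**num_scales, and it assumes the residuals are 4d tensors without an orientation dimension.

```python
def coeff_shapes(image_shape, num_scales, num_orientations):
    """Expected coefficient shapes, keyed in fine-to-coarse order."""
    batch, channel, height, width = image_shape
    shapes = {"residual_highpass": (batch, channel, height, width)}
    for scale in range(num_scales):
        # Each scale halves the height and width of the previous one.
        shapes[scale] = (
            batch,
            channel,
            num_orientations,
            height // 2**scale,
            width // 2**scale,
        )
    shapes["residual_lowpass"] = (
        batch,
        channel,
        height // 2**num_scales,
        width // 2**num_scales,
    )
    return shapes


shapes = coeff_shapes((1, 1, 256, 256), num_scales=6, num_orientations=4)
for key, shape in shapes.items():
    print(key, shape)
```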

Examples

>>> import plenoptic as po
>>> img = po.data.einstein()
>>> spyr = po.process.SteerablePyramidFreq(img.shape[-2:])
>>> po.plot.pyrshow(spyr(img))
<PyrFigure ...>

recon_pyr(pyr_coeffs, levels='all', bands='all')

Reconstruct image from coefficients, optionally using a subset.

Parameters:

pyr_coeffs (OrderedDict) – Pyramid coefficients to reconstruct from.
levels (Literal['all'] | list[int | Literal['residual_lowpass', 'residual_highpass']] (default: 'all')) – If a list, should contain some subset of the integers from 0 to self.num_scales-1 (inclusive), "residual_lowpass", and "residual_highpass". If "all", the returned value will contain all valid levels. Otherwise, must be one of the valid levels.
bands (Literal['all'] | list[int] (default: 'all')) – If a list, should contain some subset of the integers from 0 to self.num_orientations-1. If "all", the returned value will contain all valid orientations. Otherwise, must be one of the valid orientations.

Return type:

Tensor

Returns:

recon – The reconstructed image, of shape (batch, channel, height, width).

Raises:

ValueError – If self.forward() was called with a scales argument other than None.
TypeError – If levels is not one of the allowed values.
TypeError – If bands is not an integer or "all".
ValueError – If bands is an integer outside of the range [0, self.num_orientations-1].

Examples

>>> import plenoptic as po
>>> import torch
>>> img = po.data.einstein()
>>> spyr = po.process.SteerablePyramidFreq(img.shape[-2:])
>>> coeffs = spyr(img)
>>> recon = spyr.recon_pyr(coeffs)
>>> torch.allclose(recon, img, rtol=1e-8, atol=1e-5)
True
>>> titles = ["Original", "Reconstructed", "Difference"]
>>> po.plot.imshow([img, recon, img - recon], title=titles)
<PyrFigure ...>

steer_coeffs(pyr_coeffs, angles, even_phase=True)

Steer pyramid coefficients to the specified angles.

This allows you to have filters that have the Gaussian derivative order specified in construction, but arbitrary angles or number of orientations.

Changed in version 1.4: The returned resteered_coeffs dictionary now only contains the new angles, as opposed to concatenating the new angles onto those found in the input pyr_coeffs. Like the input pyr_coeffs, the dictionary keys are now integers specifying the scale and the coefficients are 5d tensors of shape (batch, channel, angles, height, width).

Parameters:

pyr_coeffs (OrderedDict) – The pyramid coefficients to steer, as returned by forward.
angles (list[float]) – List of angles (in radians) to steer the pyramid coefficients to.
even_phase (bool (default: True)) – Specifies whether the harmonics are cosine or sine phase aligned about those positions.

Return type:

tuple[dict, dict]

Returns:

resteered_coeffs – Dictionary of re-steered pyramid coefficients. Will have the same number of scales as the original pyramid (though it will not contain the residual highpass or lowpass). Like the input pyr_coeffs, keys are ints indexing the scale and values are tensors of shape (batch, channel, orientations, height, width), but now orientations index angles instead of self.num_orientations.
resteering_weights – Dictionary of weights used to re-steer the pyramid coefficients. Will have the same keys as resteered_coeffs.
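
The idea behind steering can be illustrated for the simplest case, order=1, where the two basis bands are assumed to be centered at 0 and \(\pi/2\). The sketch below is not plenoptic's implementation. The function names are hypothetical, and the weights are derived by hand from the angle-difference identity, so they are not claimed to match the values returned in resteering_weights.

```python
import math


def basis(theta, k):
    """Order-1 angular function for band k, assumed centered at k*pi/2."""
    return math.cos(theta - k * math.pi / 2)


def steer(theta, phi):
    """Response of a filter oriented at phi, as a weighted sum of the basis."""
    weights = (math.cos(phi), math.sin(phi))
    return sum(w * basis(theta, k) for k, w in enumerate(weights))


theta, phi = 0.7, 1.9
steered = steer(theta, phi)
direct = math.cos(theta - phi)  # the same filter evaluated directly
print(abs(steered - direct))
```

Because the weights depend only on the target angle, a filter at any orientation can be synthesized from the fixed set of basis bands without filtering the image again.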

Examples

>>> import plenoptic as po
>>> import torch
>>> img = po.data.einstein()
>>> spyr = po.process.SteerablePyramidFreq(img.shape[-2:], height=3)
>>> coeffs = spyr(img)
>>> resteered_coeffs, resteering_weights = spyr.steer_coeffs(
...     coeffs, torch.linspace(0, 2 * torch.pi, 64)
... )
>>> ani = po.plot.animshow(
...     resteered_coeffs[2], repeat=True, framerate=6, zoom=4
... )
>>> # Save the video (here we're saving it as a .gif)
>>> ani.save("resteered_coeffs.gif")