plenoptic.process.SteerablePyramidFreq
Note
This object is a torch.nn.Module. It therefore has all the methods and attributes
from that class, even though they are not documented here (to avoid cluttering this page).
- class plenoptic.process.SteerablePyramidFreq(image_shape, height='auto', order=3, twidth=1, is_complex=False, downsample=True, tight_frame=False)
Bases: Module
Steerable frequency pyramid in Torch.
Construct a steerable pyramid on two-dimensional signals (matrices), in the Fourier domain. Boundary handling is circular. Reconstruction is exact (within floating-point error), with two exceptions. If the image has an odd shape, reconstruction is not exact because of boundary-handling issues that have not been resolved. Similarly, a complex pyramid of order=0 has non-exact reconstruction and cannot be a tight frame.
The squared radial functions tile the Fourier plane with a raised-cosine falloff. Angular functions are
\[\cos\left(\frac{\theta-k*\pi}{o+1}\right)^o\]
where \(o\) is the order parameter set at initialization and \(k\) runs from 0 to \(o\), for a total of \(o+1\) orientations.
- Parameters:
height (Literal['auto'] | int (default: 'auto')) – The height of the pyramid. If 'auto', it is determined automatically from the size of the image. If an int, it must be non-negative and no larger than log2(min(image_shape[0], image_shape[1])) - 2. If height=0, only the residuals are returned.
order (int (default: 3)) – The Gaussian derivative order used for the steerable filters, in [0, 15]. Note that to achieve steerability the minimum number of orientations is order + 1, which is used here. To get more orientations at the same order, use the method steer_coeffs.
twidth (int (default: 1)) – The width of the transition region of the radial lowpass function, in octaves.
is_complex (bool (default: False)) – Whether the pyramid coefficients should be complex or not. If True, the real and imaginary parts correspond to a pair of odd and even symmetric filters. If False, the coefficients only include the real part. Regardless of the value of is_complex, the symmetry of the real part is determined by the order parameter: if order is even, the real coefficients are even symmetric; if order is odd, the real coefficients are odd symmetric. (If is_complex=True, the imaginary coefficients have the opposite symmetry to the real ones.)
downsample (bool (default: True)) – Whether to downsample each scale in the pyramid or keep the output pyramid coefficients in fixed bands of size image_shape. When downsample is False, the forward method returns a tensor.
tight_frame (bool (default: False)) – Whether the pyramid obeys the generalized Parseval theorem (i.e., is a tight frame). If True, the energy of the pyr_coeffs equals the energy of the image. To match the matlabPyrTools or pyrtools implementations, this must be set to False.
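The tight-frame property described for tight_frame can be illustrated with a much simpler transform. The sketch below is not the pyramid itself: it uses a hand-written 1D DFT with 1/sqrt(N) normalization (an assumption chosen purely for illustration) and checks that the energy of the coefficients matches the energy of the signal. The names unitary_dft and energy are hypothetical helpers, not part of plenoptic.

```python
import cmath
import math


def unitary_dft(signal):
    """Naive 1D DFT scaled by 1/sqrt(N), which makes the transform unitary."""
    n = len(signal)
    scale = 1 / math.sqrt(n)
    return [
        scale * sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        for k in range(n)
    ]


def energy(values):
    """Sum of squared magnitudes."""
    return sum(abs(v) ** 2 for v in values)


signal = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5, -1.0, 2.0]
coeffs = unitary_dft(signal)

# For a tight frame (here, a unitary transform) the two energies agree.
print(math.isclose(energy(signal), energy(coeffs)))
```

With tight_frame=False the analogous equality between image energy and coefficient energy should not be expected to hold.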
- pyr_size
Dictionary containing the height and width of the pyramid coefficients. Keys are the same as those in pyr_coeffs returned by forward, in order: "residual_highpass", the integers from 0 to num_scales-1, and "residual_lowpass". The values are 2-tuples of ints. While the dictionary is initialized with the object, the values are not set until the first time forward is called.
- Type:
OrderedDict
- fft_norm
The way the FFTs are normalized; see torch.fft.fft2 for more details.
- Type:
- scales
All the scales of the representation (including residuals) in coarse-to-fine order. A subset of this list can be passed to the forward method to restrict the output.
- Type:
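As a rough illustration of the scales attribute, the sketch below builds a coarse-to-fine list with the residuals at either end and validates a requested subset against it. Both build_scales and check_scales are hypothetical helpers written for this example; the exact contents of the real attribute are assumed from the description above.

```python
def build_scales(num_scales):
    """Coarse-to-fine list of scales, residuals included (assumed layout)."""
    return ["residual_lowpass", *range(num_scales - 1, -1, -1), "residual_highpass"]


def check_scales(requested, num_scales):
    """Reject unknown scales and return the request in coarse-to-fine order."""
    valid = build_scales(num_scales)
    unknown = [s for s in requested if s not in valid]
    if unknown:
        raise ValueError(f"unknown scales: {unknown}")
    # Order within the request does not matter, so normalize it here.
    return [s for s in valid if s in requested]


print(build_scales(3))
print(check_scales([0, "residual_lowpass"], num_scales=3))
```

A list validated this way is the kind of value one would pass as the scales argument of forward.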
- Raises:
ValueError – If image_shape contains non-integers.
ValueError – If len(image_shape) != 2.
ValueError – If height is not a non-negative integer or is larger than the biggest possible value (determined by image_shape).
ValueError – If order is not an integer in [0, 15].
ValueError – If order == 0 and is_complex is False. See plenoptic-org/plenoptic#326 for an explanation.
ValueError – If twidth is not positive.
- Warns:
UserWarning – If image_shape has an odd value, because then reconstruction will be imperfect.
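The argument checks listed above can be sketched in plain Python. This is a hypothetical helper, check_pyramid_args, not the library's code. It assumes the biggest allowed height is floor(log2(min(image_shape))) - 2, which is consistent with the 26-channel output shown in the examples for a 256 x 256 image with order=3. The order == 0 check is left out.

```python
import math
import warnings


def check_pyramid_args(image_shape, height="auto", order=3, twidth=1):
    """Validate constructor-style arguments and return the resolved height."""
    if len(image_shape) != 2:
        raise ValueError("image_shape must have exactly two elements")
    if not all(isinstance(s, int) for s in image_shape):
        raise ValueError("image_shape must contain integers")
    if any(s % 2 for s in image_shape):
        warnings.warn("odd image_shape: reconstruction will be imperfect",
                      UserWarning)
    # Assumption: the largest usable height for this image size.
    max_height = int(math.floor(math.log2(min(image_shape)))) - 2
    if height == "auto":
        height = max_height
    elif not isinstance(height, int) or not 0 <= height <= max_height:
        raise ValueError(f"height must be an int in [0, {max_height}]")
    if not isinstance(order, int) or not 0 <= order <= 15:
        raise ValueError("order must be an int in [0, 15]")
    if twidth <= 0:
        raise ValueError("twidth must be positive")
    return height


print(check_pyramid_args((256, 256)))  # 6
```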
Notes
Transform described in Simoncelli and Freeman, 1995 [1], filter kernel design described in Karasaridis and Simoncelli, 1996 [2]. For further information see online [3].
References
Examples
>>> import plenoptic as po
>>> spyr = po.process.SteerablePyramidFreq((256, 256))
Methods
convert_pyr_to_tensor(pyr_coeffs[, ...]) – Convert coefficient dictionary to a tensor.
convert_tensor_to_pyr(pyr_tensor, ...) – Convert pyramid coefficient tensor to dictionary format.
forward(image[, scales]) – Generate the steerable pyramid coefficients for an image.
recon_pyr(pyr_coeffs[, levels, bands]) – Reconstruct image from coefficients, optionally using a subset.
steer_coeffs(pyr_coeffs, angles[, even_phase]) – Steer pyramid coefficients to the specified angles.
- static convert_pyr_to_tensor(pyr_coeffs, split_complex=False)
Convert coefficient dictionary to a tensor.
The output tensor has shape (batch, channel, height, width) and is intended to be used in a torch.nn.Module downstream. In the multichannel case, all bands for each channel are stacked together (i.e., if there are 2 channels and 18 bands per channel, pyr_tensor[:, 0:18, ...] will contain the pyramid responses for channel 1 and pyr_tensor[:, 18:36, ...] will contain the responses for channel 2). In the case of a complex, multichannel pyramid with split_complex=True, the real/imaginary bands are interleaved so that they appear as pairs with neighboring indices in the channel dimension of the tensor. (Note: the residual bands are always real, so they only ever have a single band, even when split_complex=True.)
This only works if pyr_coeffs was created with a pyramid with downsample=False.
- Parameters:
pyr_coeffs (OrderedDict) – The pyramid coefficients.
split_complex (bool (default: False)) – Indicates whether the output should split complex bands into real/imaginary channels or keep them as a single channel. This should be True if you intend to use a convolutional layer on top of the output.
- Return type:
tuple[Tensor, tuple[int, list[int | Literal['residual_lowpass', 'residual_highpass']], list[Size], list[Size] | bool]]
- Returns:
pyr_tensor – Tensor with shape (batch, channel, height, width). Pyramid coefficients reshaped into a tensor. The first channel will be the residual highpass and the last will be the residual lowpass. Each band is then a separate channel, going from fine to coarse (i.e., starting with all of scale 0’s orientations, then scale 1’s, etc.).
pyr_info – Information required to recreate the dictionary, containing the number of channels, the list of pyramid keys for the dictionary, info on how to unpack the coefficients, and info on how split_complex was used.
- Raises:
RuntimeError – If self.downsample is True. In this case, we can’t concatenate across scales, because each scale is a different size.
See also
convert_tensor_to_pyr – Convert tensor representation to pyramid dictionary.
Examples
>>> import plenoptic as po
>>> img = po.data.einstein()
>>> spyr = po.process.SteerablePyramidFreq(img.shape[-2:], downsample=False)
>>> coeffs = spyr(img)
>>> coeffs_tensor, _ = spyr.convert_pyr_to_tensor(coeffs)
>>> coeffs_tensor.shape
torch.Size([1, 26, 256, 256])
>>> # rearrange so that the residuals are at the end
>>> coeffs_tensor = [
...     coeffs_tensor[:, 1:-1],
...     coeffs_tensor[:, :1],
...     coeffs_tensor[:, -1:],
... ]
>>> po.plot.imshow(coeffs_tensor, col_wrap=spyr.num_orientations)
<PyrFigure ...>
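The channel layout described for convert_pyr_to_tensor comes down to simple index arithmetic. The sketch below uses two hypothetical helpers (bands_per_channel and channel_slice, not part of plenoptic) to count the bands produced for one input channel and to locate that channel's block in the stacked tensor. It assumes that every scale has the same number of orientations and that only the oriented bands double when split_complex=True.

```python
def bands_per_channel(num_scales, num_orientations,
                      is_complex=False, split_complex=False):
    """Number of tensor channels produced for one input channel."""
    oriented = num_scales * num_orientations
    if is_complex and split_complex:
        oriented *= 2  # real and imaginary parts become separate channels
    return oriented + 2  # the two residuals are always real


def channel_slice(channel, n_bands):
    """Slice of the stacked tensor's channel axis for one input channel."""
    start = channel * n_bands
    return slice(start, start + n_bands)


print(bands_per_channel(6, 4))                                       # 26
print(bands_per_channel(6, 4, is_complex=True, split_complex=True))  # 50
print(channel_slice(1, 18))                                          # slice(18, 36, None)
```

The printed counts match the tensor shapes shown in the examples for a 256 x 256 image with the default order.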
- static convert_tensor_to_pyr(pyr_tensor, num_channels, pyr_keys, pack_info, split_complex_pack_info)
Convert pyramid coefficient tensor to dictionary format.
The arguments other than pyr_tensor are elements of the pyr_info tuple returned by convert_pyr_to_tensor. You should always unpack the arguments for this function from that pyr_info tuple. See the Examples section below.
- Parameters:
pyr_tensor (Tensor) – Shape (batch, channel, height, width). The pyramid coefficients.
num_channels (int) – Number of channels in the original input tensor the pyramid was created for (i.e., if the input was an RGB image, this would be 3).
pyr_keys (list[int | Literal['residual_lowpass', 'residual_highpass']]) – Keys from the original pyramid dictionary.
pack_info (list[Size]) – List of sizes of the fifth dimension for each coefficient (i.e., the number of orientations) used to pack/unpack the tensors.
split_complex_pack_info (list[Size] | bool) – If convert_pyr_to_tensor was called with split_complex=True, another list of sizes used to pack/unpack the tensors. Else, False.
- Return type:
- Returns:
pyr_coeffs – Pyramid coefficients in dictionary format as returned by forward.
See also
convert_pyr_to_tensor – Convert pyramid dictionary representation to tensor.
Examples
>>> import plenoptic as po
>>> import torch
>>> img = po.data.einstein()
>>> spyr = po.process.SteerablePyramidFreq(
...     img.shape[-2:], downsample=False, is_complex=True
... )
>>> coeffs = spyr(img)
>>> coeffs_tensor, pyr_info = spyr.convert_pyr_to_tensor(coeffs)
>>> coeffs_tensor.shape
torch.Size([1, 26, 256, 256])
>>> coeffs_tensor.dtype
torch.complex64
>>> new_coeffs = spyr.convert_tensor_to_pyr(coeffs_tensor, *pyr_info)
>>> all([torch.equal(v, new_coeffs[k]) for k, v in coeffs.items()])
True
>>> coeffs_tensor, pyr_info = spyr.convert_pyr_to_tensor(
...     coeffs, split_complex=True
... )
>>> coeffs_tensor.shape
torch.Size([1, 50, 256, 256])
>>> coeffs_tensor.dtype
torch.float32
>>> new_coeffs = spyr.convert_tensor_to_pyr(coeffs_tensor, *pyr_info)
>>> all([torch.equal(v, new_coeffs[k]) for k, v in coeffs.items()])
True
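The round trip in the example above relies on the split into real and imaginary channels being reversible. A toy version on plain Python complex numbers, rather than tensors, shows the idea. The helpers split_complex_bands and merge_complex_bands are hypothetical, and the real-before-imaginary ordering is an assumption made for this sketch.

```python
def split_complex_bands(bands):
    """Interleave real and imaginary parts as neighboring entries."""
    out = []
    for z in bands:
        out.extend([z.real, z.imag])
    return out


def merge_complex_bands(values):
    """Inverse of split_complex_bands: pair up neighbors into complex numbers."""
    return [complex(re, im) for re, im in zip(values[0::2], values[1::2])]


bands = [1 + 2j, -0.5 + 0j, 3 - 4j]
flat = split_complex_bands(bands)

print(flat)                                 # [1.0, 2.0, -0.5, 0.0, 3.0, -4.0]
print(merge_complex_bands(flat) == bands)   # True
```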
- forward(image, scales=None)
Generate the steerable pyramid coefficients for an image.
The steerable pyramid coefficients run from fine to coarse and split the image into subbands corresponding to different orientations and scales (i.e., spatial frequencies).
Changed in version 1.4: The returned pyr_coeffs dictionary’s keys are now either strings specifying the residual or integers specifying the scale. The non-residual coefficients are now 5d tensors of shape (batch, channel, num_orientations, height, width).
- Parameters:
image (Tensor) – A tensor containing the image to analyze. It must be 4d (batch, channel, height, width), following PyTorch conventions.
scales (list[int | Literal['residual_lowpass', 'residual_highpass']] | None (default: None)) – Which scales to include in the returned representation. If None, we include all scales. Otherwise, can contain a subset of the values present in this model’s scales attribute (ints from 0 up to self.num_scales-1 and the strs ‘residual_highpass’ and ‘residual_lowpass’). Can contain a single value or multiple values. If it’s an int, we include all orientations from that scale. Order within the list does not matter.
- Return type:
- Returns:
pyr_coeffs – Pyramid coefficients. These will be stored in an ordered dictionary with keys that are, in order: "residual_highpass", the integers from 0 to num_scales-1, and "residual_lowpass". Coefficients have shape (*image.shape[:2], self.num_orientations, image.shape[2] / 2**scale, image.shape[3] / 2**scale), with the "residual_highpass" height and width matching that of image, and "residual_lowpass" having height and width (image.shape[2] / 2**self.num_scales, image.shape[3] / 2**self.num_scales). They are ordered from fine to coarse: "residual_highpass", 0, 1, ..., num_scales-1, "residual_lowpass".
- Raises:
ValueError – If image is the wrong shape, i.e. image.shape[-2:] != self.image_shape.
Examples
>>> import plenoptic as po
>>> img = po.data.einstein()
>>> spyr = po.process.SteerablePyramidFreq(img.shape[-2:])
>>> po.plot.pyrshow(spyr(img))
<PyrFigure ...>
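The shapes stated for the pyr_coeffs returned by forward can be tabulated ahead of time. The sketch below is a hypothetical helper, expected_coeff_shapes, not a plenoptic function. It assumes downsample=True, even image dimensions so that integer division is exact, and 4d residuals (only the non-residual coefficients are described as 5d).

```python
from collections import OrderedDict


def expected_coeff_shapes(image_shape, num_scales, num_orientations,
                          batch=1, channel=1):
    """Map each pyramid key to the coefficient shape described above."""
    height, width = image_shape
    shapes = OrderedDict()
    # Assumption: residuals keep a 4d (batch, channel, height, width) layout.
    shapes["residual_highpass"] = (batch, channel, height, width)
    for scale in range(num_scales):
        shapes[scale] = (batch, channel, num_orientations,
                         height // 2**scale, width // 2**scale)
    shapes["residual_lowpass"] = (batch, channel,
                                  height // 2**num_scales,
                                  width // 2**num_scales)
    return shapes


for key, shape in expected_coeff_shapes((256, 256), 6, 4).items():
    print(key, shape)
```

Comparing such a table against the shapes of an actual forward output is a quick sanity check when changing height or order.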
- recon_pyr(pyr_coeffs, levels='all', bands='all')
Reconstruct image from coefficients, optionally using a subset.
- Parameters:
pyr_coeffs (OrderedDict) – Pyramid coefficients to reconstruct from.
levels (Literal['all'] | list[int | Literal['residual_lowpass', 'residual_highpass']] (default: 'all')) – If a list, should contain some subset of integers from 0 to self.num_scales-1 (inclusive), "residual_lowpass", and "residual_highpass". If "all", returned value will contain all valid levels. Otherwise, must be one of the valid levels.
bands (Literal['all'] | list[int] (default: 'all')) – If a list, should contain some subset of integers from 0 to self.num_orientations-1. If "all", returned value will contain all valid orientations. Otherwise, must be one of the valid orientations.
- Return type:
- Returns:
recon – The reconstructed image, of shape (batch, channel, height, width).
- Raises:
ValueError – If self.forward() was called with scales argument not None.
TypeError – If levels is not one of the allowed values.
TypeError – If bands is not an integer or "all".
ValueError – If bands is an integer outside of the range [0, self.num_orientations-1].
Examples
>>> import plenoptic as po
>>> import torch
>>> img = po.data.einstein()
>>> spyr = po.process.SteerablePyramidFreq(img.shape[-2:])
>>> coeffs = spyr(img)
>>> recon = spyr.recon_pyr(coeffs)
>>> torch.allclose(recon, img, rtol=1e-8, atol=1e-5)
True
>>> titles = ["Original", "Reconstructed", "Difference"]
>>> po.plot.imshow([img, recon, img - recon], title=titles)
<PyrFigure ...>
- steer_coeffs(pyr_coeffs, angles, even_phase=True)
Steer pyramid coefficients to the specified angles.
This allows you to have filters that have the Gaussian derivative order specified in construction, but arbitrary angles or number of orientations.
Changed in version 1.4: The returned resteered_coeffs dictionary now only contains the new angles, as opposed to concatenating the new angles onto those found in the input pyr_coeffs. Like the input pyr_coeffs, the dictionary keys are now integers specifying the scale and the coefficients are 5d tensors of shape (batch, channel, angles, height, width).
- Parameters:
pyr_coeffs (OrderedDict) – The pyramid coefficients to steer, as returned by forward.
angles (list[float]) – List of angles (in radians) to steer the pyramid coefficients to.
even_phase (bool (default: True)) – Specifies whether the harmonics are cosine or sine phase aligned about those positions.
- Return type:
- Returns:
resteered_coeffs – Dictionary of re-steered pyramid coefficients. It has the same number of scales as the original pyramid (though it does not contain the residual highpass or lowpass). Like the input pyr_coeffs, keys are ints indexing the scale and values are tensors of shape (batch, channel, orientations, height, width), but now orientations index angles instead of self.num_orientations.
resteering_weights – Dictionary of weights used to re-steer the pyramid coefficients. It has the same keys as resteered_coeffs.
Examples
>>> import plenoptic as po
>>> import torch
>>> img = po.data.einstein()
>>> spyr = po.process.SteerablePyramidFreq(img.shape[-2:], height=3)
>>> coeffs = spyr(img)
>>> resteered_coeffs, resteering_weights = spyr.steer_coeffs(
...     coeffs, torch.linspace(0, 2 * torch.pi, 64)
... )
>>> ani = po.plot.animshow(
...     resteered_coeffs[2], repeat=True, framerate=6, zoom=4
... )
>>> # Save the video (here we're saving it as a .gif)
>>> ani.save("resteered_coeffs.gif")
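Steering means that the response at an arbitrary angle is a weighted sum of the responses at a fixed set of basis orientations. The sketch below shows only the textbook first-derivative case, with two basis responses at 0 and pi/2 and cosine/sine weights. It is an illustration of the principle, not the weights that steer_coeffs computes, and steer_first_order is a hypothetical helper.

```python
import math


def steer_first_order(resp_0, resp_90, angle):
    """Response of a first-derivative filter rotated to `angle` (radians).

    resp_0 and resp_90 are the responses of the basis filters oriented
    at 0 and pi/2; the steered response is their weighted sum.
    """
    return math.cos(angle) * resp_0 + math.sin(angle) * resp_90


# Basis responses at one pixel (made-up values for illustration).
resp_0, resp_90 = 3.0, 4.0

# Steering toward the local gradient direction recovers its full magnitude.
best_angle = math.atan2(resp_90, resp_0)
print(math.isclose(steer_first_order(resp_0, resp_90, best_angle), 5.0))
```

Higher orders need more basis orientations, which is why the pyramid uses order + 1 of them.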