plenoptic
plenoptic is a python library for model-based synthesis of perceptual stimuli. For plenoptic, models are those of visual [1] information processing: they accept an image as input, perform some computations, and return some output, which can be mapped to neuronal firing rate, fMRI BOLD response, behavior on some task, image category, etc. The intended audience is researchers in neuroscience, psychology, and machine learning. The generated stimuli enable interpretation of model properties through examination of features that are enhanced, suppressed, or discarded. More importantly, they can facilitate the scientific process, through use in further perceptual or neural experiments aimed at validating or falsifying model predictions.
Getting started
If you are unfamiliar with stimulus synthesis, see the Conceptual Introduction for an in-depth introduction.
Otherwise, see the Quickstart tutorial.
Installation
The best way to install plenoptic is via pip:

$ pip install plenoptic

or conda:

$ conda install plenoptic -c conda-forge
Warning
We do not currently support conda installs on Windows, due to the lack of a Windows pytorch package on conda-forge. See here for the status of that issue.
See the Installation page for more details, including how to set up an isolated virtual environment (recommended).
ffmpeg and videos
Some methods in this package generate videos. There are several backends available for saving the animations to file (see the matplotlib documentation). To convert them to HTML5 for viewing (for example, in a jupyter notebook), you'll need ffmpeg installed. Depending on your system, this might already be installed; if not, the easiest way is probably through conda: conda install -c conda-forge ffmpeg.

To change the backend, run matplotlib.rcParams['animation.writer'] = writer before calling any of the animate functions. If you set that rcParam to an unrecognized string, matplotlib will list the available choices.
Contents
Synthesis methods
Metamers: given a model and a reference image, stochastically generate a new image whose model representation is identical to that of the reference image (a “metamer”, as originally defined in the literature on Trichromacy). This method makes explicit those features that the model retains/discards.
Example papers: [Portilla2000], [Freeman2011], [Deza2019], [Feather2019], [Wallis2019], [Ziemba2021]
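To make the idea concrete, here is a toy sketch of metamer synthesis (this is not plenoptic's API; the model and all function names are invented for illustration). The "model" reduces an image to its global mean, so any image with the same mean is a metamer for it; gradient descent from random noise matches the model output while leaving the pixels very different from the reference:

```python
import random

# Toy "model": a drastic summary of the image -- here, just the global
# mean luminance. Any image with the same mean is a metamer for this model.
def model(image):
    return sum(image) / len(image)

def synthesize_metamer(reference, steps=500, lr=0.1, seed=0):
    """Gradient descent on the squared difference of model outputs."""
    rng = random.Random(seed)
    target = model(reference)
    image = [rng.random() for _ in reference]  # start from noise
    n = len(image)
    for _ in range(steps):
        err = model(image) - target
        # d/dx_i of (mean(x) - target)^2 is 2 * err / n, same for every pixel
        image = [x - lr * 2 * err / n for x in image]
    return image

reference = [0.1, 0.9, 0.5, 0.5]
metamer = synthesize_metamer(reference)
```

The real method works the same way in spirit: descend the difference between model representations, with the richness of the model determining how constrained the metamer is.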
Eigendistortions: given a model and a reference image, compute the image perturbations that produce the smallest/largest change in the model response space. These are the image changes to which the model is least/most sensitive, respectively.
Example papers: [Berardino2017]
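A minimal sketch of the underlying computation (again not plenoptic's implementation, and the matrix here is invented for illustration): for a linear model f(x) = Mx, the Jacobian is simply M, and the eigendistortions are eigenvectors of MᵀM, recoverable by power iteration. For a nonlinear model, the Jacobian would instead be evaluated at the reference image:

```python
# Toy linear "model" f(x) = M x. The first pixel is amplified, the third
# nearly discarded, so the model is most/least sensitive to those pixels.
M = [[2.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 0.1]]

def matmul(A, B):
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def normalize(v):
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]

def power_iteration(A, steps=200):
    """Leading eigenvector of A: the direction of maximal model sensitivity."""
    v = normalize([1.0] * len(A))
    for _ in range(steps):
        v = normalize(matvec(A, v))
    return v

MtM = matmul(list(map(list, zip(*M))), M)  # M^T M
max_distortion = power_iteration(MtM)
# max_distortion points along the first pixel, the model's most sensitive axis.
```

The smallest eigendistortion (the model's blind spot) can be found analogously, e.g. by power iteration on a shifted matrix.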
Maximal differentiation (MAD) competition: given a reference image and two models that measure distance between images, generate pairs of images that optimally differentiate the models. Specifically, synthesize a pair of images that are equi-distant from the reference image according to model-1, but maximally/minimally distant according to model-2. Synthesize a second pair with the roles of the two models reversed. This method allows for efficient comparison of two metrics, highlighting the aspects in which their sensitivities most differ.
Example papers: [Wang2008]
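The core loop can be sketched in a few lines (not plenoptic's implementation; the two toy metrics and all names are invented for illustration): ascend or descend metric 2 by gradient steps, and after each step project the image back onto the sphere of fixed metric-1 distance around the reference:

```python
# Two toy image-distance "metrics": plain RMS error, and a weighted error
# that cares ten times more about the first pixel. MAD holds the distance
# under metric 1 fixed while pushing metric 2 up (or down).
weights = [10.0, 1.0, 1.0]

def metric1(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def metric2(x, y):
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)) ** 0.5

def mad_synthesize(reference, d=1.0, steps=300, lr=0.01, sign=+1):
    """Ascend (sign=+1) or descend (sign=-1) metric2, projecting each step
    back onto the sphere of metric1-distance d around the reference."""
    image = [r + d / len(reference) ** 0.5 for r in reference]  # on the sphere
    for _ in range(steps):
        # gradient of metric2(image, reference)^2
        grad = [2 * w * (a - r) for w, a, r in zip(weights, image, reference)]
        image = [a + sign * lr * g for a, g in zip(image, grad)]
        scale = d / metric1(image, reference)  # project back onto the sphere
        image = [r + scale * (a - r) for r, a in zip(reference, image)]
    return image

reference = [0.0, 0.0, 0.0]
best = mad_synthesize(reference, sign=+1)   # maximally different per metric 2
worst = mad_synthesize(reference, sign=-1)  # minimally different per metric 2
```

Both synthesized images sit at the same metric-1 distance from the reference, yet metric 2 judges one far worse than the other: exactly the disagreement MAD is designed to expose.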
Models, Metrics, and Model Components
Steerable pyramid, [Simoncelli1992] and [Simoncelli1995], a multi-scale oriented image decomposition. Images are decomposed with a family of oriented filters, localized in space and frequency, similar to the “Gabor functions” commonly used to model receptive fields in primary visual cortex. The critical difference is that the pyramid organizes these filters so as to efficiently cover the 4D space of (x,y) positions, orientations, and scales, enabling efficient interpolation and interpretation (further info). See the pyrtools documentation for more details on python tools for image pyramids in general and the steerable pyramid in particular.
Portilla-Simoncelli texture model, [Portilla2000], which computes a set of image statistics that capture the appearance of visual textures (further info).
Structural Similarity Index (SSIM), [Wang2004], is a perceptual similarity metric that takes two images and returns a value between -1 (totally different) and 1 (identical) reflecting their similarity (further info).
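The structure of the metric is easy to see in a simplified single-window form (a sketch, not plenoptic's implementation, which averages local Gaussian-weighted windows; constants follow the common choice c1 = (0.01·L)², c2 = (0.03·L)² for a dynamic range L = 1):

```python
from statistics import mean

def ssim(x, y, c1=1e-4, c2=9e-4):
    """Single-window SSIM over two equal-length pixel lists in [0, 1],
    comparing luminance (means), contrast (variances), and structure
    (covariance) in one combined ratio."""
    mx, my = mean(x), mean(y)
    vx = mean((a - mx) ** 2 for a in x)
    vy = mean((b - my) ** 2 for b in y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return (((2 * mx * my + c1) * (2 * cov + c2))
            / ((mx * mx + my * my + c1) * (vx + vy + c2)))

img = [0.2, 0.4, 0.6, 0.8]
# Identical images score exactly 1; reversing the structure (same mean and
# contrast, negated covariance) drives the score toward -1.
```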
Multiscale Structural Similarity Index (MS-SSIM), [Wang2003], is an extension of SSIM that operates jointly over multiple scales.
Normalized Laplacian distance, [Laparra2016] and [Laparra2017], is a perceptual distance metric based on transformations associated with the early visual system: local luminance subtraction and local contrast gain control, at six scales (further info).
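The two stages of that transformation can be caricatured in one dimension at a single scale (a loose sketch, not plenoptic's implementation: the real metric uses six scales and filters fit to perceptual data, whereas the blur kernel, sigma, and all names here are invented):

```python
# One-scale sketch: local luminance subtraction via a 3-tap blur, then
# divisive normalization of each coefficient by a local amplitude estimate.
def blur(x):
    padded = [x[0]] + list(x) + [x[-1]]  # replicate edges
    return [(padded[i] + 2 * padded[i + 1] + padded[i + 2]) / 4
            for i in range(len(x))]

def normalized_laplacian(x, sigma=0.1):
    lap = [a - b for a, b in zip(x, blur(x))]            # luminance subtraction
    gain = blur([abs(l) for l in lap])                   # local contrast estimate
    return [l / (sigma + g) for l, g in zip(lap, gain)]  # contrast gain control

def nlp_distance(x, y):
    """Root-mean-square difference in the normalized-Laplacian domain."""
    dx, dy = normalized_laplacian(x), normalized_laplacian(y)
    return (sum((a - b) ** 2 for a, b in zip(dx, dy)) / len(dx)) ** 0.5
```

The point of the normalization is that the distance discounts differences in regions of high local contrast, mirroring perceptual masking.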
Getting help
We communicate via several channels on Github:
To report a bug, open an issue.
To send suggestions for extensions or enhancements, please post in the ideas section of discussions first. We’ll discuss it there and, if we decide to pursue it, open an issue to track progress.
To ask usage questions, discuss broad issues, or show off what you’ve made with plenoptic, go to Discussions.
To contribute to the project, see the contributing guide.
In all cases, we request that you respect our code of conduct.
Citing us
If you use plenoptic in a published academic article or presentation, please cite us! See the Citation Guide for more details.
Portilla, J., & Simoncelli, E. P. (2000). A parametric texture model based on joint statistics of complex wavelet coefficients. International journal of computer vision, 40(1), 49–70. https://www.cns.nyu.edu/~lcv/texture/. https://www.cns.nyu.edu/pub/eero/portilla99-reprint.pdf
Freeman, J., & Simoncelli, E. P. (2011). Metamers of the ventral stream. Nature Neuroscience, 14(9), 1195–1201. https://www.cns.nyu.edu/pub/eero/freeman10-reprint.pdf
Deza, A., Jonnalagadda, A., & Eckstein, M. P. (2019). Towards metamerism via foveated style transfer. In International Conference on Learning Representations (ICLR).
Feather, J., Durango, A., Gonzalez, R., & McDermott, J. (2019). Metamers of neural networks reveal divergence from human perceptual systems. In NeurIPS (pp. 10078–10089).
Wallis, T. S., Funke, C. M., Ecker, A. S., Gatys, L. A., Wichmann, F. A., & Bethge, M. (2019). Image content is more important than bouma’s law for scene metamers. eLife. https://dx.doi.org/10.7554/elife.42512
Berardino, A., Laparra, V., Ballé, J., & Simoncelli, E. P. (2017). Eigen-distortions of hierarchical representations. In I. Guyon, U. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Adv. Neural Information Processing Systems (NIPS*17) (pp. 1–10). Curran Associates, Inc. https://www.cns.nyu.edu/~lcv/eigendistortions/ https://www.cns.nyu.edu/pub/lcv/berardino17c-final.pdf
Wang, Z., & Simoncelli, E. P. (2008). Maximum differentiation (MAD) competition: A methodology for comparing computational models of perceptual discriminability. Journal of Vision, 8(12), 1–13. https://ece.uwaterloo.ca/~z70wang/research/mad/ https://www.cns.nyu.edu/pub/lcv/wang08-preprint.pdf
Simoncelli, E. P., Freeman, W. T., Adelson, E. H., & Heeger, D. J. (1992). Shiftable Multi-Scale Transforms. IEEE Trans. Information Theory, 38(2), 587–607. https://dx.doi.org/10.1109/18.119725
Simoncelli, E. P., & Freeman, W. T. (1995). The steerable pyramid: A flexible architecture for multi-scale derivative computation. In Proc 2nd IEEE Int'l Conf on Image Proc (ICIP) (pp. 444–447). Washington, DC: IEEE Sig Proc Society. https://www.cns.nyu.edu/pub/eero/simoncelli95b.pdf
Wang, Z., Bovik, A., Sheikh, H., & Simoncelli, E. (2004). Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600–612. https://www.cns.nyu.edu/~lcv/ssim/. https://www.cns.nyu.edu/pub/lcv/wang03-reprint.pdf
Wang, Z., Simoncelli, E. P., & Bovik, A. C. (2003). Multiscale structural similarity for image quality assessment. In Proc 37th Asilomar Conf on Signals, Systems and Computers (Vol. 2, pp. 1398–1402). https://www.cns.nyu.edu/pub/eero/wang03b.pdf
Laparra, V., Berardino, A., Ballé, J., & Simoncelli, E. P. (2017). Perceptually optimized image rendering. Journal of the Optical Society of America A, 34(9), 1511. https://www.cns.nyu.edu/pub/lcv/laparra17a.pdf
Laparra, V., Ballé, J., Berardino, A., & Simoncelli, E. P. (2016). Perceptual image quality assessment using a normalized Laplacian pyramid. Electronic Imaging, 2016(16), 1–6. https://www.cns.nyu.edu/pub/lcv/laparra16a-reprint.pdf
Ziemba, C. M., & Simoncelli, E. P. (2021). Opposing effects of selectivity and invariance in peripheral vision. Nature Communications, 12, 4597. https://dx.doi.org/10.1038/s41467-021-24880-5
This package is supported by the Center for Computational Neuroscience, in the Flatiron Institute of the Simons Foundation.