IMPA VISGRAF 30 ANOS

Dual Photography

Introductory work on the application of Helmholtz Reciprocity in the photographic technique called Dual Photography

IMPA - Instituto de Matemática Pura e Aplicada

3D Graphics Systems: from New Media to A.I.

2020.1

Professor Luiz Velho

by Jonas Lopes

Dual Photography uses the concept of Helmholtz Reciprocity to interchange the lights and cameras in a scene. The concept of Dual Photography is based on efficiently capturing the light transport between a camera and a projector looking at a scene.

The method is image-based and does not require knowledge about scene geometry or surface properties, while capturing global effects such as mirrored reflections, caustics, diffuse inter-reflections and subsurface scattering.

The appearance of visible objects in an image depends on the radiance that reaches the observer from each point of the object. From the perspective of geometric optics, objects can emit, reflect, transmit or absorb radiance. Excluding emission (specific to light sources), the reflected, transmitted and/or absorbed radiance depends, among other factors, on the radiance incident on that object. To calculate the reflected radiance in a given direction, it is therefore necessary to relate it to the incident radiance.

Primal Configuration Radiance
Figure 1: Incoming radiance / Outgoing radiance

Bidirectional Reflectance Distribution Function (BRDF)

fr(ωi, ωr)

ωi: 2D Angle of Incoming Radiance
ωr: 2D Angle of Outgoing Radiance
n: Normal to Surface

The bidirectional reflectance distribution function (BRDF), fr(ωi, ωr), is a function of four real variables that defines how light is reflected at an opaque surface. The function takes an incoming light direction, ωi, and outgoing direction, ωr (taken in a coordinate system where the surface normal n lies along the z-axis), and returns the ratio of reflected radiance exiting along ωr to the irradiance incident on the surface from direction ωi.
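
As a minimal concrete sketch (my own illustration, not from the source text): the simplest BRDF is the Lambertian one, whose value is a constant ρ/π for every direction pair, and the reflected radiance is the BRDF value times the irradiance Li(n·ωi). All numbers below are made up for illustration:

```python
import numpy as np

def lambertian_brdf(albedo, w_i, w_r):
    """Lambertian BRDF: constant rho / pi, independent of the two directions."""
    return albedo / np.pi

def reflected_radiance(brdf_value, incident_radiance, w_i, normal):
    """L_r = fr * E, where the irradiance E = L_i * (n . w_i)."""
    cos_theta_i = max(np.dot(normal, w_i), 0.0)
    return brdf_value * incident_radiance * cos_theta_i

n = np.array([0.0, 0.0, 1.0])                 # surface normal along z
w_i = np.array([0.0, 0.0, 1.0])               # light arriving head-on
w_r = np.array([0.0, 1.0, 1.0]) / np.sqrt(2)  # arbitrary outgoing direction

fr = lambertian_brdf(0.8, w_i, w_r)
L_r = reflected_radiance(fr, 10.0, w_i, n)
print(fr, L_r)  # fr ≈ 0.2546, L_r ≈ 2.546
```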


The BRDF is therefore a function with four degrees of freedom: two angles for the incoming direction and two for the outgoing direction.

Primal Configuration Radiance
Figure 2: Four degrees of freedom

Helmholtz Reciprocity

Each ray of light can be reversed without altering its transport properties. Radiance transfer between incoming and outgoing directions is symmetric.

Dual Photography is based on Helmholtz Reciprocity.

Dual photography is a photographic technique that uses Helmholtz reciprocity to capture the light field of all light paths from a structured illumination source to a camera. Image processing software can then be used to reconstruct the scene as it would have been seen from the viewpoint of the projector.

Virtual Configuration Radiance
Figure 3: Incoming radiance / Outgoing radiance

Helmholtz Reciprocity to interchange the lights and cameras in a scene

fr(ωi, ωr) = fr(ωr, ωi)

ωi: 2D Angle of Outgoing Radiance
ωr: 2D Angle of Incoming Radiance
n: Normal to Surface

The value of the BRDF remains the same if the directions of incidence and reflection are exchanged:

Primal Configuration
Figure 4:
Luminous flux is identical forwards and backwards (Lichtstrom ist vorwärts wie rückwärts identisch)

fr(ωi, ωr) = fr(ωr, ωi)
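
To make the symmetry concrete, here is a small numeric check (my own illustration, not from the source text) using a Blinn-Phong-style BRDF, which is reciprocal by construction because it depends on the two directions only through the half vector h = (ωi + ωr)/|ωi + ωr|:

```python
import numpy as np

def blinn_phong_brdf(w_i, w_r, n, kd=0.5, ks=0.4, shininess=32):
    """Diffuse term plus a half-vector specular lobe; symmetric in (w_i, w_r)."""
    h = w_i + w_r
    h = h / np.linalg.norm(h)
    return kd / np.pi + ks * max(np.dot(n, h), 0.0) ** shininess

n = np.array([0.0, 0.0, 1.0])
w_i = np.array([0.3, 0.1, 0.9]); w_i /= np.linalg.norm(w_i)
w_r = np.array([-0.2, 0.4, 0.8]); w_r /= np.linalg.norm(w_r)

forward = blinn_phong_brdf(w_i, w_r, n)
backward = blinn_phong_brdf(w_r, w_i, n)
assert abs(forward - backward) < 1e-12  # fr(wi, wr) == fr(wr, wi)
```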

This is an important property: it permits both algorithms that propagate power forward from the light sources and algorithms that gather radiance backward from the observer's position.

This means that we can effectively exchange the positions of the camera and the projector. We can generate an image from the point of view of the projector.

The Principle of Dual Photography

Pradeep Sen et al. [2005], in their paper, explain the principle of dual photography with reference to the imaging configuration shown in Figures 5 and 6. We have a projector of resolution p × q shining light onto a scene and a camera of resolution m × n capturing the reflected light. Since light transport is linear, the transport from the projector through the scene and into the camera can be expressed as the following simple equation:

c' = Tp'

The column vector p' is the projected pattern (size pq × 1), and c' (size mn × 1) represents the image captured by the camera. Matrix T (size mn × pq) is the transport matrix that describes how light from each pixel of p' arrives at each pixel of c', taking into account reflections, refractions, and all other light transport paths. For intuition on the composition of T, see Figure 7.
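
The primal equation can be sketched with toy sizes in a few lines of numpy (the resolutions and the random T below are placeholders, not measured data):

```python
import numpy as np

# Toy primal configuration: a 2x2 "projector" (pq = 4) and a
# 3x2 "camera" (mn = 6). T maps projector patterns to camera images.
rng = np.random.default_rng(0)
pq, mn = 4, 6
T = rng.random((mn, pq)) * 0.2            # toy transport matrix
p_prime = np.array([1.0, 0.0, 0.0, 0.5])  # projected pattern, pq x 1
c_prime = T @ p_prime                     # captured image, mn x 1

# Linearity of light transport: the response to a sum of patterns
# is the sum of the individual responses.
p1, p2 = rng.random(pq), rng.random(pq)
assert np.allclose(T @ (p1 + p2), T @ p1 + T @ p2)
```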

Primal Configuration
Figure 5:
c’ : Captured Image (mn x 1)
p’ : Projected Pattern (pq x 1)
T : Transport Matrix (mn x pq)

Helmholtz reciprocity states that the light sources and cameras in a scene can be interchanged without changing either the path taken by the light or the transfer of energy from one to the other. As we show in Appendix A, this means that we can represent the dual of Equation 1 as follows:

p" = TTc"

In this equation the transport matrix T of the scene is the same as before except that we have now transposed it to represent light going from the camera to the projector. We shall refer to Equation 1 as the "primal" equation and Equation 2 as the "dual" equation. In the dual space, p" represents the virtual image that would be visible at the projector if the camera were "projecting" pattern c". We call the process of transposing the transport matrix and multiplying by the desired lighting dual photography. Since the two representations are equivalent, the T matrix can be acquired in either space and then transposed to represent transport in the other space. This is a relatively large matrix, so we develop algorithms to accelerate its acquisition in Section 3. Also note that the two equations are not mathematical inverses of each other (i.e. TTT ≠ I). This is because energy is lost in any real system through absorption or scattering.
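
A companion sketch for the dual equation, again with a toy T (my illustration); it also checks the remark above that the two equations are not inverses of each other:

```python
import numpy as np

# Toy dual configuration: transposing T turns camera-side illumination
# into a virtual image at the projector.
rng = np.random.default_rng(1)
pq, mn = 4, 6
T = rng.random((mn, pq)) * 0.2
c_dual = rng.random(mn)       # virtual pattern "projected" by the camera
p_dual = T.T @ c_dual         # virtual image seen at the projector
assert p_dual.shape == (pq,)

# T^T T is not the identity: energy is lost to absorption/scattering,
# so the dual equation is a transpose, not an inverse.
assert not np.allclose(T.T @ T, np.eye(pq))
```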

Dual Configuration
Figure 6:
c”: Virtual Projected Pattern (mn x 1)
p”: Virtual Captured Image (pq x 1)
TT: Transposed Transport Matrix (pq x mn)

The columns of the T matrix represent the pictures that would be taken at c' when the appropriate pixel at p' is lit. Thus, we can think of T as a concatenation of camera images c'1 through c'pq in column vector form. In a similar way, the columns of TT in the dual space represent images at the projector p" when a single pixel at c" is illuminated.

Primal and Dual Configuration
Figure 7: Makeup of the T matrix

Each element of T describes the transmission in the optical path between a pixel in the projector and a pixel in the camera. Each column of T is the image seen by the camera when only one pixel of the projector is turned on.

Dual Photography is the act of multiplying the transposed matrix by a desired lighting image vector.

Measuring T

To generate the dual image, we must first find the values of t = [t1,...,tpq]. A simple way to do this is to perform a pixel scan with the projector by displaying p × q different patterns, each with only one element lit at a time. We shall refer to this technique as the "brute-force" pixel scan. When projected into the scene, each of these basis patterns extracts a single component of the t vector, which is measured as the value c'. By putting these measurements back together in the correct order, the t vector can be constructed and used to synthesize an image from the point of view of the projector.
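
The brute-force scan can be simulated in a few lines (the `capture` function below stands in for the physical projector-camera loop and is purely illustrative):

```python
import numpy as np

# Simulated scene: an unknown transport matrix we want to recover.
rng = np.random.default_rng(2)
pq, mn = 4, 6
T_true = rng.random((mn, pq))

def capture(pattern):
    """Stand-in for the physical camera: returns the image T_true @ pattern."""
    return T_true @ pattern

# One basis pattern per projector pixel; each exposure is one column of T.
columns = []
for j in range(pq):
    basis = np.zeros(pq)
    basis[j] = 1.0
    columns.append(capture(basis))
T_measured = np.column_stack(columns)
assert np.allclose(T_measured, T_true)  # pq exposures recover T exactly
```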

Brute Force Method

The trivial way to construct the T matrix is to turn on one pixel of the projector at a time. Each picture you take is a column of T. This is called the “brute-force” pixel scan. Unfortunately...

  • Need to take as many pictures as there are projector pixels
  • The image when only one pixel is lit can be quite dim
  • Projector and camera each have on the order of 10⁶ pixels
  • The full T matrix would have on the order of 10¹² elements
  • HDR imagery required for scenes containing both specular and diffuse interreflections.

Even at a rate of 25 HDR images per minute, the capture process could take weeks!
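
The "weeks" figure is easy to verify with back-of-the-envelope arithmetic; the XGA projector resolution below is my assumption, not a number from the source text:

```python
# One exposure per projector pixel, at 25 HDR exposures per minute.
projector_pixels = 1024 * 768   # assumed XGA resolution
rate_per_minute = 25
minutes = projector_pixels / rate_per_minute
days = minutes / (60 * 24)
print(round(days, 1))           # ≈ 21.8 days, i.e. about three weeks
```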

Hierarchical Assembly of the Transport Matrix Function
Figure 8: (a) Conventional photograph of a scene, illuminated by a projector with all its pixels turned on. (b) After measuring the light transport between the projector and the camera using structured illumination, our technique is able to synthesize a photorealistic image from the point of view of the projector. This image has the resolution of the projector and is illuminated by a light source at the position of the camera. The technique can capture subtle illumination effects such as caustics and self-shadowing.

The idea is to use structured light: project known patterns with many pixels turned on at once (multiplexing), designed so that the contributions from each pixel can easily be separated afterwards (demultiplexing).

Two approaches are initially proposed to improve performance:

  • Fixed pattern scanning. Project a set of patterns known in advance.
  • Adaptive multiplexed illumination. Project increasingly finer patterns to resolve detail as needed.

Hierarchical Assembly of the Transport Matrix Function
Figure 9: Measuring T by Brute Force

Measuring T by Brute Force

Size: 5.4 TB

Days: 10.9

We compare it against calculated values for a brute-force pixel scan acquisition, assuming a capture rate of approximately 25 patterns/minute. The data is stored as three 32-bit floats for each matrix element. We can see that the hierarchical technique is several orders of magnitude more efficient in both time and storage space, although further compression is still possible.

Fixed Pattern Scanning

Assume each projector pixel affects only a small, localized region of the camera image. Divide the projector into blocks, repeat exposures, and give each block's illuminated pixels a unique binary encoding.
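
A toy simulation of the binary-encoding idea, under the one-to-one assumption stated above (the permutation standing in for the scene is invented for illustration):

```python
import numpy as np

# With ceil(log2(pq)) on/off patterns, each camera pixel reads back the
# binary code of the single projector pixel that illuminates it.
pq = 8                                  # toy projector with 8 pixels
n_bits = int(np.ceil(np.log2(pq)))      # 3 patterns instead of 8 exposures

# Simulated direct-only scene: camera pixel k sees projector pixel perm[k].
perm = np.array([3, 0, 7, 5, 1, 6, 2, 4])

decoded = np.zeros(pq, dtype=int)
for bit in range(n_bits):
    pattern = (np.arange(pq) >> bit) & 1  # pixel j lit iff bit `bit` of j is 1
    observed = pattern[perm]              # what each camera pixel measures
    decoded |= observed << bit            # accumulate the binary code
assert np.array_equal(decoded, perm)      # every correspondence recovered
```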

Limitations of Fixed Pattern Scanning:

  • Requires one-to-one correspondence between camera and projector pixels: This only supports direct illumination properly.
  • Diffuse Illumination can map many projector pixels to the same camera pixel: This violates the initial assumption.

Adaptive Multiplexed Illumination

The adaptive pattern recursively refines blocks into 4 quadrants. When a block is subdivided, the quadrants are illuminated in sequence to detect conflicts, which happen when the same region in the camera image is illuminated by two sub-blocks. If two sub-blocks have a conflict, they will have to be investigated one after another; otherwise, they can be subdivided in parallel.
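
A heavily simplified 1-D sketch of the adaptive idea (my own toy version: it only prunes blocks that contribute nothing, and omits the conflict bookkeeping and parallel subdivision described above):

```python
import numpy as np

# Sparse toy scene: only two projector pixels reach the camera at all.
pq, mn = 64, 64
T_true = np.zeros((mn, pq))
T_true[2, 5] = 0.9
T_true[9, 40] = 0.4

exposures = 0

def capture(pattern):
    """Stand-in for the physical camera; counts exposures taken."""
    global exposures
    exposures += 1
    return T_true @ pattern

def scan(lo, hi, found):
    """Illuminate the block [lo, hi); recurse only where light is seen."""
    pattern = np.zeros(pq)
    pattern[lo:hi] = 1.0
    if not capture(pattern).any():  # block contributes nothing: prune it
        return
    if hi - lo == 1:
        found.append(lo)
        return
    mid = (lo + hi) // 2
    scan(lo, mid, found)
    scan(mid, hi, found)

found = []
scan(0, pq, found)
assert found == [5, 40]
assert exposures < pq  # fewer exposures than the pq-image brute-force scan
```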

Drawbacks of Adaptive Multiplexed Illumination:

  • In scenes where diffuse inter-reflections or subsurface scattering dominate the appearance, each projector pixel can spread across large areas of the camera image, causing the adaptive method to degrade to brute force.
  • Energy can be lost if the contribution from one projector pixel to one camera pixel falls below the noise threshold. Many entries have low energies, and the sum of their contributions is significant.

Measuring T efficiently

Hierarchical Assembly of the Transport Matrix

Instead of capturing pixel level matrix T, capture a sequence of matrices Tk at different scales, then upscale and combine them.

Hierarchical Assembly of the Transport Matrix Function

The primal and dual image show diffuse to diffuse inter-reflections which could only be captured by use of the hierarchical acquisition. Energy that might have been lost when further subdividing a block is deposited at a coarse level of the T matrix. To synthesize the dual image, the levels are individually reconstructed by applying the appropriate basis functions, then added together to obtain the image on the left. In this figure the intensity of the images for level 1 to 9 has been increased to visualize their contribution.

Construction of the dual image with a hierarchical representation
Figure 10: Construction of the dual image with a hierarchical representation

It was experimentally demonstrated that this acquisition runs in O(log pq) time.

Construction of the dual image with a hierarchical representation result
Figure 11: Construction of the dual image with a hierarchical representation result

The technique presented allows us to efficiently capture the transport matrix T of a scene and measure many global illumination effects using only a moderate number of patterns and images. Figure 8 shows two more scenes that were acquired using this hierarchical technique. To show that the algorithm accelerates the acquisition and results in a manageable size of the T matrix, we list the relevant data for various scenes in the table below. We compare it against calculated values for a brute-force pixel scan acquisition, assuming a capture rate of approximately 25 patterns/minute. The data is stored as three 32-bit floats for each matrix element. We can see that the technique is several orders of magnitude more efficient in both time and storage space, although further compression is still possible.

To characterize the effect of projector resolution on our hierarchical adaptive algorithm, we plot the number of acquired frames against projector resolution in Figure 9 for the box scene (Fig. 14) and cover scene (Fig. 1). As we increase the resolution exponentially the curves approximate a straight line. This shows that the adaptive multiplexed illumination approach operates in O(log pq) time where pq is the projector resolution.

Construction of the dual image with a hierarchical representation result
Figure 12: Dual image with hierarchical representation result

Measuring T by Hierarchical Assembly of the Transport Matrix

Size: 272 MB

Min: 136


Measuring T more efficiently

Compressive Sensing

Compressive sensing (CS) is a new idea that rethinks data acquisition: a new paradigm for the acquisition and compression of signals that has been attracting the interest of the signal-compression community.

The theory of compressed sensing (CS) demonstrates how a subsampled signal can be faithfully reconstructed through non-linear optimization techniques [CRT06, Don06]. Suppose that we represent our continuous signal (the scene, reflectance function, light field, etc.) as an n-dimensional discrete signal x ∈ R^n, where n is large. In theory, x can represent any 1-D signal, but for this discussion we assume it to be an n-element scalar reflectance function which has been converted into an n × 1 vector (with trivial extension to vector-valued signals, e.g. RGB transport). We want to estimate this signal by measuring a small number of linear samples x̃ of size k, where k ≪ n. We can write x̃ = Sx, where S is a sampling matrix that performs the linear measurements on x. In our application to light-transport acquisition, for example, the goal is to estimate the unknown reflectance function x from its k samples.

Sen, Pradeep, and Soheil Darabi. “Compressive dual photography.” Computer Graphics Forum. Vol. 28. No. 2. Blackwell Publishing Ltd, 2009.

Compressive sensing offers a solid mathematical framework to infer a sparse signal from a limited number of nonadaptive measurements, exploiting sparsity to recover images from few random samples. Pieter Peers et al., in their article, propose a new framework to capture light-transport data from a real scene, based on compressive sensing theory, describing a method that greatly accelerates the dual photography process. Compared to the adaptive methods, compressive sensing does not require time-consuming computation between captures, since it uses a fixed set of patterns.

Consider a k-sparse discrete signal x ∈ R^n that contains at most k ≤ n nonzero elements (i.e., ‖x‖_0 ≤ k). Now define a measurement of this signal as the dot product of the signal x and a measurement vector φ_j ∈ R^n: y_j = φ_j^T x. We can write m multiple measurements conveniently as a matrix-vector multiplication

y = φ^T x.

The m × n matrix φ^T is called a measurement ensemble in the context of compressive sensing. Conventionally (e.g., brute-force sampling), when the measurement ensemble has rank n, the signal x can be faithfully reconstructed from the measurements without any a priori knowledge of the signal. This implies that the number of measurements m ≥ n. However, in compressive sensing the a priori knowledge that the signal is k-sparse is used to capture the signal in just O(k) nonadaptive measurements [Donoho 2006; Candès 2006].
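
The claim can be illustrated with a toy recovery experiment. The sketch below uses orthogonal matching pursuit, a standard greedy CS solver (my choice for illustration; the cited papers use other reconstruction algorithms), with sizes invented for the demo:

```python
import numpy as np

# A k-sparse signal in R^n, measured with m << n random projections.
rng = np.random.default_rng(4)
n, k, m = 64, 3, 32
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.random(k) + 0.5  # k-sparse
Phi_T = rng.standard_normal((m, n)) / np.sqrt(m)  # measurement ensemble
y = Phi_T @ x                                     # m nonadaptive measurements

# Orthogonal matching pursuit: greedily pick the column most correlated
# with the residual, then re-fit the coefficients on the chosen support.
support, residual = [], y.copy()
for _ in range(k):
    j = int(np.argmax(np.abs(Phi_T.T @ residual)))
    support.append(j)
    coeffs, *_ = np.linalg.lstsq(Phi_T[:, support], y, rcond=None)
    residual = y - Phi_T[:, support] @ coeffs

x_hat = np.zeros(n)
x_hat[support] = coeffs
assert np.allclose(x_hat, x, atol=1e-6)  # exact support and values recovered
```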

Peers, Pieter, et al. “Compressive light transport sensing.” ACM Transactions on Graphics (TOG) 28.1 (2009): 3.

Scene Relighting

The transport matrix T between the projector and camera must be acquired at the resolution of the two devices in order to perform dual photography. This means that we also have the information needed to relight the primal and dual images by multiplying T and TT by the desired illumination vectors p' and c" respectively. Knowing T, we can exchange the camera and projector: the image p" "seen" by the projector, given a 2D illumination pattern c" from the camera position, would be p" = TTc".

Hierarchical Assembly of the Transport Matrix Function
Figure 13: Scene Relighting

In previous work in relighting (e.g. Masselus et al. [2003]), scenes were relit with incident 4D light fields by acquiring the 6D reflectance function of the scene. They did this by keeping the camera static with respect to the scene and repositioning the projector while doing measurements. This is equivalent to using a single camera and an array of projectors. Dual photography allows us to acquire this 6D reflectance field in the dual domain with a single projector and an array of cameras, which has two advantages. First, because cameras are passive devices, we can take measurements from each of them in parallel without interference. This can significantly accelerate the acquisition of the reflectance field. Second, there are physical and economic advantages of using a camera array versus a projector array. Projectors are generally heavier, larger, and more costly than cameras. They can also be more difficult to pack densely, align, and calibrate.

Hierarchical Assembly of the Transport Matrix Function
Figure 14: Scenes were relit with incident 4D light fields

Dual Photography with indirect light transport

How to read your opponent’s card?

Hierarchical Assembly of the Transport Matrix Function
Figure 15: How to read your opponent’s card?

A projector illuminates the front of a playing card while the camera sees only the back of the card and the diffuse page of the book. An aperture in front of the projector limits the illumination only onto the card. The card was adjusted so that its specular lobe from the projector did not land on the book. Thus, the only light that reached the camera underwent a diffuse bounce at the card and another at the book.

Indirect light transport
Figure 16: Configuration

It shows the deck from the perspective of the projector, indirectly illuminated by the camera. In other words, we recover the image that the projector is seeing!

Indirect light transport result
Figure 17: The resulting image

The resulting image has been automatically antialiased over the area of each projector pixel. The figure also shows sample images acquired when the projector scanned the indicated points on the card; the dark level has been subtracted and the images gamma-corrected to amplify the contrast. We see that the diffuse reflection changes depending on the color of the card at the point of illumination. After acquiring the T matrix in this manner, we can reconstruct the floodlit dual image.

Indirect light transport contrast
Figure 18: After acquiring the T matrix in this manner, we can reconstruct the floodlit dual image

References

Sen, Pradeep, et al. “Dual Photography.” Proceedings of SIGGRAPH 2005.

Schulz, Adriana, Luiz Velho, and Eduardo A. B. da Silva. “On the Empirical Rate-Distortion Performance of Compressive Sensing.” IEEE International Conference on Image Processing (ICIP), November 2009. Available at: http://w3.impa.br/~aschulz/CS/paper.html

Schulz, Adriana, Luiz Velho, and Eduardo A. B. da Silva. “Compressive Sensing.” Publicações Matemáticas, 27º Colóquio Brasileiro de Matemática, 2009. Available at: https://impa.br/wp-content/uploads/2017/04/PM_31.pdf

Schulz, Adriana, Luiz Velho, and Eduardo A. B. da Silva. “Compressive Sensing: New Paradigms for Image Acquisition and Compression.” Course at the 27º Colóquio Brasileiro de Matemática, July 2009. Available at: http://w3.impa.br/~aschulz/CS/course.html

Sen, Pradeep, and Soheil Darabi. “Compressive dual photography.” Computer Graphics Forum. Vol. 28. No. 2. Blackwell Publishing Ltd, 2009.

Peers, Pieter, et al. “Compressive light transport sensing.” ACM Transactions on Graphics (TOG) 28.1 (2009): 3.