Coded Two-Bucket Cameras for Computer Vision
Mian Wei, Navid Sarhangnejad, Zhengfan Xia, Nikita Gusev, Nikola Katic, Roman Genov, and Kiriakos N. Kutulakos. ECCV 2018. (Oral)
We introduce coded two-bucket (C2B) imaging, a new operating principle for computational sensors with applications in active 3D shape estimation and coded-exposure imaging. A C2B sensor modulates the light arriving at each pixel by controlling which of the pixel’s two "buckets" should integrate it. C2B sensors output two images per video frame—one per bucket—and allow rapid, fully-programmable, per-pixel control of the active bucket. Using these properties as a starting point, we (1) develop an image formation model for these sensors, (2) couple them with programmable light sources to acquire illumination mosaics, i.e., images of a scene under many different illumination conditions whose pixels have been multiplexed and acquired in one shot, and (3) show how to process illumination mosaics to acquire live disparity or normal maps of dynamic scenes at the sensor’s native resolution. We present the first experimental demonstration of these capabilities, using a fully-functional C2B camera prototype. Key to this unique prototype is a novel programmable CMOS sensor that we designed from the ground up, fabricated and turned into a working system.
Programmable per-pixel masking
The C2B camera enables light-efficient coded-exposure imaging with per-pixel masking fidelity on a compact camera platform. Each pixel has a one-bit memory that allows coded exposure to be performed electronically on chip, circumventing the need for precisely aligned relay optics and physical masks.
Each pixel has two buckets for accumulating light, so light is never lost: one of the two buckets is always actively integrating it. This opens up a range of applications that go well beyond what is possible with conventional coded-exposure sensors.
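As a toy sketch of this operating principle (not the prototype's actual circuit model), the snippet below simulates a frame split into subframes, with a per-pixel, per-subframe binary code steering incident light into one of two buckets. The array names and sizes are illustrative assumptions; the point is that the two buckets together always account for all the light.

```python
import numpy as np

rng = np.random.default_rng(0)

H, W, F = 4, 4, 6  # illustrative sensor size and subframes per video frame
light = rng.uniform(0.0, 1.0, size=(F, H, W))  # incident light per subframe

# Per-pixel, per-subframe binary code: 1 -> bucket 1 integrates, 0 -> bucket 0
code = rng.integers(0, 2, size=(F, H, W))

bucket1 = (code * light).sum(axis=0)        # bucket-1 image for this frame
bucket0 = ((1 - code) * light).sum(axis=0)  # bucket-0 image for this frame

# No light is lost: the two buckets partition all incident light
assert np.allclose(bucket0 + bucket1, light.sum(axis=0))
```

In a conventional coded-exposure camera, the `(1 - code) * light` term would simply be discarded, which is where the light-efficiency advantage comes from.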
Capturing multiple images in a single shot
We can program C2B cameras to capture a two-bucket illumination mosaic, a pair of images (one per bucket) that can be processed to obtain multiple full-resolution images for each video frame. This enables computer vision algorithms that rely on multiple images of a static scene to be applied to dynamic scenes.
Using the illumination mosaic, we extend two methods traditionally limited to static scenes to dynamic scenes: surface normal estimation (via photometric stereo) and disparity estimation (via structured-light triangulation).
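The spatial-multiplexing idea above can be illustrated with a small simulation. In this hypothetical sketch (the tile layout and nearest-neighbour demosaicing are our own simplifying assumptions, not the paper's reconstruction method), each pixel in a 2x2 tile observes the scene under a different illumination condition; the one-shot mosaic is then unpacked into one full-resolution image per condition:

```python
import numpy as np

rng = np.random.default_rng(1)

H, W, S = 8, 8, 4  # sensor size; S illumination conditions in a 2x2 tile
scene = rng.uniform(0.0, 1.0, size=(S, H, W))  # scene under each condition

# Tile pattern: each pixel in a 2x2 neighbourhood sees a different condition
pattern = np.array([[0, 1], [2, 3]])
cond = np.tile(pattern, (H // 2, W // 2))  # condition index per pixel

# One-shot capture: each pixel records only its assigned condition
mosaic = np.take_along_axis(scene, cond[None], axis=0)[0]

# Demosaic: nearest-neighbour upsampling of each condition's sparse samples
full = np.empty_like(scene)
for s in range(S):
    r0, c0 = np.argwhere(pattern == s)[0]
    sub = mosaic[r0::2, c0::2]
    full[s] = np.repeat(np.repeat(sub, 2, axis=0), 2, axis=1)
```

The recovered stack `full` holds S images of the same frame, which is exactly the input that multi-image methods like photometric stereo expect.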
Optimal two-bucket multiplexing
We develop a two-bucket multiplexing theory underpinning how the C2B camera should be programmed. The theory relates a sequence of desired images to a sequence of bucket images captured with the C2B camera through a binary matrix, called a multiplexing matrix.
Using our theory, we derive an objective function that can be minimized to search for multiplexing matrices that maximize the SNR of the desired images. Furthermore, under some conditions, we show that provably optimal matrices can be computed in closed form.
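A minimal per-pixel sketch of the multiplexing model, under simplifying assumptions of our own (noiseless measurements and a hand-picked 3x3 binary matrix, not one of the paper's optimal matrices): each frame's bucket-1 measurement mixes the desired images through a binary multiplexing matrix, bucket 0 receives the complement, and stacking both buckets gives an overdetermined system that least squares inverts.

```python
import numpy as np

rng = np.random.default_rng(2)

S = 3  # number of desired illumination images
x = rng.uniform(1.0, 2.0, size=S)  # desired intensities at one pixel

# Binary multiplexing matrix: row f tells which sources feed bucket 1 in frame f
W = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=float)

b1 = W @ x        # bucket-1 measurements over the F = 3 frames
b0 = (1 - W) @ x  # bucket 0 receives the complementary light

# Demultiplex: stack both buckets and solve the overdetermined linear system
A = np.vstack([W, 1 - W])
b = np.concatenate([b1, b0])
x_hat = np.linalg.lstsq(A, b, rcond=None)[0]

assert np.allclose(x_hat, x)  # noiseless case: exact recovery
```

With noise added to `b`, the recovery error depends on the conditioning of the stacked matrix `A`, which is what an SNR-maximizing choice of multiplexing matrix controls.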
Capturing image ratios
The C2B camera offers another imaging modality: instead of outputting image intensities, the two bucket images can be used to compute image ratios, which are invariant to scene albedo. We derive an approximate noise model for the image ratios as well as an extension of the two-bucket multiplexing theory to this modality.
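The albedo invariance can be checked with a short simulation. In this hypothetical sketch (the multiplicative `albedo * shading` image model is an illustrative assumption), each bucket image is a per-pixel albedo times an illumination-dependent term, and the ratio cancels the albedo exactly:

```python
import numpy as np

rng = np.random.default_rng(3)

shading = rng.uniform(0.1, 0.9, size=(4, 4))  # illumination-dependent term
albedo = rng.uniform(0.2, 1.0, size=(4, 4))   # per-pixel reflectance

b1 = albedo * shading          # bucket-1 image
b0 = albedo * (1.0 - shading)  # bucket-0 image (complementary exposure)

# Ratio image: the albedo factor cancels pixel by pixel
ratio = b1 / (b1 + b0)

assert np.allclose(ratio, shading)  # independent of albedo
```

In practice both buckets carry noise, which is why an approximate noise model for the ratio is needed before multiplexing can be optimized in this modality.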