
Getting the Most from the New Multi-Camera API

This blog post complements our Android Developer Summit 2018 talk, done in collaboration with Vinit Modi, the Android Camera PM, and Emilie Roberts, from the Partner Developer Relations team. Check out our previous blog posts in the series, including camera enumeration, camera capture sessions and requests, and using multiple camera streams simultaneously.

Multi-camera was introduced with Android Pie, and since launch a few months ago we are now seeing devices come to market that support the API, like the Google Pixel 3 and Huawei Mate 20 series. Many multi-camera use-cases are tightly coupled with a specific hardware configuration; in other words, not all use-cases will be compatible with every device, which makes multi-camera features a great candidate for dynamic delivery of modules. Some typical use-cases include:

  • Zoom: switching between cameras depending on crop region or desired focal length
  • Depth: using multiple cameras to build a depth map
  • Bokeh: using inferred depth data to simulate a DSLR-like narrow focus range

To understand the multi-camera API, we must first understand the difference between logical and physical cameras; the concept is best illustrated with an example. For instance, consider a device with three back-facing cameras and no front-facing cameras. In this example, each of the three back cameras is considered a physical camera. A logical camera is then a grouping of two or more of those physical cameras. The output of the logical camera can be a stream that comes from one of the underlying physical cameras, or a fused stream coming from multiple underlying physical cameras simultaneously; either way, it is handled by the camera HAL.

Many phone manufacturers also develop their first-party camera applications (which usually come pre-installed on their devices). To use all of the hardware's capabilities, they sometimes made use of private or hidden APIs or received special treatment from the driver implementation that other applications did not have privileged access to. Some devices even implemented the concept of logical cameras by providing a fused stream of frames from the different physical cameras, but, again, this was only available to certain privileged applications. Often, only one of the physical cameras would be exposed to the framework. The situation for third-party developers prior to Android Pie is illustrated in the following diagram:

Camera capabilities typically only available to privileged applications

Beginning in Android Pie, a few things have changed. For starters, private APIs are no longer OK to use in Android apps. Secondly, with the inclusion of multi-camera support in the framework, Android has been strongly recommending that phone manufacturers expose a logical camera for all physical cameras facing the same direction. As a result, this is what third-party developers should expect to see on devices running Android Pie and above:

Full developer access to all camera devices starting in Android P

It is worth noting that what the logical camera provides is entirely dependent on the OEM implementation of the camera HAL. For example, a device like Pixel 3 implements its logical camera in such a way that it will choose one of its physical cameras based on the requested focal length and crop region.

The new API consists of the addition of the following new constants, classes and methods (a short sketch of how to use some of them follows the list):

  • CameraMetadata.REQUEST_AVAILABLE_CAPABILITIES_LOGICAL_MULTI_CAMERA
  • CameraCharacteristics.getPhysicalCameraIds()
  • CameraCharacteristics.getAvailablePhysicalCameraRequestKeys()
  • CameraDevice.createCaptureSession(SessionConfiguration config)
  • CameraCharacteristics.LOGICAL_MULTI_CAMERA_SENSOR_SYNC_TYPE
  • OutputConfiguration & SessionConfiguration
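
As a first taste of the API surface, here is a minimal sketch (the helper name `isLogicalMultiCamera` is our own, not part of the framework) that checks whether a camera advertises the logical multi-camera capability; the physical cameras behind it can then be listed via CameraCharacteristics.getPhysicalCameraIds():

```kotlin
import android.hardware.camera2.CameraCharacteristics
import android.hardware.camera2.CameraManager
import android.hardware.camera2.CameraMetadata

// Hypothetical helper: true if the given camera ID is a logical multi-camera
fun isLogicalMultiCamera(manager: CameraManager, cameraId: String): Boolean {
    val characteristics = manager.getCameraCharacteristics(cameraId)
    val capabilities = characteristics.get(
        CameraCharacteristics.REQUEST_AVAILABLE_CAPABILITIES) ?: intArrayOf()
    // If this is true, characteristics.physicalCameraIds enumerates the
    // physical cameras grouped under this logical camera
    return capabilities.contains(
        CameraMetadata.REQUEST_AVAILABLE_CAPABILITIES_LOGICAL_MULTI_CAMERA)
}
```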

Thanks to changes to the Android CDD, the multi-camera API also comes with certain expectations from developers. Devices with dual cameras existed prior to Android Pie, but opening more than one camera simultaneously involved trial and error; multi-camera on Android now gives us a set of rules that tell us when we can open a pair of physical cameras, as long as they are part of the same logical camera.

As stated above, we can expect that, in most cases, new devices launching with Android Pie will expose all physical cameras (the exception being more exotic sensor types such as infrared) along with an easier-to-use logical camera. Also, and very crucially, we can expect that for every combination of streams that is guaranteed to work, one stream belonging to a logical camera can be replaced by two streams from the underlying physical cameras. Let's cover that in more detail with an example.

In our last blog post, we covered extensively the rules for using multiple streams simultaneously in a single camera. The exact same rules apply for multiple cameras, with a notable addition explained in the documentation:

For each guaranteed stream combination, the logical camera supports replacing one logical YUV_420_888 or raw stream with two physical streams of the same size and format, each from a separate physical camera, given that the size and format are supported by both physical cameras.

In other words, each stream of type YUV or RAW can be replaced with two streams of identical type and size. So, for example, we could start with a camera stream of the following guaranteed configuration for single-camera devices:

  • Stream 1: YUV type, MAXIMUM size from logical camera `id = 0`

Then, a device with multi-camera support will allow us to create a session replacing that logical YUV stream with two physical streams:

  • Stream 1: YUV type, MAXIMUM size from physical camera `id = 1`
  • Stream 2: YUV type, MAXIMUM size from physical camera `id = 2`

The trick is that we can replace a YUV or RAW stream with two equivalent streams if and only if those two cameras are part of a logical camera grouping, i.e. listed under CameraCharacteristics.getPhysicalCameraIds().

Another thing to consider is that the guarantees provided by the framework are just the bare minimum required to get frames from more than one physical camera simultaneously. We can expect additional streams to be supported on most devices, sometimes even letting us open multiple physical camera devices independently. Unfortunately, since it's not a hard guarantee from the framework, doing that will require us to perform per-device testing and tuning via trial and error.

When we interact with physical cameras in a multi-camera enabled device, we should open a single CameraDevice (the logical camera) and interact with it within a single session, which must be created using the API CameraDevice.createCaptureSession(SessionConfiguration config), available since SDK level 28. Then, the session configuration will have a number of output configurations, each of which will have a set of output targets and, optionally, a desired physical camera ID.

SessionConfiguration and OutputConfiguration model

Later, when we dispatch a capture request, said request will have an output target associated with it. The framework will determine which physical (or logical) camera the request will be sent to based on what output target is attached to the request. If the output target corresponds to one of the output targets that was sent as an output configuration along with a physical camera ID, then that physical camera will receive and process the request.

One of the most important developer-facing additions to the camera APIs for multi-camera is the ability to identify logical cameras and find the physical cameras behind them. Now that we understand that we can open physical cameras simultaneously (again, by opening the logical camera and as part of the same session) and the rules for combining streams are clear, we can define a function to help us identify potential pairs of physical cameras that can be used to replace one of the logical camera streams:
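
One way to write such a function is sketched below. The `DualCamera` helper class is our own convenience type, not part of the framework, and the snippet assumes the same camera2 imports as the earlier example:

```kotlin
/** Helper class that bundles a logical camera with two of its physical cameras */
data class DualCamera(val logicalId: String, val physicalId1: String, val physicalId2: String)

fun findDualCameras(manager: CameraManager, facing: Int? = null): List<DualCamera> {
    val dualCameras = mutableListOf<DualCamera>()

    manager.cameraIdList.map {
        Pair(manager.getCameraCharacteristics(it), it)
    }.filter { (characteristics, _) ->
        // Keep only cameras facing the requested direction, if one was given
        facing == null || characteristics.get(CameraCharacteristics.LENS_FACING) == facing
    }.filter { (characteristics, _) ->
        // Keep only logical multi-cameras
        characteristics.get(CameraCharacteristics.REQUEST_AVAILABLE_CAPABILITIES)
            ?.contains(CameraMetadata.REQUEST_AVAILABLE_CAPABILITIES_LOGICAL_MULTI_CAMERA)
            ?: false
    }.forEach { (characteristics, logicalId) ->
        // Every pair of underlying physical cameras is a candidate
        // NOTE: there can be more than two physical cameras in a logical grouping
        val physicalIds = characteristics.physicalCameraIds.toList()
        for (i in physicalIds.indices) {
            for (j in i + 1 until physicalIds.size) {
                dualCameras.add(DualCamera(logicalId, physicalIds[i], physicalIds[j]))
            }
        }
    }

    return dualCameras
}
```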

State handling of the physical cameras is controlled by the logical camera. So, to open our "dual camera" we just need to open the logical camera corresponding to the physical cameras that we are interested in:
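
A minimal sketch of that, reusing the `DualCamera` type from above; the `Executor` is whatever executor the app uses for camera callbacks, and the CAMERA permission check is omitted for brevity:

```kotlin
import java.util.concurrent.Executor

fun openDualCamera(cameraManager: CameraManager,
                   dualCamera: DualCamera,
                   executor: Executor,
                   callback: (CameraDevice) -> Unit) {
    // Opening the logical camera gives us access to its physical cameras too
    cameraManager.openCamera(
        dualCamera.logicalId, executor, object : CameraDevice.StateCallback() {
            override fun onOpened(device: CameraDevice) = callback(device)
            // For brevity, errors and disconnects simply close the device
            override fun onError(device: CameraDevice, error: Int) = onDisconnected(device)
            override fun onDisconnected(device: CameraDevice) = device.close()
        })
}
```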

Up until this point, besides selecting which camera to open, nothing is different compared to what we have been doing to open any other camera in the past. Now it's time to create a capture session using the new session configuration API so we can tell the framework to associate certain targets with specific physical camera IDs:
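
Here is one way to do that, sketched on top of the helpers above. `DualCameraOutputs` is our own type alias for the three possible sets of targets: logical camera, first physical camera, second physical camera:

```kotlin
typealias DualCameraOutputs =
        Triple<List<Surface>?, List<Surface>?, List<Surface>?>

fun createDualCameraSession(cameraManager: CameraManager,
                            dualCamera: DualCamera,
                            targets: DualCameraOutputs,
                            executor: Executor,
                            callback: (CameraCaptureSession) -> Unit) {

    // One output configuration per target; physical targets get a physical camera ID
    val outputsLogical = targets.first?.map { OutputConfiguration(it) }
    val outputsPhysical1 = targets.second?.map {
        OutputConfiguration(it).apply { setPhysicalCameraId(dualCamera.physicalId1) } }
    val outputsPhysical2 = targets.third?.map {
        OutputConfiguration(it).apply { setPhysicalCameraId(dualCamera.physicalId2) } }

    // Flatten all the output configurations into a single list
    val outputConfigs = listOfNotNull(outputsLogical, outputsPhysical1, outputsPhysical2)
        .flatten()

    val sessionConfig = SessionConfiguration(
        SessionConfiguration.SESSION_REGULAR, outputConfigs, executor,
        object : CameraCaptureSession.StateCallback() {
            override fun onConfigured(session: CameraCaptureSession) = callback(session)
            override fun onConfigureFailed(session: CameraCaptureSession) =
                session.device.close()
        })

    // Open the logical camera and create the session on it
    openDualCamera(cameraManager, dualCamera, executor) { device ->
        device.createCaptureSession(sessionConfig)
    }
}
```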

At this point, we can refer back to the documentation or our previous blog post to understand which combinations of streams are supported. We just need to remember that those are for multiple streams on a single logical camera, and that the compatibility extends to using the same configuration and replacing one of those streams with two streams from two physical cameras that are part of the same logical camera.

With the camera session ready, all that is left to do is dispatch our desired capture requests. Each target of the capture request will receive its data from its associated physical camera, if any, or fall back to the logical camera.

To tie all of that back to one of the initially discussed use-cases, let's see how we could implement a feature in our camera app so that users can switch between the different physical cameras to experience a different field-of-view, effectively capturing a different "zoom level".

Example of swapping cameras for the zoom level use-case (from the Pixel 3 ad)

First, we must select the pair of physical cameras that we want to allow users to switch between. For maximum effect, we can search for the pair of cameras that provide the minimum and maximum focal length available, respectively. That way, we select one camera device able to focus on the shortest possible distance and another that can focus at the furthest possible point:
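
One possible sketch, building on `findDualCameras` from earlier; the pair-scoring heuristic here (largest spread between shortest and longest focal lengths) is our own choice:

```kotlin
fun findShortLongCameraPair(manager: CameraManager, facing: Int? = null): DualCamera? {
    return findDualCameras(manager, facing).map { dualCamera ->
        // Query the focal lengths advertised by each physical camera
        val focalLengths1 = manager.getCameraCharacteristics(dualCamera.physicalId1)
            .get(CameraCharacteristics.LENS_INFO_AVAILABLE_FOCAL_LENGTHS) ?: floatArrayOf(0F)
        val focalLengths2 = manager.getCameraCharacteristics(dualCamera.physicalId2)
            .get(CameraCharacteristics.LENS_INFO_AVAILABLE_FOCAL_LENGTHS) ?: floatArrayOf(0F)
        val min1 = focalLengths1.minOrNull() ?: 0F
        val min2 = focalLengths2.minOrNull() ?: 0F
        val max1 = focalLengths1.maxOrNull() ?: 0F
        val max2 = focalLengths2.maxOrNull() ?: 0F

        // Order the pair so physicalId1 is the shorter focal length (wider lens), and
        // score the pair by the spread between its shortest and longest focal lengths
        val ordered = if (min1 <= min2) dualCamera else DualCamera(
            dualCamera.logicalId, dualCamera.physicalId2, dualCamera.physicalId1)
        Pair(ordered, maxOf(max1, max2) - minOf(min1, min2))

    // Return the pair with the largest spread, or null if no pairs were found
    }.maxByOrNull { it.second }?.first
}
```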

A sensible architecture for this would be to have two SurfaceViews, one for each stream, that get swapped upon user interaction so that only one is visible at any given time. In the following code snippet, we demonstrate how to open the logical camera, configure the camera outputs, create a camera session and start two preview streams, leveraging the functions defined previously:
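
A sketch of what that could look like; the two surfaces are assumed to come from the SurfaceViews in our UI, which are elided here:

```kotlin
fun startDualPreview(cameraManager: CameraManager,
                     surface1: Surface, surface2: Surface,
                     executor: Executor) {
    // Find the widest/narrowest pair of physical cameras, if the device has one
    val dualCamera = findShortLongCameraPair(cameraManager) ?: return

    // No logical-camera targets; one preview target per physical camera
    val outputTargets = DualCameraOutputs(null, listOf(surface1), listOf(surface2))

    // Open the logical camera, configure the outputs and create the session
    createDualCameraSession(cameraManager, dualCamera, outputTargets, executor) { session ->
        // A single repeating request with one target for each physical camera;
        // each target only receives frames from its associated physical camera
        val captureRequest = session.device.createCaptureRequest(
            CameraDevice.TEMPLATE_PREVIEW).apply {
            addTarget(surface1)
            addTarget(surface2)
        }.build()

        session.setRepeatingRequest(captureRequest, null, null)
    }
}
```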

Now all we need to do is provide a UI for the user to switch between the two surfaces, like a button or double-tapping the SurfaceView; if we wanted to get fancy, we could try performing some form of scene analysis and switch between the two streams automatically.

All lenses produce a certain amount of distortion. In Android, we can query the distortion created by lenses using CameraCharacteristics.LENS_DISTORTION (which replaces the now-deprecated CameraCharacteristics.LENS_RADIAL_DISTORTION). For logical cameras, it is reasonable to expect that the distortion will be minimal and our application can use the frames more-or-less as they come from the camera. However, for physical cameras, we should expect potentially very different lens configurations, especially on wide lenses.

Some devices may implement automatic distortion correction via CaptureRequest.DISTORTION_CORRECTION_MODE. It is good to know that distortion correction defaults to being on for most devices. The documentation has some more detailed information:

FAST/HIGH_QUALITY both mean camera device determined distortion correction will be applied. HIGH_QUALITY mode indicates that the camera device will use the highest-quality correction algorithms, even if it slows down capture rate. FAST means the camera device will not slow down capture rate when applying correction. FAST may be the same as OFF if any correction at all would slow down capture rate […] The correction only applies to processed outputs such as YUV, JPEG, or DEPTH16 […] This control will be on by default on devices that support this control.

If we wanted to take a still shot from a physical camera using the highest possible quality, then we should try to set correction mode to HIGH_QUALITY if it's available. Here's how we should be setting up our capture request:
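
A sketch under the assumption that we already have an open session and the characteristics of the camera we are capturing from:

```kotlin
fun buildStillCaptureRequest(session: CameraCaptureSession,
                             characteristics: CameraCharacteristics): CaptureRequest.Builder {
    // Use the still-capture template as the starting point for the request
    val captureRequest = session.device.createCaptureRequest(
        CameraDevice.TEMPLATE_STILL_CAPTURE)

    // Determine whether this device supports HIGH_QUALITY distortion correction
    val supportsHighQuality = characteristics.get(
        CameraCharacteristics.DISTORTION_CORRECTION_AVAILABLE_MODES)
        ?.contains(CameraMetadata.DISTORTION_CORRECTION_MODE_HIGH_QUALITY) ?: false

    if (supportsHighQuality) {
        captureRequest.set(
            CaptureRequest.DISTORTION_CORRECTION_MODE,
            CameraMetadata.DISTORTION_CORRECTION_MODE_HIGH_QUALITY)
    }

    // The caller still needs to add output targets and any other parameters
    // before dispatching via session.capture(captureRequest.build(), ...)
    return captureRequest
}
```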

Keep in mind that setting a capture request in this manner will have a potential impact on the frame rate that can be produced by the camera, which is why we are only setting the distortion correction in still image captures.

Phew! We covered a bunch of things related to the new multi-camera APIs:

  • Potential use-cases
  • Logical vs physical cameras
  • Overview of the multi-camera API
  • Extended rules for opening multiple camera streams
  • How to set up camera streams for a pair of physical cameras
  • Example "zoom" use-case swapping cameras
  • Correcting lens distortion

Note that we have not covered frame synchronization and calculating depth maps. That is a topic worthy of its own blog post 🙂

Source: https://medium.com/androiddevelopers/getting-the-most-from-the-new-multi-camera-api-5155fb3d77d9
