Indoor Nano Drones: Integrating a VIO capable MIPI-CSI2 camera on the iMX93

Introduction

When a robot needs a camera to sense its environment, an engineer will usually purchase an off-the-shelf machine vision camera and plug it directly into the robot’s main computer via Ethernet or USB. While this is often the simplest approach, it is expensive ($100s), heavy (100s of grams), bulky, and more CPU intensive (e.g. image post-processing must be done in software). For size, weight, power, or cost constrained applications, machine vision cameras are a non-starter. Instead, compact, fully integrated camera modules can communicate directly over a MIPI CSI-2 bus with a processor’s image processing hardware subsystem. At Virtana, for our autonomous indoor nano-drone built on top of the CrazyFlie 2.1 platform, we plan to integrate a MIPI CSI-2 OV9281/2 camera module (e.g. from Sincere or Arducam) with a Compulab UCM-iMX93 SoM[1] to perform visual-inertial odometry for pose estimation and navigation [2]. These hardware choices enable the ultralight camera module (weighing only a few grams) to communicate directly over MIPI CSI-2 with the iMX93’s ISI image hardware subsystem, reducing CPU load and latency while staying within the CrazyFlie’s limited payload capacity. The rest of this article describes how we converged on the OV9281/2 sensor family, and the technical hurdles and fixes needed to prototype this hardware system (an Arducam OV9281 camera development board and Compulab’s UCM-iMX93 evaluation kit (EVK)).

[1] - How we selected this SoM can be found in our paper “Selecting a modern lightweight compute platform for carrying out VIO and vSLAM on a nano-drone” which was submitted to the University of the West Indies Five Islands AI Conference (2024). 

[2] - Our results showing its suitability for running VIO algorithms on this SoM can be found in a previous blog post.

How we selected the Arducam OV9281 development board

Given our focus on VIO, we wanted a camera with the following features:

  • Monochrome - Lack of a Bayer Filter produces sharper edges and better feature detection

  • Global Shutter - To avoid rolling shutter artifacts during aggressive drone movements

  • Large Field of View - Higher overlap between frames allows tracking features more reliably during aggressive movements

  • ~1 Megapixel - High enough resolution to resolve fine features, but not so excessively high that it saturates the processor & MIPI bus.

Several camera sensors were considered, but we eventually settled on Omnivision’s OV9282 since it met all of the criteria above (global shutter, monochrome, 720p and 800p resolutions available, example modules with 150° FoV). Additionally, there is already evidence of this sensor being used successfully in VIO systems: Luxonis’ OAK-D stereo camera has been used for VIO and VSLAM, and the Intel T265 was designed specifically for motion tracking.

Getting MIPI CSI-2 cameras working is often not straightforward, especially on platforms that lack the large community and out-of-the-box camera support of, for example, Raspberry Pi or NVIDIA Jetson. As such, we decided to de-risk camera software development first by working with a pre-built MIPI CSI-2 camera board that contains the circuitry needed to operate the camera.

Using a pre-built camera module with available schematics also provides us with a working example for the camera circuitry that will be needed for our custom camera board. This is important to us because troubleshooting high-frequency CSI-2 circuitry requires expensive instruments which are largely inaccessible to low-budget projects.  

Ideally, we would choose Luxonis's OV9282 module since their schematics are publicly available. However, this camera does not have connectors that are compatible with the UCM-iMX93 evaluation kit so both hardware (an adapter) and software prototyping would be simultaneously required. We decided to start with Arducam's Raspberry Pi-compatible OV9281 module instead (from a software perspective the OV9281/2 are identical). This allows us to focus on getting the software right, before eventually moving to CSI-2 hardware development.

The MIPI-CSI 2, iMX93 & V4L2 Camera Stack

Figure 1 shows a high level overview of the modules within the camera stack being used on the iMX93 (adapted from this thread).

Figure 1: High level block diagram of the camera stack on the iMX93 (created by Enrique Ramkissoon)

The CSI peripheral is responsible for receiving the raw image data sent from the camera sensor. This stream of raw data is received by the Image Pixel Interface (IPI), which converts the raw CSI-2 packets from bytes to pixels. The pixel bus produced by the IPI is received by the Image Sensing Interface (ISI), which performs various image processing operations on the image data.

V4L2 subdevices are created for the ISI, CSI and camera. The iMX93 uses NXP's media device driver (imx8-media-dev) to create links between V4L2 and the various hardware drivers present on the system. Figure 2 shows the subdevice links created during initialization.

Figure 2: V4L2 sink and source pad overview (generated using “media-ctl --print-dot”)

iMX93 Linux Device Tree Changes

The base device tree for the UCM-iMX93 defines the MIPI-CSI2 node for use with the e-CAM131_CURB. This node needed to be modified to work with our particular camera, and an I2C child node was added for the Arducam OV9281 camera. This section explains these changes. See this commit for the changes in context.

I2C Configuration

The OV9281 is configured over an I2C interface, so a device tree node that could be used by the ov9282.c driver needed to be added. The example device tree node for the OV9282 sensor included in the Linux kernel was used as a guide, as well as the Raspberry Pi OV9281 node. Our custom node is shown below.


&lpi2c3 {
    #address-cells = <1>;
    #size-cells = <0>;
    sensor_ov9281: camera@60 {
        compatible = "ovti,ov9281";
        reg = <0x60>;
        clocks = <&clk IMX93_CLK_MIPI_PHY_CFG>;
        reset-gpios = <&pca9555 11 GPIO_ACTIVE_HIGH>;
        status = "okay";
        port {
            ov9281_ep: endpoint {
                remote-endpoint = <&mipi_csi_ep>;
                data-lanes = <1 2>;
                clock-lanes = <0>;
                link-frequencies = /bits/ 64 <400000000>;
                clock-noncontinuous;
            };
        };
    };
};

The I2C client address for the OV9281 is 0x60. 

The reset-gpios property references the GPIO connected to the reset pin on the camera. On the OV9281, this pin is XSHUTDOWN. XSHUTDOWN is active low, so a high signal is needed to keep the camera on. pca9555 references the GPIO expander on the evaluation kit (schematics and other documentation can be found here), which is connected to the 22-pin Raspberry Pi connector.

The OV9281 has two data lanes and a single clock lane. These are specified by the data-lanes and clock-lanes properties. The ov9282 driver configures the camera's link frequency to 400MHz.

Because our version of the OV9282 driver did not support a continuous clock configuration (it predates this commit) and we want to minimize power consumption, we configured the camera's clock to operate in non-continuous mode using the clock-noncontinuous property.

The OV9282 driver's device tree match table was also updated to recognize this node, as follows (we are not using the newest version of the OV9282 driver; newer versions already include this change):


static const struct of_device_id ov9282_of_match[] = {
	{ .compatible = "ovti,ov9281" },
	{ .compatible = "ovti,ov9282" },
	{ }
};

Enabling pre-existing nodes

The following Device Tree nodes needed to be enabled for MIPI-CSI2 camera support. 

  • cameradev - used by imx8-media-dev driver (code ref), which is the V4L2 media controller driver. This is responsible for setting up the V4L2 CSI2 sub-device. 

  • isi_0 - used by the imx8-isi-core driver (code ref). The Image Sensing Interface (ISI) of the iMX93 is responsible for interfacing with a pixel source to get image data for processing within its pipeline. This driver is responsible for configuring the ISI.

  • cap_device - used by the imx8-isi-cap driver (code ref). This is responsible for managing functionality related to capturing images (E.g. managing V4L2 buffers)

MIPI-CSI2 Device Node Changes

The below snippet shows the mipi_csi node definition in the base device tree for the UCM-iMX93.


&mipi_csi {
        #address-cells = <1>;
        #size-cells = <0>;
        status = "okay";

        port@0 {
                reg = <0>;
                mipi_csi_ep: endpoint {
                        remote-endpoint = <&ar1335_mipi_ep>;
                        data-lanes = <2>;
                        cfg-clk-range = <28>;
                        hs-clk-range = <0x2b>;
                        bus-type = <4>;
                };
        };
};

The remote-endpoint property needed to be modified to point to our new I2C camera sensor node. 

Based on Table 558 of the section on DDL Tuning in the iMX93 Reference Manual, the hs-clk-range property needs to be set based on the desired operating link frequency. The default hex value corresponds to a frequency of 1.3GHz, while the value corresponding to our desired value of 400MHz is 0x05. Therefore the property value needed to be modified. 

cfg-clk-range remains the same for the Arducam OV9281. This value is defined in Section 55.3.1 of the Reference Manual by:

 cfgclkfreqrange[5:0] = round[(Fcfg_clk(MHz) - 17)* 4] 

Our camera input clock frequency is 24MHz which produces a value of 28. This will need to change if your camera frequency is different. 
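As a quick sanity check of the formula above, the calculation can be sketched in Python. This is only an illustrative helper: the function name is ours, the 6-bit range check is inferred from the [5:0] field width, and we assume "round" means round half up, which the text above does not confirm.

```python
import math


def cfg_clk_range(f_cfg_clk_mhz: float) -> int:
    """Return cfgclkfreqrange[5:0] for a clock frequency given in MHz."""
    # Assumption: round half up (Python's built-in round() rounds half to even).
    value = math.floor((f_cfg_clk_mhz - 17) * 4 + 0.5)
    # The field is 6 bits wide ([5:0]), so only 0..63 can be represented.
    if not 0 <= value <= 0x3F:
        raise ValueError(f"{f_cfg_clk_mhz} MHz does not fit the 6-bit field")
    return value


print(cfg_clk_range(24))  # 28, the value used in our device tree
```

Running the same helper with a different input clock gives the value to place in cfg-clk-range for that camera.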

clock-noncontinuous was added for the same reasons mentioned for the camera’s I2C node. 

The below shows the final changes we made to the mipi_csi node for our device tree.


&mipi_csi {
	port@0 {
		mipi_csi_ep: endpoint {
			remote-endpoint = <&ov9281_ep>;
			hs-clk-range = <0x05>;
			clock-noncontinuous;
		};
	};
};

Physically Connecting the SoM’s Power_Enable to the Camera

The first problem encountered with integrating the Arducam OV9281 with the iMX93 evaluation kit was that the camera would not acknowledge I2C commands (using the i2cdetect tool). 

The Compulab iMX93 evaluation kit uses a standard Raspberry Pi 22-pin CSI-2 connector (See Figure 3).

Figure 3: 22-pin Raspberry Pi connector on the UCM-iMX93 EVK and unpopulated resistor (schematics can be found on the product page; picture taken by Vijay Pradeep)

Pin 17 of this connector (CSI1_B_TRG), commonly referred to as POWER_ENABLE, is typically used to turn the sensor on and off. This is routed to an I2C GPIO expander IC on the iMX93 evaluation board. R102 (0Ω resistor) is listed as DNP (Do Not Populate) in the schematic and was therefore not populated on our evaluation board (See Figure 3). Probing the pin revealed that POWER_ENABLE was also not being pulled up by the camera carrier board.

Connecting the component pads with a solder bridge allowed the sensor to turn on and begin acknowledging I2C messages. This is shown in Figure 4 below.

Figure 4: Solder pads bridged to connect POWER_ENABLE to the GPIO expander pin on the Compulab iMX93 evaluation board (picture taken by Sarika Ramroop)

Adding missing media entity operations to the camera driver

The ov9282 driver packaged with the Linux kernel initializes a media_entity struct (See Media Controller Entities), embedded into its instance of v4l2_subdev. The Linux Media Controller documentation states that media entity operations are optional. However, the iMX ISI driver requires that the link_setup operation be defined. If this operation is not defined, the [ov9282 2-0060] => [mxc-mipi-csi2.0] link creation fails with the following error:


mxc-mipi-csi2.0: is_entity_link_setup, No remote pad found!

To fix this, an empty link_setup function was created and added to the camera subdevice's media entity (see commit):


static int ov9282_link_setup(struct media_entity *entity,
			   const struct media_pad *local,
			   const struct media_pad *remote, u32 flags)
{
	return 0;
}

static const struct media_entity_operations ov9282_sd_media_ops = {
	.link_setup = ov9282_link_setup,
};

/* Assigned in the driver's probe function, alongside the rest of the media entity setup */
ov9282->sd.entity.ops = &ov9282_sd_media_ops;

Other platforms may not require this operation to be defined in the camera driver.

Adding s_power subdevice operation in the camera driver

The OV9282 driver packaged with the Linux kernel creates the following subdevice operations by default:


/* V4l2 subdevice ops */
static const struct v4l2_subdev_video_ops ov9282_video_ops = {
	.s_stream = ov9282_set_stream,
};

static const struct v4l2_subdev_pad_ops ov9282_pad_ops = {
	.init_cfg = ov9282_init_pad_cfg,
	.enum_mbus_code = ov9282_enum_mbus_code,
	.enum_frame_size = ov9282_enum_frame_size,
	.get_fmt = ov9282_get_pad_format,
	.set_fmt = ov9282_set_pad_format,
};

static const struct v4l2_subdev_ops ov9282_subdev_ops = {
	.video = &ov9282_video_ops,
	.pad = &ov9282_pad_ops,
};

When streaming is enabled, the iMX ISI driver attempts to use the core.s_power subdevice operation (See the function mxc_isi_cap_streamon). Since this is not defined by default in the sensor driver, the following errors are generated when attempting to capture an image:


In the kernel logs:

mxc_isi.0: Call subdev s_power fail!

In V4L2 userspace:

VIDIOC_STREAMON returned -1 (Inappropriate ioctl for device)

The s_power operation is typically used to place the sensor in either a power saving or normal mode of operation. Since this functionality is unimplemented in our camera driver, we can simply define an empty operation to allow the ISI driver to run without error (see commit):


static int ov9282_s_power(struct v4l2_subdev *sd, int on)
{
	return 0;
}

static const struct v4l2_subdev_core_ops ov9282_subdev_core_ops = {
	.s_power	= ov9282_s_power,
};

static const struct v4l2_subdev_ops ov9282_subdev_ops = {
	.core = &ov9282_subdev_core_ops,
	.video = &ov9282_video_ops,
	.pad = &ov9282_pad_ops,
};

Other platforms may not require this operation to be defined in the camera driver.

Debugging and Fixing Camera Initialization Timing Issue

The Linux Kernel's OV9282 driver defines the s_stream video operation which is called by V4L2 to start camera streaming. Before the camera starts streaming, the driver configures the camera by writing values to a series of registers on the camera sensor over the I2C bus.

Figure 5 shows the first attempted I2C write to the camera during initialization. This register write fails because the camera does not acknowledge the message. The OV9282 driver then abandons the camera initialization, causing V4L2 to generate a fatal error.

Figure 5: Failed I2C write when streaming is enabled on the OV9281 (graph generated with Saleae Logic 8 Logic Analyzer)

After some probing, we discovered that the driver was attempting to communicate with the sensor before the sensor had enough time to power on.

The OV9281 sensor contains an active-low input pin called XSHUTDOWN which is used to power down the sensor. This pin must be pulled high to power up the sensor. While we do not have schematics available to confirm (Arducam requested a $1000 deposit before sharing any schematics), we suspect that this pin is connected to the POWER_ENABLE pin on the CSI connector, through some circuitry that adds a time delay.

The OV9281 sensor requires 3 supply voltages:

  1. Analog Supply Voltage (AVDD - 2.8V), 

  2. Digital (Core) Supply Voltage (DVDD - 1.2V)

  3. IO Supply Voltage (DOVDD - 1.8V)

Each of these supply voltages is generated by a voltage regulator on the Arducam carrier board; these are shown in Figure 6. While we have yet to find the exact part specification for each of these voltage regulators, the logo is consistent with Microne (Nanjing Micro One Elec), and the LCSC listing for the Microne ME6211C50M5G-N is a close enough match to make an educated guess at pinouts and functionality (e.g. voltage in, voltage out, chip enable, etc).

Measuring the output voltages allows us to map each regulator to the corresponding OV9281 supply voltage.

Figure 6: Voltage regulators on the Arducam OV9281 carrier board (picture taken by Vijay Pradeep)

The OV9281 requires these supply voltages and XSHUTDOWN to be pulled up in the following order: 

  1. AVDD and DOVDD need to be pulled up first, in either order.

  2. DVDD needs to be pulled up after DOVDD

  3. XSHUTDOWN needs to be pulled up after DVDD, and at least 1ms after both AVDD and DOVDD are pulled up.

  4. The sensor is ready to communicate via I2C 8192 reference clock cycles after XSHUTDOWN is pulled up.
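The ordering rules above can be expressed as a small checker that takes rising-edge timestamps (for example, read off a logic analyzer capture) and reports which rules are violated. This is a sketch only: the class and function names are ours, the timestamps in the usage lines are made up for illustration, and we read rules 1 and 2 together as "DVDD rises after both AVDD and DOVDD". The 24MHz default is the camera input clock frequency mentioned earlier.

```python
from dataclasses import dataclass


@dataclass
class PowerUpEdges:
    """Rising-edge times in seconds for each signal, plus the first I2C transfer."""
    avdd: float
    dovdd: float
    dvdd: float
    xshutdown: float
    first_i2c: float


def check_power_up(e: PowerUpEdges, ref_clk_hz: float = 24_000_000) -> list:
    """Return a list of violated power-up rules (empty if the sequence is valid)."""
    violations = []
    # Rules 1 and 2 (our reading): AVDD and DOVDD first, then DVDD.
    if e.dvdd <= max(e.avdd, e.dovdd):
        violations.append("DVDD must rise after AVDD and DOVDD")
    # Rule 3: XSHUTDOWN after DVDD, and at least 1 ms after AVDD and DOVDD.
    if e.xshutdown <= e.dvdd:
        violations.append("XSHUTDOWN must rise after DVDD")
    if e.xshutdown - max(e.avdd, e.dovdd) < 1e-3:
        violations.append("XSHUTDOWN must rise at least 1 ms after AVDD and DOVDD")
    # Rule 4: I2C is usable 8192 reference clock cycles after XSHUTDOWN.
    if e.first_i2c - e.xshutdown < 8192 / ref_clk_hz:
        violations.append("first I2C transfer is earlier than 8192 clock cycles after XSHUTDOWN")
    return violations


# Hypothetical timestamps, not measurements from our board.
ok = PowerUpEdges(avdd=0.0, dovdd=0.0, dvdd=0.0002, xshutdown=0.0015, first_i2c=0.0020)
too_early = PowerUpEdges(avdd=0.0, dovdd=0.0, dvdd=0.0002, xshutdown=0.0015, first_i2c=0.0016)
print(check_power_up(ok))
print(check_power_up(too_early))
```

Returning a list of messages rather than a single boolean makes it easier to see which part of the sequence a given capture breaks.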

Figure 7 shows the timing of each regulator probed on the Arducam PCB when we attempted to capture an image (DVDD is inverted because of a level-shifting circuit between the regulator output and our logic analyzer).

Figure 7: Timing of camera signals when streaming is enabled on the OV9281 (generated with a Saleae Logic 8 Logic Analyzer and Saleae Logic 2 Software)

Since XSHUTDOWN must not be pulled high until at least 1ms after AVDD is high, we can infer that XSHUTDOWN is actually being pulled up after the first I2C message is sent!

For the sensor's timing constraints to be met, XSHUTDOWN needs to be pulled high at least 8192 cycles of the 24MHz reference clock (341μs) before the first I2C message is sent.

To fix this, we added a 1ms delay to the camera driver, before I2C writes took place (see this commit), which seems to be enough time to allow the sensor to complete its power up sequence. With some more investigation, we can figure out a point on the camera carrier board to probe XSHUTDOWN, and determine the exact delay that is needed.

Modifying ISI and IPI driver to accept RAW10 sensor data

While the ISI hardware peripheral on the iMX93 supports RAW pixel formats, the ISI driver does not. Minor changes were made to the ISI and IPI drivers to allow them to receive and output 10-bit RAW data from the sensor.

The ISI gasket facilitates communication between the MIPI CSI2 peripheral and the IPI. For our use case, the gasket driver needed to be configured to use the "MEDIA_BUS_FMT_Y10_1X10" media bus code specified by the camera driver.

By default, the IPI driver configures the IPI hardware for interpreting data received from the sensor as either YUV422 or RGB888. A small modification to the driver was required to allow the IPI to receive 10-bit RAW data.


ipi_cfg->data_type  = DT_RAW10;
ipi_cfg->vir_chan   = 0;
ipi_cfg->hsa_time   = 0;
ipi_cfg->hbp_time   = 0;
ipi_cfg->hsd_time   = 0;
ipi_cfg->hline_time = 0x500;
ipi_cfg->vsa_lines  = 0;
ipi_cfg->vbp_lines  = 0;
ipi_cfg->vfp_lines  = 0;
ipi_cfg->vactive_lines   = 0x320;
ipi_cfg->controller_mode = 0;
ipi_cfg->color_mode_16   = 1;
ipi_cfg->embeded_data    = 0;

The iMX93 IPI allows using either a 16-bit or 48-bit interface for delivering color components to the ISI. In the driver, this is set using the ipi_cfg->color_mode_16 parameter, which is written to the IPI_MODE register (See Section 54.2.6.1.2 - Interface Type in the iMX93 Reference Manual). 

The reference manual says that both interface types are compatible with RAW10 data. However, when we configure the IPI to use the 48-bit mode, our image output appears to contain only every 3rd pixel. We do not yet know why data is being lost in the 48-bit interface mode, and we plan to investigate this further.

imx-isi-fmt.c defines the ISI output formats that can be used. A new format for RAW10 needed to be added. This format is selectable from the V4L2 userspace application.


{
	.name		= "RAW_Y10",
	.fourcc	= V4L2_PIX_FMT_Y10,
	.depth		= { 16 },
	.color		= MXC_ISI_OUT_FMT_RAW16,
	.memplanes	= 1,
	.colplanes	= 1,
	.align		= 2,
	.mbus_code	= MEDIA_BUS_FMT_Y10_1X10,
}

See this commit for these changes in context.

Capturing and decoding an image on the iMX93

V4L2 provides an application, v4l2-ctl, which can be used to capture images from connected cameras. Once the above changes were made, an image was captured using the following command:

v4l2-ctl -d /dev/video0 --set-fmt-video=width=1280,height=720,pixelformat='Y10 ' --stream-mmap --stream-count=1 --stream-to=test.raw --verbose

The pixelformat parameter is used to select the ISI output format. The string used to specify the format is defined by the "fourcc" (four-character code) member of each format.
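To see why the trailing space in 'Y10 ' matters, it helps to look at how a four-character code is packed into the 32-bit value that V4L2 uses internally. The sketch below mirrors the v4l2_fourcc() macro from the kernel's videodev2.h header; the Python function itself is just our illustration.

```python
def v4l2_fourcc(code: str) -> int:
    """Pack a four-character code into a 32-bit integer, first character in the low byte."""
    data = code.encode("ascii")
    if len(data) != 4:
        raise ValueError("a fourcc must be exactly four ASCII characters")
    return int.from_bytes(data, "little")


print(hex(v4l2_fourcc("Y10 ")))  # 0x20303159, i.e. V4L2_PIX_FMT_Y10
```

Because all four bytes are part of the value, the three-character string 'Y10' is not a valid fourcc; the space (0x20) fills the fourth byte.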

This command generates a file, test.raw, which contains raw binary pixel data, encoded in 10-bit RAW. This file can be decoded into a readable image using ffmpeg:


ffmpeg -y -f rawvideo -s:v 1280x720 -pix_fmt gray10le -i test.raw out.png
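If ffmpeg is not available, the same file can be decoded with a few lines of standard-library Python. The sketch below assumes the layout implied by the gray10le pixel format: each pixel is a 16-bit little-endian word with the 10-bit sample in the low bits. It reduces each sample to 8 bits and writes a binary PGM image; the function names are ours.

```python
import struct


def y10_to_pgm(raw: bytes, width: int, height: int) -> bytes:
    """Convert 10-bit samples stored in 16-bit little-endian words to an 8-bit PGM."""
    expected = width * height * 2  # two bytes per pixel
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    samples = struct.unpack(f"<{width * height}H", raw)
    # Keep the low 10 bits, then drop the 2 least significant bits (0..1023 -> 0..255).
    pixels = bytes((s & 0x3FF) >> 2 for s in samples)
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + pixels


def convert_file(src: str, dst: str, width: int, height: int) -> None:
    """Read a raw capture from src and write the PGM image to dst."""
    with open(src, "rb") as f:
        raw = f.read()
    with open(dst, "wb") as f:
        f.write(y10_to_pgm(raw, width, height))
```

For the capture above, calling convert_file("test.raw", "out.pgm", 1280, 720) should produce an image that most viewers can open. The size check also catches a capture taken at a different resolution than expected, since a 1280x720 frame in this layout is exactly 1,843,200 bytes.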

Next Steps

This is the first part of our work to integrate live sensor data for VIO on the UCM-iMX93. The next major bodies of work include the following:

  1. Achieving synchronized hardware timestamping between the IMU and camera 

    1. VIO requires timestamps to be accurately captured in the same time domain to associate the data between the different sensors

  2. Creating a custom camera carrier board that is better suited for being mounted to a nano drone. This will be an incremental process that will involve moving from the Arducam OV9281 module to Luxonis’ OV9282 module as Luxonis provides access to schematic and PCB design files.

We are hopeful about making contributions to the Linux kernel or NXP drivers as we continue our development.

Acknowledgements

This work would not have been possible without contributions from Andre Thomas and Nicholas Chamansingh, who helped shape the overall project goals and continue to be instrumental in planning and project direction.
