Lidar-like 3D Imaging System for Accurate Scene Understanding

Description:

OUSD (R&E) MODERNIZATION PRIORITY: Artificial Intelligence (AI)/Machine Learning (ML); Autonomy

 

TECHNOLOGY AREA(S): Information Systems; Sensors

 

OBJECTIVE: Develop inexpensive Lidar-like 3D imaging sensors that have high depth and lateral resolution, have a large field-of-view for reliable object detection, respond in real time, and work at medium to long ranges in indoor and outdoor environments.

 

DESCRIPTION: 3D scene understanding (i.e., 3D scene segmentation and parsing, depth estimation, and object detection and recognition) is an essential component of a vision system. 3D sensors similar to the Microsoft Kinect are inexpensive and high resolution but have limited range outdoors, and are therefore not suited to many robotics applications. Lidars have long range and high depth accuracy but are very expensive; for example, those used in self-driving cars are typically several times more expensive than other car components. Another drawback of current Lidars is their small “vertical” field-of-view and coarse vertical resolution: even the more expensive units have at most 64 scan lines, so they can fail to detect small objects even at medium ranges.
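
As a rough illustration of this vertical-sampling limitation (the field-of-view and line count below are representative assumptions, not figures from any specific product), the gap between adjacent scan lines grows linearly with range:

    import math

    # Rough illustration of why a 64-line Lidar can miss small objects at range.
    # Assumed figures (representative only): ~27 degree vertical field-of-view
    # spread across 64 scan lines.
    vertical_fov_deg = 27.0
    scan_lines = 64
    line_spacing_deg = vertical_fov_deg / (scan_lines - 1)  # ~0.43 deg between lines

    for range_m in (25, 50, 100):
        gap_m = range_m * math.tan(math.radians(line_spacing_deg))
        print(f"At {range_m:4d} m, adjacent scan lines are ~{gap_m:.2f} m apart vertically")

    # At 50 m the gap is already ~0.37 m, so an object shorter than that can fall
    # entirely between two scan lines and go undetected.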

 

The goal of this STTR topic is to develop inexpensive, high-resolution, high-accuracy 3D imaging sensors for wide use on a variety of large and small ground and aerial robotic platforms operating in dynamic environments under different conditions. ONR expects that recent promising advances along several directions, including machine learning-based algorithms for improved depth estimation with stereo cameras [Refs 2, 5], active illumination technologies [Ref 1], and optimal time-of-flight coding [Ref 3], open new approaches to building hybrid systems that combine optical cameras and laser ranging for such 3D imaging sensors. Combining these advances (ML-based stereo imaging, active illumination for 3D imaging, and novel time-of-flight coding for improved range estimation) requires innovative approaches.
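
For context, the following is a minimal sketch of the depth-from-disparity relation that ML-based stereo methods [Refs 2, 5] ultimately build on; the focal length, baseline, and matching error below are illustrative assumptions, not values from this topic:

    # Minimal sketch of depth from stereo disparity: Z = f * B / d.
    focal_px = 1000.0        # assumed focal length in pixels
    baseline_m = 0.3         # assumed stereo baseline in meters
    disparity_err_px = 0.25  # assumed sub-pixel matching error

    for depth_m in (10.0, 50.0, 100.0):
        disparity_px = focal_px * baseline_m / depth_m
        # Depth error grows roughly quadratically with range: dZ ~ Z^2 / (f*B) * dd
        depth_err_m = (depth_m ** 2) / (focal_px * baseline_m) * disparity_err_px
        print(f"Z = {depth_m:5.0f} m -> disparity {disparity_px:5.1f} px, "
              f"depth error ~{depth_err_m:.2f} m")

    # The quadratic error growth at long range is one motivation for hybrid designs
    # that fuse camera disparity with sparse laser range measurements.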

 

PHASE I: Design the system architecture including sensors and computing hardware, and processing and inference algorithms for building inexpensive, high-resolution, accurate, 3D imaging sensors. Since these sensors are intended for use on various UGVs and UAVs deployed in dynamic and cluttered environments, the design should consider tradeoff estimates among size, weight, and power (SWAP), as well as resolution, detection accuracy, operating range, frame rate, and cost. Develop a breadboard version to demonstrate the feasibility of the design. Develop a Phase II plan.

 

PHASE II: Perform experiments in a variety of situations and refine the system. Goals for Phase II are: (a) achieve a field-of-view and resolution comparable to optical cameras; (b) demonstrate the system’s capability for human detection: normal vision can detect humans up to a distance of about 300 m in daylight, and at nighttime typical headlights illuminate the road up to a distance of about 60 m [Ref 4], so the minimum detection range should meet these distances in daylight and at nighttime, respectively (see the sketch below); (c) develop a compact prototype imaging system that is small, lightweight, and low power, suitable for portability by personnel and small autonomous platforms (UxVs).
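
As a back-of-the-envelope check on these detection ranges, the sketch below estimates how large a person appears at 60 m and 300 m; the human dimensions and per-pixel angular resolution are assumptions for illustration, not requirements of this topic:

    import math

    person_h_m, person_w_m = 1.7, 0.5  # assumed person height and width
    pixel_pitch_deg = 0.02             # assumed per-pixel angular resolution

    for range_m in (60, 300):
        h_deg = math.degrees(2 * math.atan(person_h_m / (2 * range_m)))
        w_deg = math.degrees(2 * math.atan(person_w_m / (2 * range_m)))
        px_h = h_deg / pixel_pitch_deg
        px_w = w_deg / pixel_pitch_deg
        print(f"{range_m:3d} m: person subtends {h_deg:.2f} x {w_deg:.2f} deg "
              f"(~{px_h:.0f} x {px_w:.0f} pixels at the assumed resolution)")

    # At 300 m a person covers only a few pixels in width, which is why the topic
    # asks for camera-like lateral resolution rather than sparse scan lines.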

 

PHASE III DUAL USE APPLICATIONS: Perform additional experiments in a variety of situations and further refine the system for transition and commercialization. Ensure that the imaging system operates in real-world dynamic environments by extending it to real-time acquisition, i.e., at least 30 fps. In the commercial sector, this technology could be used for self-driving cars and for surveillance and navigation on any land or air vehicle.

 

REFERENCES:

  1. Achar, Supreeth et al. “Epipolar Time-of-Flight Imaging.” ACM Transactions on Graphics, Vol. 36, No. 4, Article 37, July 2017. https://dl.acm.org/doi/pdf/10.1145/3072959.3073686.
  2. Garg, D. et al. “Wasserstein Distances for Stereo Disparity Estimation.” 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada. https://arxiv.org/pdf/2007.03085.pdf.
  3. Gupta, M. et al. “What are Optimal Coding Functions for Time-of-Flight Imaging?” ACM Transactions on Graphics, Vol. 37, No. 2, Article 13, February 2018. https://dl.acm.org/doi/pdf/10.1145/3152155.
  4. Farber, Gene. “Seeing with Headlamps.” NHTSA Workshop on Headlamp Safety Metrics, Washington, DC, July 13, 2004. https://pdf4pro.com/view/seeing-with-headlights-4b1377.html.
  5. Wang, Yan et al. “Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving.” CVPR 2019. https://arxiv.org/pdf/1812.07179.pdf.

 

KEYWORDS: Lidar-like 3D imaging sensor; hybrid imaging; high-resolution sensor with large field of view; FOV; outdoor imaging; indoor imaging
