You are here

High Dimensional Nearest Neighbor Search

Description:

TECHNOLOGY AREA(S): Electronics, Information Systems, Sensors

OBJECTIVE:

This topic seeks research in geolocation of imagery and video media taken at near- ground level [1]. The research will explore hashing/indexing techniques (such as [2]) that match information derived from media to a global reference data. The reference data is composed of digital surface models (DSMs) of known geographical regions and features that can be derived from that surface data, together with limited "foundation data" of the same regions consisting of map data such as might be present in Open Street Maps and landcover data (specifying regions that are fields, vegetation, urban, suburban, etc.). Query data consists of images or short video clips that represent scenes covered by the digital surface model in the reference data, but may or may not have geo-tagged still images in the reference data from the same location.

Selected performers will be provided with sample reference data, consisting of DSM data and a collection of foundation data, and will be provided with sample query data. This sample data is described below. However, proposers might suggest other reference and query data that they will use to either supplement or replace government-furnished sample data. This topic seeks novel ideas for the representation of features in the query data and the reference data that can be used to perform retrieval of geo-located reference data under the assumption that the query data lacks geolocation information. The topic particularly seeks algorithmically efficient approaches such as hashing techniques for retrieval based on the novel features extracted from query imagery and reference data that can be used to perform matching using nearest neighbor approaches in feature space.

DESCRIPTION:

The reference data includes files consisting of a vectorized two-dimensional representation of a Digital Surface Model (DSM) [4], relative depth information, and selected foundation feature data. The foundation features will included feature categories such as the locations of roads, rivers, and man-made objects.


The desired output of a query is a location within meters of the ground truth location of the camera that acquired the imagery. In practice, only some of the queries will permit accurate geolocation based on the reference data, and in some cases, the output will be a candidate list of locations, such that the true location is within the top few candidates. It is reasonable to assume that there exists a reference database calculated from a global DSM with a minimum spatial resolution of 30 meters that may, in some locations, provide sub-meter spatial resolution. The foundation feature is at least as rich as that present in Open Street Maps, and can include extensive landcover data with multiple feature types. For the purpose of this topic, the reference data will not include images. Sample reference and query data representative of these assumptions, but of limited geographical areas, will be provided to successful proposers.


The topic seeks approaches that are more accurate than a class of algorithms that attempt to provide geolocation to a general region, such as a particular biome or continent. These algorithms are often based on a pure neural network approach, such as described in [3], and is unlikely to produce sufficient precise camera location information that is accurate to within meters.


The objective system, in full production, should be sufficiently efficient as to scale to millions of square kilometers of reference data, and should be able to process queries at a rate of thousands of square kilometers per minute. While a phase 2 system might provide a prototype at a fraction of these capabilities, a detailed complexity analysis is expected to support the scalability of the system.


The proposed approach may apply to only a subset of query imagery types. For example, the proposed approach may be accurate only for urban data, or only for rural scenes. The proposer should carefully explain the likely limitations of the proposed approach and suggest methods whereby query imagery could be filtered so that only appropriate imagery is processed by the proposed system.


Proposers who can demonstrate prior completion of all of the described Phase I activities may propose a "straight to Phase II" effort. In this case the novelty of the proposed feature and retrieval approach will be a consideration in determining an award.

PHASE I:

Based on the proposed approach for feature extraction, representation, and retrieval, develop a detailed prototype implementation plan , with pseudocode that establishes feasibility against a limited reference data set.

PHASE II:

Build and test the module designed in Phase 1. Conduct an operational prototype and/or capability demonstration.

PHASE III:

This capability should allow users to restore location metadata to some percentage of media data that has been stripped of its metadata. This capability might assist in identifying archived imagery to perform legacy analysis, or assist in categorizing and organizing albums of media. This capability is of interest to commercial and government concerns alike.

KEYWORDS: digital surface models (DSMs); hashing/indexing; geolocation

References:

1. G. Baatz, O. Saurer, K. Koser and M. Pollefeys, "Large Scale Visual Geo-Location of Images in Mountainous Terrain," Proceedings of the 12th European Conference on Computer Vision (ECCV), vol. 7573, pp. 517-530, 2012.

2. A. Andoni and P. Indyk, "Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions," Comm. ACM, vol. 51, pp. 117-122, 2008.

3. T. Weyand, I. Kostrikov and J. Philbin, "PlaNet - Photo Geolocation with Convolutional Neural Networks," European Conference on Computer Vision (ECCV), pp. 6-7, 2016.

4. J. Zhu, N. Vander Valk, M. Bansal and H. Cheng, "Adaptive Rendering for Large-Scale Skyline Characterization and Matching," Computer Vision - ECCV 2012 - Workshops and Demonstrations, pp. 163-174, 2012.

5. F. Cong and C. Deng, "EFANNA: An Extremely Fast Approximate Nearest Neighbor Search Algorithm Based on kNN Graph," ArXiv, p. 1609.07228, 2016.

US Flag An Official Website of the United States Government