• Perceiving the 3d world from single images
  • Aleotti, Filippo <1993>

Subject

  • ING-INF/05 Sistemi di elaborazione delle informazioni (Information Processing Systems)

Description

  • Depth is a crucial piece of information in many practical applications, such as obstacle avoidance and environment mapping. It can be provided either by active sensors, such as LiDARs, or by passive devices, such as cameras. A popular passive device is the binocular rig, which triangulates the depth of the scene using two synchronized and aligned cameras. However, many devices already deployed in existing infrastructure, such as most surveillance cameras, are monocular passive sensors. Recovering depth from a single image is intrinsically ambiguous, which makes monocular depth estimation a challenging task. Nevertheless, recent progress in deep learning is paving the way towards a new class of algorithms able to handle this complexity. This work addresses several topics related to the monocular depth estimation problem. It presents networks capable of predicting accurate depth values even on embedded devices, without requiring expensive ground-truth labels at training time. Moreover, it introduces strategies to estimate the uncertainty of these models, and it shows that monocular networks can generate training labels for different tasks at scale. Finally, it evaluates off-the-shelf monocular depth predictors for the use case of social distance monitoring, and shows how this technology overcomes the limitations of existing strategies.

Date

  • 2022-06-14

Type

  • Doctoral Thesis
  • PeerReviewed

Format

  • application/pdf

Identifier

urn:nbn:it:unibo-28541

Aleotti, Filippo (2022) Perceiving the 3d world from single images, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. Dottorato di ricerca in Monitoraggio e gestione delle strutture e dell'ambiente - sehm2, 34 Ciclo. DOI 10.48676/unibo/amsdottorato/10228.

Relations