3D pointing gestures as target selection tools: guiding monocular UAVs during window selection in an outdoor environment

Anna C.S. Medeiros, Photchara Ratsamee, Jason Orlosky, Yuki Uranishi, Manabu Higashida, Haruo Takemura

Research output: Contribution to journal › Article › peer-review

10 Scopus citations


Firefighters need information from both inside and outside buildings in first-response emergency scenarios, and drones are well suited to gathering it. This paper presents an elicitation study that showed firefighters’ desire to collaborate with autonomous drones. We developed a Human–Drone interaction (HDI) method for indicating a target to a drone using 3D pointing gestures estimated solely from a monocular camera. The participant points to a window without using any wearable or body-attached device; the drone detects the gesture through its front-facing camera and computes the target window. This work describes the process for choosing the gesture, detecting and localizing objects, and carrying out the transformations between coordinate systems. Our proposed 3D pointing gesture interface improves on 2D interfaces by integrating depth information with SLAM and by resolving the ambiguity that arises when multiple objects are aligned on the same plane in a large-scale outdoor environment. Experimental results showed that our 3D pointing gesture interface obtained average F1 scores (combining precision and recall) of 0.85 in simulation and 0.73 in real-world experiments, and an F1 score of 0.58 at the maximum distance of 25 m between the drone and the building.
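As a rough illustration of how a 3D pointing gesture can single out one window among several on the same facade, the sketch below casts a ray through two arm joints and intersects it with the wall plane. It is not the paper's implementation. It assumes the joints are already expressed in a world frame (the abstract attributes this step to SLAM and coordinate transformations), that the facade is the plane y = wall_y, and that windows are axis-aligned rectangles. The choice of elbow and wrist as the ray's joints, and the names `Window` and `select_window`, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]  # (x, y, z) in metres, world frame (assumed)


@dataclass(frozen=True)
class Window:
    """Hypothetical axis-aligned window on the facade plane."""
    name: str
    center: Vec3        # window centre in world coordinates
    half_width: float   # half extent along world x
    half_height: float  # half extent along world z (up)


def select_window(elbow: Vec3, wrist: Vec3,
                  windows: Sequence[Window], wall_y: float) -> Optional[Window]:
    """Return the window hit by the elbow->wrist ray, or None.

    Assumption: the facade is the plane y = wall_y and both joints are
    already given in the same world frame as the windows.
    """
    # Ray direction from elbow through wrist.
    dx, dy, dz = (wrist[0] - elbow[0], wrist[1] - elbow[1], wrist[2] - elbow[2])
    if abs(dy) < 1e-9:
        return None  # ray is parallel to the wall plane
    t = (wall_y - elbow[1]) / dy
    if t <= 0:
        return None  # user is pointing away from the wall
    # Intersection point of the ray with the wall plane.
    hit_x = elbow[0] + t * dx
    hit_z = elbow[2] + t * dz
    for w in windows:
        inside_x = abs(hit_x - w.center[0]) <= w.half_width
        inside_z = abs(hit_z - w.center[2]) <= w.half_height
        if inside_x and inside_z:
            return w
    return None


# Two windows at the same height on a wall 10 m away.
WINDOWS = [
    Window("left", (-2.0, 10.0, 3.0), 0.5, 0.75),
    Window("right", (2.0, 10.0, 3.0), 0.5, 0.75),
]
# The arm points slightly right and up, so the ray lands in the "right" window.
target = select_window((0.0, 0.0, 1.5), (0.06, 0.3, 1.545), WINDOWS, wall_y=10.0)
```

Because the ray is intersected in 3D, the two windows are separated by where the ray meets the wall rather than by where the arm appears in the image, which is the kind of ambiguity the abstract says a 2D interface struggles with.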

Original language: English (US)
Article number: 14
Journal: ROBOMECH Journal
Issue number: 1
State: Published - Dec 2021
Externally published: Yes


Keywords

  • Gestural interface
  • Gesture development process
  • Human–Drone interaction
  • Object selection
  • Pointing gesture

ASJC Scopus subject areas

  • Modeling and Simulation
  • Instrumentation
  • Mechanical Engineering
  • Control and Optimization
  • Artificial Intelligence

