Snapbot: Enabling Dynamic Human-Robot Interactions for Real-Time Computational Photography
TL;DR
Snapbot is a manipulator-based photography system that frames, composes, and captures stylized portraits in real time — treating the human subject as a dynamic interaction partner rather than a static target. The robot continuously updates its camera viewpoint as the subject moves, applies stylization on the fly, and timing-locks the shutter to compositional cues.
Why this matters
Existing robotic photography demos either (a) wait for the human to hold still and then take a fixed shot, or (b) record continuously and post-process. Both miss the interactive feedback loop that makes a portrait good: the photographer suggesting a pose, the subject responding, the photographer re-framing. Snapbot is an attempt to put a robot inside that loop.
What we built
- Platform: a 6-DoF manipulator with an end-effector-mounted camera, framed as a real-time vision-control system.
- Composition policy: a lightweight head-pose and rule-of-thirds estimator that scores candidate viewpoints and drives the manipulator toward the highest-scoring one.
- Stylization: an AnimeGAN filter pipeline that runs on the live preview, so the subject sees the stylized output as they pose.
- Shutter timing: triggers when the composition score crosses a threshold and the subject’s head pose has been stable for ≥ 200 ms, so the capture lands on the intentional pose rather than on the transition.
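The composition scoring and shutter rule above can be sketched roughly as follows. This is a minimal illustration, not the deployed system: the thresholds, the linear fall-off in `composition_score`, and all names (`ShutterTimer`, `POSE_EPS_DEG`, etc.) are assumptions made for exposition.

```python
import math

SCORE_THRESHOLD = 0.8   # minimum composition score before the shutter may fire
STABLE_WINDOW_S = 0.2   # required head-pose stability window (200 ms)
POSE_EPS_DEG = 5.0      # head-pose deviation (degrees) still counted as stable

def composition_score(face_x, face_y):
    """Score a framing by how close the face center sits to the nearest
    rule-of-thirds intersection; coordinates are normalized to [0, 1]."""
    thirds = (1.0 / 3.0, 2.0 / 3.0)
    d = min(math.hypot(face_x - tx, face_y - ty)
            for tx in thirds for ty in thirds)
    # 1.0 at an intersection, falling off linearly with distance
    return max(0.0, 1.0 - d / math.hypot(1.0 / 3.0, 1.0 / 3.0))

class ShutterTimer:
    """Fires once the score clears the threshold AND the head pose has held
    within POSE_EPS_DEG of an anchor pose for at least STABLE_WINDOW_S."""

    def __init__(self):
        self.anchor = None        # (yaw, pitch) the stability clock measures against
        self.stable_since = 0.0   # timestamp when the current anchor was set

    def update(self, t, score, yaw, pitch):
        """Feed one frame (time in seconds, pose in degrees); return True to fire."""
        if (self.anchor is None
                or abs(yaw - self.anchor[0]) > POSE_EPS_DEG
                or abs(pitch - self.anchor[1]) > POSE_EPS_DEG):
            # pose moved too far: restart the stability clock here
            self.anchor = (yaw, pitch)
            self.stable_since = t
        return (t - self.stable_since) >= STABLE_WINDOW_S and score >= SCORE_THRESHOLD
```

Anchoring stability to a fixed reference pose (rather than frame-to-frame deltas) keeps slow drift from registering as "stable", which matters when the subject is continuously moving.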
What we showed at HRI 2024
A live demo of the system framing, composing, and capturing stylized portraits with the subject continuously moving. The contribution is the interaction model, not the imaging pipeline — making the manipulator a participant in the portrait session rather than a tripod with extra steps.
Related work
This system grew out of the AnimeGAN Filter Photographer project (SW Talent Festival 2023, ICT President’s Award), where the same composition + stylization stack ran under safety-RL constraints on a fixed-base arm. Snapbot generalizes that work to dynamic, free-moving subjects.
What’s next
- Subject-driven framing: let the human nudge the robot’s framing via gesture or gaze rather than only by moving their head.
- Multi-subject scenes: group portraits with active recomposition as people enter/leave the frame.
BibTeX
@inproceedings{choi2024snapbot,
  title     = {Snapbot: Enabling Dynamic Human-Robot Interactions for Real-Time Computational Photography},
  author    = {Choi, Chanyeok and Lee, Youngmoon},
  booktitle = {ACM/IEEE International Conference on Human-Robot Interaction (HRI), Late-Breaking Report},
  year      = {2024},
}