LCSR Seminar: Dan Bohus “Situated Interaction”
Situated language interaction is a complex, multimodal affair that extends well beyond the spoken word. When interacting, we use a wide array of non-verbal signals and incrementally coordinate with each other to simultaneously resolve several problems: we manage engagement, coordinate on taking turns, recognize intentions, and establish and maintain common ground as a basis for contributing to the conversation. Proximity and body pose, attention and gaze, head nods and hand gestures, prosody and facial expressions all play very important roles in this process. Just as advances in speech recognition opened up the field of spoken dialog systems a couple of decades ago, current advances in vision and other perceptual technologies are again opening up new horizons: we are starting to be able to build machines that computationally understand these social signals and the physical world around them, and participate in physically situated interactions and collaborations with people.
In this talk, using a number of research vignettes from work we have done over the last decade at Microsoft Research, I will draw attention to some of the challenges and opportunities that lie ahead of us in this exciting space. In particular, I will discuss issues with managing engagement and turn-taking in multiparty open-world settings, and more generally highlight the importance of timing and fine-grained coordination in situated language interaction. Finally, I will conclude by describing an open-source framework we are developing that promises to simplify the construction of physically situated interactive systems, and in the process further enable and accelerate research in this area.
Dan Bohus is a Senior Principal Researcher in the Perception and Interaction Group at Microsoft Research. His work centers on the study and development of computational models for physically situated spoken language interaction and collaboration. The long-term question that shapes his research agenda is: how can we enable interactive systems to reason more deeply about their surroundings and seamlessly participate in open-world, multiparty dialog and collaboration with people? Prior to joining Microsoft Research, Dan obtained his Ph.D. from Carnegie Mellon University.