Regular Session #6

Human-Robot Mutual Adaptation in Shared Autonomy

Stefanos Nikolaidis, Yu Xiang Zhu, David Hsu, Siddhartha Srinivasa

In shared autonomy, user inputs and robot autonomy are combined to control a robot to achieve the user’s intended goal. However, if the operator is unaware of the robot’s capabilities and limitations, they may guide the robot towards a suboptimal goal. On the other hand, the robot may know the optimal way of completing the task. Our objective is to improve team performance by having the robot guide the operator towards a new goal, while retaining their trust. We achieve this through a human-robot mutual adaptation formalism. We integrate a bounded-memory adaptation model of the human into a partially observable stochastic model, which enables robot adaptation to the human: when the human is adaptable, the robot will guide the human towards an optimal goal unknown to them in advance. Otherwise, it will adapt to the human, retaining their trust. In contrast to the collaborative scenarios examined in previous work, we account for partial observability of human and robot goals, and we explicitly penalize disagreement between the operator and the robot. We show in a human subject experiment that the proposed formalism significantly improved human-robot team performance, compared to a robot that followed participants’ preferences, while retaining a high level of operator trust in the robot.
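
To make the formalism concrete, here is a minimal sketch of the bounded-memory adaptation idea; the candidate adaptability levels, the two-outcome observation model, and the guide/comply threshold are all assumptions for illustration, not the paper’s POMDP formulation:

```python
# Illustrative sketch: the robot maintains a belief over the operator's
# "adaptability" alpha -- the probability that the human switches to the
# robot's suggested goal -- and guides only when expected adaptability is high.
ALPHAS = [0.0, 0.25, 0.5, 0.75, 1.0]        # candidate adaptability levels

def update_belief(belief, human_followed_robot):
    """Bayesian update of P(alpha) after observing whether the human
    moved toward the robot's suggested goal."""
    posterior = []
    for alpha, p in zip(ALPHAS, belief):
        likelihood = alpha if human_followed_robot else 1.0 - alpha
        posterior.append(p * likelihood)
    total = sum(posterior) or 1e-9           # guard against zero mass
    return [p / total for p in posterior]

def robot_action(belief, threshold=0.5):
    """Guide toward the robot-optimal goal if the human seems adaptable;
    otherwise comply with the human's goal to preserve trust."""
    expected_alpha = sum(a * p for a, p in zip(ALPHAS, belief))
    return "guide" if expected_alpha > threshold else "comply"

belief = [1.0 / len(ALPHAS)] * len(ALPHAS)   # uniform prior over alpha
for followed in (False, False, True):        # simulated observations
    belief = update_belief(belief, followed)
    print(robot_action(belief), [round(p, 2) for p in belief])
```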


Improving Robot Controller Interpretability and Transparency Through Autonomous Policy Explanation

Bradley Hayes, Julie Shah

Shared expectations and mutual understanding are critical facets of teamwork. Achieving these in human-robot collaborative contexts can be especially challenging, as humans and robots are unlikely to share a common language to convey intentions, plans, or justifications. Even in cases where human co-workers can inspect a robot’s control code, and particularly when statistical methods are used to encode control policies, there is no guarantee that meaningful insights into a robot’s behavior can be derived or that a human will be able to efficiently isolate the behaviors relevant to the interaction. We present a series of algorithms and an accompanying system that enables robots to autonomously synthesize policy descriptions and respond to both general and targeted queries by human collaborators. We demonstrate applicability to a variety of robot controller types including those that utilize conditional logic, tabular reinforcement learning, and deep reinforcement learning, synthesizing informative policy descriptions for collaborators and facilitating fault diagnosis by non-experts.
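
As a concrete illustration of one ingredient of such descriptions, the sketch below inverts a tabular greedy policy so that states sharing the same action can be summarized in a single statement; the toy Q-table and the English phrasing are assumptions for illustration, not the authors’ system:

```python
from collections import defaultdict

# Toy Q-table: Q[state][action] for a small pick-and-place task
Q = {
    "at_shelf":   {"pick": 0.9, "wait": 0.1},
    "at_table":   {"place": 0.8, "wait": 0.2},
    "hand_full":  {"place": 0.7, "pick": 0.0},
    "hand_empty": {"pick": 0.6, "wait": 0.3},
}

def describe_policy(q_table):
    """Invert the greedy policy: action -> states where it is taken,
    then phrase each group as a single policy description."""
    by_action = defaultdict(list)
    for state, values in q_table.items():
        best = max(values, key=values.get)
        by_action[best].append(state)
    for action, states in by_action.items():
        print(f"I {action} when I am in: {', '.join(sorted(states))}")

describe_policy(Q)
# -> I pick when I am in: at_shelf, hand_empty
# -> I place when I am in: at_table, hand_full
```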


Is a Robot a Better Walking Partner If It Associates Utterances with Visual Scenes?

Ryusuke Totsuka, Satoru Satake, Takayuki Kanda, Michita Imai

We aim to develop a walking partner robot with the capability to select small-talk topics that are associative to visual scenes. We first collected video sequences from five different locations and prepared a dataset of small-talk topics associated with visual scenes. Then we developed a technique to associate the visual scenes with the small-talk topics. We converted visual scenes into lists of words using an off-the-shelf vision library and formed a topic space with a Latent Dirichlet Allocation (LDA) method, in which a list of words is transformed into a topic vector. Finally, the system selects the candidate utterance whose topic vector is most similar to that of the current scene. We tested the developed technique on our dataset, where it selected appropriate utterances 72% of the time, and conducted an outdoor user study in which participants took a walk with a small robot on their shoulder and engaged in small talk. We confirmed that participants perceived the robot using our technique more favorably than a robot that selected utterances at random, because it chose appropriate utterances; they also felt that the former was a better walking partner.
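
A minimal sketch of the described pipeline, assuming scikit-learn’s LDA implementation and a toy corpus in place of the collected dataset: scene words (as a vision library might produce) and candidate utterances are projected into a shared topic space, and the utterance closest to the scene’s topic vector is selected:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_similarity

utterances = [
    "the cherry blossoms along the river are beautiful",
    "this street has many small cafes and bakeries",
    "the station is always crowded in the evening",
]
scene_words = "river water trees blossoms path"   # words from the vision library

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(utterances)          # bag-of-words per utterance
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

topic_utts = lda.transform(X)                     # utterance topic vectors
topic_scene = lda.transform(vectorizer.transform([scene_words]))

# Pick the utterance whose topic vector best matches the scene
best = cosine_similarity(topic_scene, topic_utts).argmax()
print(utterances[best])
```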


Game-Theoretic Modeling of Human Adaptation in Human-Robot Collaboration

Stefanos Nikolaidis, Swaprava Nath, Ariel Procaccia, Siddhartha Srinivasa

In human-robot teams, humans often start with an inaccurate model of the robot capabilities. As they interact with the robot, they infer the robot’s capabilities and partially adapt to the robot: they might change their actions based on the observed outcomes of their own and the robot’s actions, without adopting the robot policy as their own. We present a game-theoretic model of human partial adaptation to the robot, where the human responds to the robot actions by maximizing a reward function that changes stochastically over time, capturing the evolution of their expectations of the robot capabilities. The robot can then use this model to decide optimally between taking actions that reveal its capabilities to the human and taking the best action given the information that the human currently has. We prove that under certain observability assumptions, the optimal policy can be computed efficiently. We demonstrate through a human subject experiment that the proposed model significantly improves human-robot team performance, compared to policies that assume complete adaptation of the human to the robot.
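
The core trade-off the robot reasons about can be illustrated with a toy value comparison; the rewards and horizon below are assumptions for illustration, not the paper’s formal model:

```python
# Toy comparison: a revealing action costs reward now but updates the human's
# expectation of the robot's capability; the myopic action is best under the
# human's current (inaccurate) model.
HORIZON = 5
R_REVEAL = 0.2      # immediate reward of demonstrating the capability
R_MYOPIC = 0.5      # per-step reward under the human's current expectations
R_INFORMED = 1.0    # per-step reward once the human knows the capability

def value(reveal_first):
    if reveal_first:  # pay once, then collect the informed reward
        return R_REVEAL + (HORIZON - 1) * R_INFORMED
    return HORIZON * R_MYOPIC

print("reveal:", value(True), " exploit:", value(False))
# With these numbers, revealing early dominates over the horizon.
```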


Towards Adaptive Social Behavior Generation for Assistive Robots Using Reinforcement Learning

Jacqueline Hemminghaus, Stefan Kopp

In this paper we explore whether a social robot can learn – in and from a task-oriented interaction with a human user – how to employ different social behaviors to achieve interactional goals in specific situational circumstances. We present a multimodal behavior generation architecture that maps high-level behaviors with interactional functions onto low-level behaviors executable by a robot. While high-level behaviors are selected based on the state of the user as well as the interaction, reinforcement learning (Q-learning) is used within each behavior to adapt its local mapping onto lower-level behaviors. The approach is implemented and applied in a scenario in which a social robot (Furhat) assists a human player in solving a Memory game by guiding the attention of the user to specific objects. Results of an evaluation study are reported which demonstrate that participants are able to solve the Memory game faster with the adaptive, assistive robot.
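
A minimal sketch of the within-behavior learning, where the behavior names, the reward signal, and the stateless bandit-style Q update are illustrative assumptions rather than the authors’ architecture:

```python
import random

# Low-level realizations of one high-level behavior, e.g. "guide attention"
LOW_LEVEL = ["gaze_at_object", "point_gesture", "verbal_hint"]
q = {a: 0.0 for a in LOW_LEVEL}          # Q-values for this behavior
alpha, epsilon = 0.3, 0.2                # learning rate, exploration rate

def user_reacted(action):
    """Stand-in for sensing whether the user attended to the target object."""
    success_prob = {"gaze_at_object": 0.4, "point_gesture": 0.7, "verbal_hint": 0.6}
    return random.random() < success_prob[action]

for _ in range(200):
    # epsilon-greedy selection of the low-level behavior
    a = random.choice(LOW_LEVEL) if random.random() < epsilon else max(q, key=q.get)
    r = 1.0 if user_reacted(a) else 0.0
    q[a] += alpha * (r - q[a])           # stateless (bandit-style) Q update

print(max(q, key=q.get), {k: round(v, 2) for k, v in q.items()})
```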


The When, Where, and How: An Adaptive Robotic Info-Terminal for Care Home Residents – A Long-Term Study

Marc Hanheide, Denise Hebesberger, Tomas Krajnik

Adapting to users’ intentions is a key requirement for autonomous robots in general, and in care settings in particular. In this paper, a comprehensive long-term study of a mobile robot providing information services to residents, visitors, and staff of a care home is presented, with a focus on adapting when and where the robot should offer its services to best accommodate users’ needs. Rather than following a fixed schedule, the presented system uses the opportunity of long-term deployment to explore the space of possible interactions while concurrently exploiting the learned model to provide better services. To provide effective services to users in a care home, however, not only the when and where are relevant, but also how the information is provided and accessed. Hence, the usability of the deployed system is also studied specifically, in order to provide a comprehensive overall assessment of a robotic info-terminal implementation in a care setting. Our results support our hypotheses (i) that learning a spatiotemporal model of users’ intentions improves the efficiency and usefulness of the system, and (ii) that the specific information sought is indeed dependent on the location at which the info-terminal is offered.
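
A hedged sketch of the explore/exploit scheduling idea: the robot keeps an interaction-rate estimate per (hour, location) cell and mixes greedy exploitation with occasional exploration. The cells and the epsilon-greedy rule are illustrative assumptions, not the deployed system’s spatiotemporal model:

```python
import random
from collections import defaultdict

stats = defaultdict(lambda: [0, 0])   # (hour, location) -> [interactions, visits]

def choose_slot(slots, epsilon=0.1):
    if random.random() < epsilon:      # explore an arbitrary slot
        return random.choice(slots)
    def rate(s):                       # exploit: best observed interaction rate
        interactions, visits = stats[s]
        return interactions / visits if visits else 0.0
    return max(slots, key=rate)

def record(slot, interactions):
    stats[slot][0] += interactions
    stats[slot][1] += 1

slots = [(h, loc) for h in (9, 11, 14, 16) for loc in ("lobby", "dining", "ward")]
record((11, "dining"), 5)
record((9, "lobby"), 1)
print(choose_slot(slots))
```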

Event Timeslots

Wed, Mar 8
Robots and People Adapting to Each Other