Human-Centred Agent Learning
Robots are expected to be used in a variety of home contexts and to interact with novice users, for example assisting dependent elderly people with daily housework (e.g., carrying and unpacking groceries, or fetching and pouring a glass of water). Enabling robots to adapt to their environment by learning context-specific tasks is necessary if they are to be used effectively by non-programming users.
Several methods have been proposed to teach new skills to robots while keeping the human in the loop. Among these, Learning from Demonstration (LfD) and Reinforcement Learning (RL) are the most common. However, the literature reports several issues with including human trainers in RL scenarios. Significant research reports a positive bias in human-generated RL rewards, and that human reward signals change as learning progresses, becoming inconsistent over time as trainers adapt their strategy. Similarly, research on the LfD approach has found that human demonstrations tend to become less robust over time. This can be explained by the difficulty human trainers have in teaching basic procedural motions: they tend to exaggerate their demonstrations or become more lenient as training goes on.
In education, a good instructor maintains a mental model of the learner's state (what has been learned and what needs clarification). This helps the teacher structure upcoming learning tasks appropriately, with timely feedback and guidance. The learner can help the instructor by expressing their internal state through communicative acts that reveal their understanding, confusion, and attention. A robot's learning parameters, however, can be overwhelming for a novice user and may increase the human's workload (leading to more inaccurate feedback and hence slower robot learning). The challenge lies in training humans to be efficient trainers and enabling them to plan, assess, and manage the robot's learning.
Another notable issue is the disengagement of humans during the training task. Teaching procedural skills to a robot learner can be time-consuming and repetitive. This often results in increased noise in human feedback, making it less reliable. Researchers have proposed several strategies for the robot to cope with this, such as detecting inconsistencies and asking for additional feedback. This project proposes to investigate how collaborative and competitive games could elicit better-quality feedback when robots are learning from humans. Inspired by instructional design, we will study how building teaching tools for human teachers can effectively improve the robot's learning. We will also aim to keep the trainer engaged for longer by identifying and integrating gamification elements in the training.
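To make the inconsistency-detection strategy concrete, a minimal sketch of one such mechanism is shown below. The aggregation rule, the spread threshold, and the state/action names are illustrative assumptions for this sketch, not the method used in the cited work: the robot records every piece of human feedback per state-action pair and queries the trainer again when the feedback for a pair is too dispersed to trust.

```python
from collections import defaultdict
from statistics import pstdev


class FeedbackAggregator:
    """Collects human feedback per state-action pair and flags pairs
    whose feedback is too inconsistent to use for learning."""

    def __init__(self, threshold=0.5):
        # Illustrative spread threshold: above it, feedback is "noisy".
        self.threshold = threshold
        self.feedback = defaultdict(list)

    def record(self, state, action, reward):
        self.feedback[(state, action)].append(reward)

    def needs_clarification(self, state, action):
        rewards = self.feedback[(state, action)]
        if len(rewards) < 2:
            return False  # too little data to judge consistency
        # Population standard deviation as a simple dispersion measure.
        return pstdev(rewards) > self.threshold


agg = FeedbackAggregator(threshold=0.5)

# A disengaged trainer gives contradictory feedback for the same situation.
for r in (1.0, -1.0, 1.0):
    agg.record("cup_on_table", "grasp", r)

# Consistent feedback for another action.
for r in (1.0, 1.0):
    agg.record("cup_on_table", "pour", r)

print(agg.needs_clarification("cup_on_table", "grasp"))  # True: ask the human again
print(agg.needs_clarification("cup_on_table", "pour"))   # False: feedback is usable
```

In a full system the clarification query would be folded into the robot's interaction policy (e.g., pausing the task to ask the trainer), but the core bookkeeping is no more than this.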
This research is conducted in collaboration with UNSW Sydney.
- Ornnalin Phaijit, Claude Sammut, and Wafa Johal. 2022. Let’s Compete! The Influence of Human-Agent Competition and Collaboration on Agent Learning and Human Perception. In Proceedings of the 10th International Conference on Human-Agent Interaction (HAI '22). Association for Computing Machinery, New York, NY, USA, 86–94. https://doi.org/10.1145/3527188.3561922