WoZ4U - A configurable open-source interface for the Pepper robot
By Rietz, Finn
Hey, my name is Finn Rietz, and for this year’s student exposition I would like to present to you WoZ4U, a fully configurable interface for Softbank’s Pepper robot. The main purpose of the interface is to provide a simple yet powerful and flexible tool for conducting Wizard-of-Oz-style experiments, which is currently not available for the Pepper robot. This project is the result of my visit to the “Intelligent Robotics” research group in Umeå, Sweden, during the (ongoing) 2020 summer break. We are still working on an empirical evaluation of the interface, and all code will be published under the GNU General Public License v3.0 alongside the paper.
Context
The Wizard-of-Oz methodology originates from human-computer interaction research and is used to investigate how an intelligent system would be perceived when the system is not yet technically feasible or simply not yet implemented. This is accomplished by having a human operator, referred to as the “wizard”, control how the system responds to user input, without the user knowing about this operator. The user or test participant thus thinks that he or she is interacting with a very sophisticated and advanced system when, in fact, the user is interacting with another human who controls the system. Naturally, this approach is applicable to human-robot interaction (HRI) research as well, where the target system under investigation is a robotic platform, so the human operator, aka “wizard”, controls how the robot responds while it is engaged in an interaction with a human. As such, the Wizard-of-Oz method has been used extensively by the HRI community in the last decade to investigate interactions between humans and robots. As I mentioned earlier, WoZ4U, the tool presented in this exhibit, targets Softbank’s Pepper robot. Pepper is a semi-humanoid robot, equipped with some emotion-recognition capabilities, and falls into the broad category of social robots, meaning its main purpose is to interact and communicate with humans, as opposed to, for example, a KUKA industrial robotic arm that would primarily be used in an assembly pipeline. Hence, Pepper is a popular candidate for HRI research, where the Wizard-of-Oz method is often employed. However, there is no dedicated tool for such experiments available, and researchers have to spend significant time developing their own tools that provide sufficient control over Pepper.
Thus, with the release of this tool, we not only aim to make HRI researchers more efficient, by allowing them to focus on the actual HRI rather than on developing robot-control tools, but also try to make Pepper more accessible as a research platform for researchers outside of computer science, who might not have the programming background to develop such tools on their own.
The interface
WoZ4U is implemented as a simple Flask HTTP server, making it independent of any particular operating system and lightweight, with very few requirements. It combines the convenience of an HTML + JS + CSS frontend with a powerful Python backend.
Further, the interface is fully configurable via a central YAML configuration file that stores the robot’s settings and all context-specific, i.e., experiment-specific, items. Accordingly, frontend control elements are generated for all context-specific items, so that they are easily accessible while the experiment is running. This central configuration file has the side effect of ensuring reproducibility across participants, as the Pepper robot can easily be put into the predefined state saved in the configuration file. The below figure illustrates how the configuration file populates the frontend:
For the most part, WoZ4U can be considered a wrapper around the main NAOqi API, and it rarely extends it. Hence, the main contribution is an easy-to-use control platform for the Pepper robot, making Pepper more accessible, for example for research in the social sciences. Further, the main alternative system, Choregraphe, developed by Softbank, is not meant to be a real-time controller for Pepper interaction scenarios, making it extremely cumbersome to conduct such experiments with it.
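To make the idea of the central configuration file more concrete, a minimal file might look like the following sketch. All keys, values, and paths here are illustrative assumptions, not the actual WoZ4U schema:

```yaml
# Hypothetical WoZ4U-style configuration sketch (keys are illustrative).
robot:
  ip: 192.168.1.10      # Pepper's address on the local network
  port: 9559            # default NAOqi port
speech:
  - label: "Greeting"
    text: "^start(animations/Stand/Gestures/Hey_1) Hello, nice to meet you!"
    shortcut: "1"
tablet_items:
  - label: "Consent form"
    url: "http://example.org/consent.html"
eye_colors:
  - label: "Calm blue"
    rgb: "#0000FF"
```

From such a file, the frontend would generate one control element per list entry, so the wizard can trigger each item with a click or the configured shortcut.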
Supported Features
We conducted a short literature review to identify the functions that are most relevant for HRI research with Pepper and for which the Wizard-of-Oz method is most commonly employed. Based on the insights from this review, we implemented the following features in WoZ4U:
- View camera stream and listen to audio: Being able to see what’s happening in the interaction and to listen to participants’ speech utterances is a central requirement for all Wizard-of-Oz experiments. Thus, the interface provides direct access to Pepper’s inbuilt camera and microphone. Interestingly enough, real-time audio transmission is not supported by the API and required some low-level signal-processing shenanigans ;)
- Monitor tablet activity: We found many papers where Pepper’s tablet was actively used as a communication channel, for example by prompting participants to either press a “YES” or “NO” button on the tablet. Accordingly, the ability to monitor those events is valuable and we provide a live stream of touch events from Pepper’s mounted tablet to an HTML5 canvas.
- Autonomous life configuration: We provide control over all of Pepper’s autonomous life capabilities. These are arguably very important for any Pepper HRI experiment, as they control how Pepper responds to stimuli in its environment, how engaged Pepper is in the interaction, and which pseudo-lifelike behaviors (e.g., breathing animations) are active. The complete array of settings can be set from the configuration file, ensuring the same state of the robot even after a complete restart.
- Show tablet items: Related to the above bullet point, WoZ4U makes it possible, based on paths or links provided in the configuration file, to easily show and switch between different “tablet items”, like images, videos, or external websites and web apps.
- Animated Speech: Context-specific predefined messages, like instructions or stories, can be provided via the configuration file and uttered by Pepper at any time. These messages can be annotated with NAOqi tags that embed animations (gestures) into the text; an embedded animation is executed when Pepper speaks the word at the tag’s location. This makes it possible to store complex, atomic interaction pieces in the backend, which can then be executed at the respective points in the experiment.
- Audio playback: Audio files can be provided via the configuration file and played at any time.
- Joint control and navigation: A simplistic controller for Pepper’s omnidirectional wheels is provided. Via keyboard shortcuts, Pepper’s position in the real world can be adjusted incrementally. Pepper’s head and hip joints can be controlled as well.
- Gestures: Even though gestures can implicitly be controlled via the “Animated Speech” module, we provide a dedicated section of the interface for direct access to predefined animations. As with all content-dependent items of the interface, which concrete gestures are provided and made accessible, either via mouse click or keyboard shortcut, depends on the configuration file.
- Eye LEDs: RGB colors can be defined and stored in the configuration file and then applied to Pepper’s RGB eye LEDs. The other LEDs on Pepper (shoulders and ears) are either not accessible via the API or have only one color channel. Even though we didn’t find papers on this, whether Pepper looks at you with calm blue or furious red eyes likely has some effect on the interaction, which is why we decided to support color assignment for the eye LEDs in the interface.
- Eye animations: The three available LED animations are hardcoded and made accessible via the interface, for the same reason as the above bullet point: the eye animations likely have some effect on the interaction.
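As a concrete illustration of the animated-speech format, NAOqi’s ALAnimatedSpeech accepts instruction tags such as `^start(...)`, `^wait(...)`, and `^run(...)` embedded directly in the text. The small helper below is a hypothetical sketch (not part of WoZ4U) that strips these tags to preview the plain text Pepper would utter:

```python
import re

# Matches NAOqi annotated-speech instruction tags, e.g.
# ^start(animations/Stand/Gestures/Hey_1) or ^wait(...).
_TAG_PATTERN = re.compile(r"\^(?:start|stop|wait|run|runTag|startTag|stopTag)\([^)]*\)")

def strip_animation_tags(annotated_text):
    """Return the plain spoken text of a NAOqi annotated-speech string."""
    plain = _TAG_PATTERN.sub("", annotated_text)
    # Collapse the whitespace left behind by the removed tags.
    return " ".join(plain.split())

print(strip_animation_tags(
    "^start(animations/Stand/Gestures/Hey_1) Hello there! "
    "^wait(animations/Stand/Gestures/Hey_1)"
))  # → Hello there!
```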
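The incremental keyboard navigation can be thought of as a mapping from key presses to small (forward, lateral, rotation) deltas that would then be passed to NAOqi’s `ALMotion.moveTo`. The key bindings and step sizes below are illustrative assumptions, not WoZ4U’s actual ones:

```python
# Hypothetical key bindings for incremental base movement.
# Each key maps to a (forward, lateral, rotation) delta for ALMotion.moveTo.
STEP = 0.1    # meters per key press (assumed step size)
TURN = 0.2    # radians per key press (assumed turn increment)

KEY_BINDINGS = {
    "w": (STEP, 0.0, 0.0),    # step forward
    "s": (-STEP, 0.0, 0.0),   # step backward
    "a": (0.0, STEP, 0.0),    # strafe left (omnidirectional base)
    "d": (0.0, -STEP, 0.0),   # strafe right
    "q": (0.0, 0.0, TURN),    # rotate counter-clockwise
    "e": (0.0, 0.0, -TURN),   # rotate clockwise
}

def key_to_move(key):
    """Translate a key press into a moveTo delta, or None if unbound."""
    return KEY_BINDINGS.get(key)

# On the robot, a delta would be forwarded roughly like:
#   x, y, theta = key_to_move("w")
#   motion_proxy.moveTo(x, y, theta)   # motion_proxy: an ALMotion proxy
print(key_to_move("w"))  # → (0.1, 0.0, 0.0)
```

Because each press issues one small, fixed delta, the wizard can reposition Pepper in a controlled, repeatable way rather than steering continuously.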
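For the eye LEDs, NAOqi’s `ALLeds.fadeRGB` expects the color as a single 0x00RRGGBB integer, so hex strings from a configuration file need a small conversion step. The helper below is an illustrative sketch of that conversion, not WoZ4U’s actual code:

```python
def hex_to_naoqi_rgb(hex_color):
    """Convert an '#RRGGBB' string to the 0x00RRGGBB int ALLeds.fadeRGB expects."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError("expected a color in '#RRGGBB' format: %r" % hex_color)
    return int(value, 16)

# On the robot, a configured color would be applied roughly like:
#   leds_proxy.fadeRGB("FaceLeds", hex_to_naoqi_rgb("#0000FF"), 0.5)
print(hex(hex_to_naoqi_rgb("#FF0000")))  # → 0xff0000
```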
This concludes the high-level overview of the functionality provided in WoZ4U.