Yuxiang Cui | publications

2020

Learning World Transition Model for Socially Aware Robot Navigation

Cui, Yuxiang, Zhang, Haodong, Wang, Yue, and Xiong, Rong

arXiv preprint arXiv:2011.03922, 2020. Accepted at ICRA 2021.


Moving in dynamic pedestrian environments is one of the important requirements for autonomous mobile robots. We present a model-based reinforcement learning approach for robots to navigate through crowded environments. The navigation policy is trained with both real interaction data from multi-agent simulation and virtual data from a deep transition model that predicts the evolution of the dynamics surrounding the mobile robots. A reward function that accounts for social conventions is designed to guide the training of the policy. Specifically, the policy model takes a laser scan sequence and the robot’s own state as input and outputs a steering command. The laser sequence is further transformed into stacked local obstacle maps disentangled from the robot’s ego motion to separate the static and dynamic obstacles, simplifying the model training. We observe that the policy using our method can be trained with significantly less real interaction data in the simulator while achieving a similar success rate in social navigation tasks compared with other methods. Experiments are conducted in multiple social scenarios both in simulation and on real robots; the learned policy guides the robots to the final targets successfully in a socially compliant manner.

2021

Socially-Aware Multi-Agent Following with 2D Laser Scans via Deep Reinforcement Learning and Potential Field

Cui, Yuxiang, Huang, Xiaolong, Wang, Yue, and Xiong, Rong

In 2021 IEEE International Conference on Real-time Computing and Robotics (RCAR) 2021


Target following in dynamic pedestrian environments is an important task for mobile robots. However, it is challenging to keep tracking the target while avoiding collisions in crowded environments, especially with only one robot. In this paper, we propose a multi-agent method for an arbitrary number of robots to follow the target in a socially-aware manner using only 2D laser scans. The multi-agent following problem is tackled by utilizing the complementary strengths of both reinforcement learning and potential field, in which the reinforcement learning part handles local interactions while navigating to the goals assigned by the potential field. Specifically, with the help of laser scans in obstacle map representation, the learning-based policy can help the robots avoid collisions in advance with both static obstacles and dynamic obstacles such as pedestrians, that is, in a socially aware manner. The formation control and goal assignment for each robot are obtained from a target-centered potential field constructed using aggregated state information from all the following robots. Experiments are conducted in multiple settings, including random obstacle distributions and different numbers of robots. Results show that our method works successfully in unseen dynamic environments. The robots can follow the target in a socially compliant manner with only 2D laser scans.

Learning Observation-Based Certifiable Safe Policy for Decentralized Multi-Robot Navigation

Cui, Yuxiang, Lin, Longzhong, Huang, Xiaolong, Zhang, Dongkun, Wang, Yue, and Xiong, Rong

2021 Under Review


Safety is of great importance in multi-robot navigation problems. In this paper, we propose a control barrier function (CBF) based optimizer that ensures robot safety with both high probability and flexibility, using only sensor measurements. The optimizer takes action commands from the policy network as initial values and then refines them to drive the potentially dangerous ones back into safe regions. With the help of a deep transition model that predicts the evolution of surrounding dynamics and the consequences of different actions, the CBF module can guide the optimization over a reasonable time horizon. We also present a novel joint training framework that improves the cooperation between the Reinforcement Learning (RL) based policy and the CBF-based optimizer in both training and inference by utilizing reward feedback from the CBF module. We observe that the policy using our method can achieve a higher success rate while maintaining the safety of multiple robots in significantly fewer episodes compared with other methods. Experiments are conducted in multiple scenarios both in simulation and in the real world; the results demonstrate the effectiveness of our method in maintaining the safety of multi-robot navigation.

Human-Robot Motion Retargeting via Neural Latent Optimization

Zhang, Haodong, Li, Weijie, Liang, Yuwei, Chen, Zexi, Cui, Yuxiang, Wang, Yue, and Xiong, Rong

arXiv preprint arXiv:2103.08882 2021
