TY - GEN
T1 - Automated Action Evaluation for Robotic Imitation Learning via Siamese Neural Networks
AU - Chang, Xiang
AU - Chao, Fei
AU - Shang, Changjing
AU - Shen, Qiang
N1 - Funding Information:
This work was supported by the Natural Science Foundation of Fujian Province of China (No. 2021J01002). X. Chang, F. Chao, C. Shang and Q. Shen are with the Department of Computer Science, Institute of Mathematics, Physics and Computer Science, Aberystwyth University, SY23 3DB, UK. F. Chao is also with the Department of Artificial Intelligence, School of Informatics, Xiamen University, 361005, China. *Corresponding author.
Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
AB - Despite recent advances in video-guided robotic imitation learning, many methods still rely on human experts to provide sparse rewards that indicate whether robots have successfully completed tasks. The challenge of enabling robots to autonomously evaluate whether their actions can complete complex, multi-stage tasks remains unresolved. In this work, we propose an efficient few-shot robotic learning algorithm that centres around learning and evaluating from a third-person perspective to address the aforementioned challenge. We develop a novel Siamese neural network-based robotic action-state evaluation system, named 'Behavior-Outcome Dual Assessment' (BODA), in our robotic imitation learning system, so as to replace artificial evaluations from human experts in multi-stage imitation learning processes and to improve learning efficiency. In this way, one video demonstration of a target task is divided into several stages. For each stage, we design two Siamese neural network-based evaluation modules in BODA: One module focuses on action changes, and the other handles working environment changes. The two modules work together to provide a comprehensive assessment of the robot's completion of each stage from the view of both the action and working environment changes. Then, BODA is integrated within a model-based reinforcement learning framework to enable the completion of our imitation learning cycle. Extensive experiments demonstrate that the evaluation processes of BODA can automatically and accurately evaluate task completion status without human intervention. In contrast to conventional methods, BODA is able to keep the accumulation of errors within acceptable limits through self-assessment in stages.
UR - http://www.scopus.com/inward/record.url?scp=85168662386&partnerID=8YFLogxK
U2 - 10.1109/ICRA48891.2023.10161364
DO - 10.1109/ICRA48891.2023.10161364
M3 - Conference Proceeding (Non-Journal item)
AN - SCOPUS:85168662386
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 9537
EP - 9543
BT - Proceedings - ICRA 2023
PB - IEEE Press
T2 - 2023 IEEE International Conference on Robotics and Automation, ICRA 2023
Y2 - 29 May 2023 through 2 June 2023
ER -