[1]
韦运余. 气振盘式精密播种流水线及控制系统设计与试验研究[D]. 镇江: 江苏大学, 2020.
Wei Yunyu. Design of control system and experimental study on assembly line for vacuum-vibration tray precision seeding [D]. Zhenjiang: Jiangsu University, 2020.
[2]
赵毓, 管公顺, 郭继峰, 等. 基于多智能体强化学习的空间机械臂轨迹规划[J]. 航空学报, 2021, 42(1): 259-269.
Zhao Yu, Guan Gongshun, Guo Jifeng, et al. Trajectory planning of space manipulator based on multi-agent reinforcement learning [J]. Acta Aeronautica et Astronautica Sinica, 2021, 42(1): 259-269.
[3]
Iriondo A, Lazkano E, Susperregi L, et al. Pick and place operations in logistics using a mobile manipulator controlled with deep reinforcement learning [J]. Applied Sciences, 2019, 9(2): 348.
[4]
Deng Z, Zhang J. Learning synergies-based in-hand manipulation with reward shaping [J]. CAAI Transactions on Intelligence Technology, 2020, 5(2): 1-9.
[5]
Wang Z, Li H, Wu Z, et al. A pretrained proximal policy optimization algorithm with reward shaping for aircraft guidance to a moving destination in three-dimensional continuous space [J]. International Journal of Advanced Robotic Systems, 2021, 18(1): 1729881421989546.
[6]
Tang C Y, Liu C H, Chen W K, et al. Implementing action mask in proximal policy optimization (PPO) algorithm [J]. ICT Express, 2020, 6(3): 200-203.
[7]
Kim S, Jang I, Kim H, et al. Learning robot manipulation based on modular reward shaping [C]. 2020 International Conference on Information and Communication Technology Convergence (ICTC). IEEE, 2020: 883-886.
[8]
Sangiovanni B, Rendiniello A, Incremona G P, et al. Deep reinforcement learning for collision avoidance of robotic manipulators [C]. 2018 European Control Conference (ECC). IEEE, 2018: 2063-2068.
[9]
Dewey D. Reinforcement learning and the reward engineering principle [C]. 2014 AAAI Spring Symposium Series, 2014.
[10]
龚智强. 气吸振动盘式精密排种装置理论与试验研究[D]. 镇江: 江苏大学, 2013.
Gong Zhiqiang. Theoretical and experimental study on vacuum-vibration tray precision seeding device [D]. Zhenjiang: Jiangsu University, 2013.
[11]
Schulman J, Levine S, Abbeel P, et al. Trust region policy optimization [C]. International Conference on Machine Learning. PMLR, 2015: 1889-1897.
[12]
张振, 黄炎焱, 张永亮, 等. 基于近端策略优化的作战实体博弈对抗算法[J]. 南京理工大学学报, 2021, 45(1): 77-83.
Zhang Zhen, Huang Yanyan, Zhang Yongliang, et al. Battle entity confrontation algorithm based on proximal policy optimization [J]. Journal of Nanjing University of Science and Technology, 2021, 45(1): 77-83.
[13]
Bellman R. A Markovian decision process [J]. Journal of Mathematics and Mechanics, 1957: 679-684.
[14]
Zhou D, Jia R, Yao H, et al. Robotic arm motion planning based on residual reinforcement learning [C]. 2021 13th International Conference on Computer and Automation Engineering (ICCAE). IEEE, 2021: 89-94.
[15]
祝亢, 黄珍, 王绪明. 基于深度强化学习的智能船舶航迹跟踪控制[J]. 中国舰船研究, 2021, 16(1): 105-113.
Zhu Kang, Huang Zhen, Wang Xuming. Tracking control of intelligent ship based on deep reinforcement learning [J]. Chinese Journal of Ship Research, 2021, 16(1): 105-113.
[16]
Schulman J, Wolski F, Dhariwal P, et al. Proximal policy optimization algorithms [J]. arXiv preprint arXiv:1707.06347, 2017.
[17]
Zhang D, Bailey C P. Obstacle avoidance and navigation utilizing reinforcement learning with reward shaping [C]. Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications II. International Society for Optics and Photonics, 2020, 11413: 114131H.
[18]
杨惟轶, 白辰甲, 蔡超, 等. 深度强化学习中稀疏奖励问题研究综述[J]. 计算机科学, 2020, 47(3): 182-191.
Yang Weiyi, Bai Chenjia, Cai Chao, et al. Survey on sparse reward in deep reinforcement learning [J]. Computer Science, 2020, 47(3): 182-191.
[19]
Ng A Y, Harada D, Russell S. Policy invariance under reward transformations: Theory and application to reward shaping [C]. Proceedings of the Sixteenth International Conference on Machine Learning, 1999, 99: 278-287.
[20]
任建新, 黄民, 刘相权, 等. 基于Visual Studio与V-REP的货物拣选机器人联合仿真[J]. 重庆理工大学学报(自然科学), 2020, 34(8): 87-94.
Ren Jianxin, Huang Min, Liu Xiangquan, et al. Co-simulation of cargo picking robot using Visual Studio and V-REP [J]. Journal of Chongqing University of Technology (Natural Science), 2020, 34(8): 87-94.
[21]
占宏, 王剑城. 基于Web与V-REP的机器人远程控制虚拟仿真平台[J]. 计算技术与自动化, 2021, 40(2): 16-20.
Zhan Hong, Wang Jiancheng. Virtual simulation platform for robots remote control based on web and V-REP [J]. Computing Technology and Automation, 2021, 40(2): 16-20.