Robotic Reach, Grasp, and Pick-and-Place using Combined Reinforcement Learning and Traditional Controls

Lobbezoo, Andrew

dc.contributor.author	Lobbezoo, Andrew
dc.date.accessioned	2022-09-01 15:05:05 (GMT)
dc.date.available	2023-09-02 04:50:05 (GMT)
dc.date.issued	2022-09-01
dc.date.submitted	2022-08-23
dc.identifier.uri	http://hdl.handle.net/10012/18696
dc.description.abstract	Electrically actuated robotic arms have been used to complete tasks that are repetitive, strenuous, and/or dangerous since they were first developed in the 1970s. More than 50 years have passed since initial development; however, robots in factories today are still operated with conventional control strategies that require individual programming on a task-by-task basis, with no margin for error. The implementation of conventional controls relies on experienced technicians and skilled robotic engineers sending commands through graphical or text-based programming interfaces to perform simple actions. Although automation has been shown to drastically increase productivity and reduce workplace injuries, the initial time and R&D cost of setting up robotic agents with traditional methods is presently too large for many firms. As an alternative to traditional operation planning and task programming, machine learning has shown significant promise with the development of reinforcement learning (RL)-based control strategies. With RL, robotic agents can be presented with a task that they learn to solve by exploring various action sequences in the real world or on internal simulated models of the environment. Some RL applications exist; however, most examples are based on relatively simple video games and basic robotic tasks (inverted pendulum, vector-based reach, and so on). Additionally, the documentation for much of this research is limited, there is little real-world testing, and there is room for significant improvement in performance. The objective of this project is to implement RL-based control strategies in simulated and real environments to validate the RL approach for standard industrial tasks such as reach, grasp, and pick-and-place.
The goal of this approach is to bring intelligence to robotic control so that tasks can be completed without precisely defining the environment, target object positions, and action plan. To achieve the primary objective of this research, the following sub-objectives were pursued: 1) develop a custom simulation task environment, 2) create an RL pipeline for tuning and training a robotic RL agent, 3) develop a methodology for a novel semi-supervised RL system for improving image-based RL, 4) set up the Panda robot and establish a communication, control, and path planning system, and 5) tune, train, and test simulated and real-world RL-based control. After developing the environments, creating the training and tuning framework, and establishing the real-world robotic control in objectives 1 to 4, extensive training and testing were conducted in objective 5. Results from testing showed that model performance was highly dependent on task difficulty. Training an RL network in simulation with coordinate-based positional feedback yielded a high task completion rate. For this simulation set, the robotic agents were able to independently learn to complete tasks with high precision and repeatability. Training a network in simulation with image-based positional feedback produced comparatively poor results. For the image-based tasks, the agent converged on sub-optimal solutions and underperformed expectations due to difficulties in training the CNN positional location extractor. To overcome the issues with image-based RL training, the novel semi-supervised RL approach was implemented and tested. The results from this testing indicate that RL training performed well with image-based inputs given a pre-trained feature extractor. The semi-supervised methodology shows potential; however, this approach has the downside of requiring additional data collection for supervised training.
After training in simulation, real-world reach, grasp, and pick-and-place testing was completed with coordinate-based positional inputs. The real-world testing validated the communication framework between the simulated and real environments and indicated that policy transfer to the real world was possible. Accuracy of coordinate-based reach, grasp, and pick-and-place was reduced by 10-20% compared to the simulation environment, which indicates that additional model calibration is required. The results from this research provide encouraging preliminary data on the application of RL to robotics. Further research is required to bridge the gaps in image-based learning and should include network generalization, domain adaptation, and imitation learning.	en
dc.language.iso	en	en
dc.publisher	University of Waterloo	en
dc.subject	simulation environment	en
dc.subject	reinforcement learning	en
dc.subject	Markov decision process	en
dc.subject	machine learning	en
dc.subject	robotic control	en
dc.subject	pick-and-place	en
dc.title	Robotic Reach, Grasp, and Pick-and-Place using Combined Reinforcement Learning and Traditional Controls	en
dc.type	Master Thesis	en
dc.pending	false
uws-etd.degree.department	Mechanical and Mechatronics Engineering	en
uws-etd.degree.discipline	Mechanical Engineering	en
uws-etd.degree.grantor	University of Waterloo	en
uws-etd.degree	Master of Applied Science	en
uws-etd.embargo.terms	1 year	en
uws.contributor.advisor	Kwon, Hyock Ju
uws.contributor.affiliation1	Faculty of Engineering	en
uws.published.city	Waterloo	en
uws.published.country	Canada	en
uws.published.province	Ontario	en
uws.typeOfResource	Text	en
uws.peerReviewStatus	Unreviewed	en
uws.scholarLevel	Graduate	en

Files in this item

Name: Lobbezoo_Andrew.pdf
Size: 3.814 MB
Format: PDF
