Abstract

The aim of this project is to play a tower stacking game using Lynxmotion's AL5D robot arm and a commonly available webcam.

The robot arm uses a general-purpose USB webcam to ascertain information such as the position, color, and distance of the blocks and dice. To save budget, I use a monocular depth estimation method to estimate distance instead of a depth camera.

Table of Contents

  1. INTRODUCTION
  2. PRELIMINARIES
    1. Hardware Description
    2. Robot Arm Structure
    3. Tower Game Rule
    4. Object detection and segmentation, depth estimation
    5. Model-Based Deep Reinforcement Learning
  3. PLAYING A BLOCK STACKING GAME USING AFFORDABLE MANIPULATOR AND CAMERA
  4. EXPERIMENTAL VALIDATION
  5. DISCUSSION
  6. CONCLUSION
  7. CODE FOR PAPER
  8. REFERENCES

INTRODUCTION

PRELIMINARIES

Hardware Description

  1. Lynxmotion AL5D 4 Degrees of Freedom Robotic Arm Combo Kit (BotBoarduino)

  2. Logitech Webcam

  3. Coogam Wooden Tower Stacking Game

  4. Lynxmotion PS2 Controller V4

Robot Arm Structure

Tower Game Rule

Object detection and segmentation

The robot must detect the blocks in the picture. Even when multiple blocks overlap, each block must be detected individually. Therefore, I decide to use Mask R-CNN, which provides instance segmentation in addition to detection.

Link to the dataset for Mask R-CNN

For this, I create a dataset by arranging the actual blocks, dice, and tower base plate at various angles and photographing them. Labeling is then performed with the VGG Image Annotator (VIA).
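A VIA project export can be converted into per-image polygon lists for Mask R-CNN training. The sketch below assumes the VIA 2.x JSON format (polygon regions with `all_points_x`/`all_points_y` under `shape_attributes`); the function name is illustrative, not part of the project code:

```python
import json

def load_via_polygons(via_json_path):
    """Parse a VIA 2.x project export into {filename: [polygon, ...]},
    where each polygon is a list of (x, y) vertex tuples."""
    with open(via_json_path) as f:
        annotations = json.load(f)
    polygons_per_image = {}
    # VIA keys each image record by "<filename><filesize>"; the record
    # itself carries the filename and the labeled regions.
    for record in annotations.values():
        regions = record["regions"]
        if isinstance(regions, dict):  # older VIA exports use a dict here
            regions = list(regions.values())
        polygons = []
        for region in regions:
            shape = region["shape_attributes"]
            if shape.get("name") != "polygon":
                continue  # skip non-polygon shapes (rect, circle, ...)
            polygons.append(list(zip(shape["all_points_x"],
                                     shape["all_points_y"])))
        polygons_per_image[record["filename"]] = polygons
    return polygons_per_image
```

These polygon vertex lists can then be rasterized into the binary instance masks that Mask R-CNN training expects.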

After creating the dataset, I train Mask R-CNN on it and confirm the prediction results.

After training and testing on single images, I apply Mask R-CNN to the video streamed from the webcam. Since the 2D information and the depth information of each object are used together as inputs to the robot agent's neural network, the outputs of Mask R-CNN and monocular depth estimation must be displayed on the same screen.
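One way to combine the two outputs per frame is to take, for each detected object, the 2D centroid of its mask and the median depth inside the mask. A minimal NumPy sketch, assuming Mask R-CNN yields boolean instance masks of shape (H, W, N) and the depth network yields an (H, W) depth map (the function name is illustrative):

```python
import numpy as np

def object_positions(masks, depth_map):
    """Combine instance masks (H, W, N bool array, one channel per
    object) with a per-pixel depth map (H, W) into a list of
    (cx, cy, depth) tuples: 2D mask centroid plus median depth."""
    positions = []
    for i in range(masks.shape[-1]):
        ys, xs = np.nonzero(masks[..., i])
        if xs.size == 0:
            continue  # empty mask: object not visible in this frame
        cx, cy = xs.mean(), ys.mean()
        # median is more robust than mean to noisy depth at mask edges
        depth = np.median(depth_map[ys, xs])
        positions.append((cx, cy, depth))
    return positions
```

The resulting (cx, cy, depth) tuples can be drawn as overlays on the streamed frame and fed to the agent's network as object state.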

Mask R-CNN test 2 Click to Watch!

Running the model while moving the camera through various angles shows that detection and segmentation are not always successful. Perhaps I need more training data.

Mask R-CNN test 3 Click to Watch!

In order to apply reinforcement learning, a reward must be given when the robot arm reaches a specific location. For this, the gripper of the robot arm must also be recognized by Mask R-CNN.
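Such a reward could be shaped from the pixel distance between the gripper centroid (taken from its Mask R-CNN mask) and the target location. The sketch below is only illustrative; the function name, scaling factor, and success radius are assumptions, not the project's actual reward:

```python
import math

def reach_reward(gripper_xy, target_xy, success_radius=10.0):
    """Shaped reaching reward: a dense term proportional to the negative
    pixel distance between the detected gripper centroid and the target,
    plus a sparse bonus when the gripper is within success_radius."""
    dist = math.dist(gripper_xy, target_xy)
    reward = -dist / 100.0          # dense shaping term
    if dist <= success_radius:
        reward += 1.0               # sparse success bonus
    return reward
```

The dense term guides exploration toward the target even before the first success, while the sparse bonus marks actually reaching the goal region.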

Mask R-CNN test 4 Click to Watch!

PLAYING A BLOCK STACKING GAME USING AFFORDABLE MANIPULATOR AND CAMERA

EXPERIMENTAL VALIDATION

DISCUSSION

CONCLUSION

CODE FOR PAPER

  1. Arduino code for BotBoarduino
  2. Python code for arm robot and webcam

REFERENCES

  1. M. Deisenroth, C. Rasmussen, and D. Fox, "Learning to Control a Low-Cost Manipulator using Data-Efficient Reinforcement Learning," Robotics: Science and Systems (RSS), 2011. doi:10.15607/RSS.2011.VII.008
  2. I. Alhashim and P. Wonka, "High Quality Monocular Depth Estimation via Transfer Learning," arXiv preprint arXiv:1812.11941, 2018.
  3. W. Abdulla, "Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow," GitHub repository, 2018.