There are currently various attempts to apply Deep Reinforcement to actual robot. The characteristics of Reinforcment Learning require over hundred of the initialization of the environment and many data for Deep Learning making it impossible to train the robot in the actual environment.

To overcome these difficulties, people first create a simulation that looks exactly like the real world. In this simulation, training is advanced for the virtual robot. Since then, many researches try to use the tranined model for an actual robot.

In this post, I’ll try to confirm these methods are actually working. The test uses a robot called Jetbot that has a GPU on its main board. And the environment is a soccer that is familiar to many people.

Code for that post can be found on the DeepSoccer Github.

Project Introduction Project Introduction!

Table of Contents

  1. Design of DeepSoccer
    1. Main Board
    2. OLED
    3. Jupyter notebook
    4. Wheel
    5. Roller
    6. Solenoid
    7. Lidar and Infrared
    8. ROS Packaging
    9. Teleoperation
  2. Environment of DeepSoccer
    1. Real and simulation environemnt
    2. Training robot on simulation environment
    3. Testing robot on real environment
  3. Deep Learning of DeepSoccer
    1. Deep Reinforcement Learning
    2. Floor Segmentation
    3. Generative_Adversarial_Network

Robot design

Unlike general driving robots, soccer robots need to move freely not only in front and behind, but also to the left and right. Furthermore, it is necessary to have the part for holding and kicking ball. In addition, embedeed board is needed which has a capacity for running 3 models of Deep Learning together.

I use the Onshape cloud 3D modeling service to create a model.

You can see and download my 3D model.

After making 3D model, I convert it to URDF format for Gazebo simulation using onshape-to-robot.

Main Board

The main board used in DeepSoccer is Jetson Xavier NX which can run three Deep Learning model at the same time unlike Jetson Nano which can handle only one model one time. However, Jetson Xavier NX is little expensive compared to Jetson Nano.

This board can run ROS, Python, and Tensorflow. Therefore, you can develop yout own application easily. It is also very convenient to use sensors and actuators as it supports almost all communication interfaces such as I2C, SPI, Digital input / output, CSI and USB.

  1. NVIDIA Jetson Xavier NX

Purchases can generally be made through Amazon. Unlike the Jetson Nano, which can be operated with three 18650 batteries, the Jetson Xavier NX requires four 18650 batteries to operate stably.

  1. 18650 Battery Holder 4 Slots
  2. Lithium Battery Protection Board
  3. Lithium Battery Voltage Indicator
  4. L298 Motor Driver


DeepSoccer has OLED display like an original Jetbot to monitor IP Address, memory usage without monitor connection.

  1. Waveshare OLED module

Jetson Xavier NX is connected to OLED module by using VDC, GND and SCL, SDA of 0 Channel I2C.

After connecting the hardware, download the Jetbot package from to Jetson Xaiver NX and install it using file. In this package, execute a python file( that displays the current information in OLED.

After that, try to register sevice to execut OLED file automatically when the board boot. First, move to /etc/systemd/system/ location of Ubuntu. Then, create a file named deepsoccer_stats.service with following contents.

Description=DeepSoccer stats display service
ExecStart=/bin/sh -c "python3 /home/kimbring2/jetbot/jetbot/apps/"

Then, register the file as a service and start it as shown below.

$ systemctl daemon-reload
$ systemctl enable deepsoccer_stats
$ systemctl start deepsoccer_stats

The registered service can be confirmed with the following command.

sudo systemctl status deepsoccer_stats

Jupyter Notebook

As with the original Jetbot, you can use a Jupyter Notebook to test a your code without connecting with monitor. First, insall Jupyter package and creates a configuration file using the following command.

$ pip3 install jupyterlab
$ jupyter notebook --generate-config

Next, open ipython and generate a hash to set the password.

$ ipython

In [1]: from IPython.lib import passwd

In [2]: passwd()
Enter password: 
Verify password: 
Out[2]: 'sha1:60f3ac9aec93:be2d6048e9b1e7ae0f1ccbad9d746734bf5c3797'

Next, record generated hash in the jupyter_notebook_config.json file created at previous step.

$ sudo vi ~/.jupyter/jupyter_notebook_config.json

c = get_config()
c.NotebookApp.ip = ''
c.NotebookApp.open_browser = False
c.NotebookApp.port = 8080
c.NotebookApp.password = 'sha1:60f3ac9aec93:be2d6048e9b1e7ae0f1ccbad9d746734bf5c3797'

Finally, start Jupyter Notebook with the command below and enter the password you set earlier.

$ jupyter notebook


The U2D2 board and U2D2 power hub can be purchased at the Robotis shopping mall. However, if you have existing one, you can change only the 12V power supply method and use the rest as it is.

  1. U2D2
  2. U2D2 Power Hub
  3. Dynamixel MX-12W
  4. Omniwheel(Korea local shop)

ID and communication method and firmware version of of Dynamixel can given via a program provided by ROBOTIS. I test a motor seperately before installing it to robot body. A power supply and TTL communication can be done by using a U2D2 board and power hub.

After confirming only the operation of the motor separately, the motor and the control, power board are combined to the robot body.

In the case of dynamixel, the initial mode is the joint mode. Mode is needed to be changed to the wheel mode for using soccer robot. This can be achieved by setting the CCW Angle Limit of motor to 0. To rotate the motor set in the wheel mode at a specific speed, you just need to give a specific value to Moving Speed

Next, I test adjusting the speed of dynamixel using rostopic, as in the previous Jetbot.

Dynamixel test Click to Watch!

The DeepSoccer uses an omniwheel that has a many sub wheel. In order to properly simulate this with Gazebo, we must make sure that each sub wheel rotates correctly. First, I check it using RViz in the same way as a main wheel.

Omniwheel RVIz test Click to Watch!

After completing the test with RVIz, the test is similarly performed with Gazebo. It is confirmed that when the friction with the floor is large, the phenomenon that the sub wheel do not rotate properly is occurred. Finding the optimal friction parameters will be an important task.

Omniwheel Gazebo test Click to Watch!


The mechanism for controlling the ball is composed of a rubber roller for fixing and a solenoid electromagnet for kicking.

  1. Engraving rubber roller(Made in Korea)

For the part for grabiing the soccer ball, I use a part of engraving roller. The core of the roller is made by 3D printer and connected to a DC motor which is included in origin Jetbot.

Roller standalone test Click to Watch!

After completing the operation test with the roller alone, it is mounted on the of the robot second body layer. Since then, I test whether robot can actually hold the ball.

Roller installed test Click to Watch!


For the part for kicking the soccer ball, I use a solenoid electromagnet. To control the solenoid electromagnet in the ROS, we should control GPIO using code. I use a GPIO library of Jetson ( provided by NVIDIA.

It is determined that directly connecting the solenoid motor directly to the 12V power supply can not enough force to kick the ball far. Thus, large capacity capacitor and charging circuit for it is added. Thankfully I could find circuit and component for this at

  1. Large Capacity Capacitor
  2. Large Capacitor Charger
  3. Limit switch(Relay module can replace it)

After a 250v 1000uf capacitor and a ±45V-390V capacitor charger are added, a solenoid can push a heavy a billiard ball to considerable distance.

Solenoid test 5 Click to Watch!

Because the wheel and roller part have all simple structure, they can be implemented using the default controller of Gazebo. However. the solenoid part need a custom controller since it has spring part. You can find a official tutorial for that at

namespace gazebo
	class SolenoidElectromagnetSpringPlugin : public ModelPlugin
		public: SolenoidElectromagnetSpringPlugin() {}     

public: void OnRosMsg(const std_msgs::Float32ConstPtr &_msg)
	std::cout << "_msg->data: " << _msg->data << std::endl;
	for (int i = 0; i < 10; i++)
	    this->joint->SetForce(0, 1000 * float(_msg->data));

protected: void OnUpdate()
	double current_angle = this->joint->Position(0);
	for (int i = 0; i < 5; i++)
		this->joint->SetForce(0, (this->kx * (this->setPoint - current_angle)));

After testing the tutorial plugin first, add OnRosMsg, OnUpdate function to ModelPlugin. The OnRosMsg function is for receiving message transmitted to rostopic which include force applied to the solenoid by battery. The OnUpdate function is a part for defining a force applied by the spring. You can download plugin file from Solenoid Electromagnet Plugin

  <plugin name="stick_solenoid_electromagnet_joint_spring" filename="">

Adding the above part to deepsoccer.gazebo after defining the plugin can make it possible to use a custom pluging in that joint.

Lidar and Infrared

Soccer robot need to check a obstacle of front side. Using only camera sensor is not enough for that. Thus, I decide adding lidar sensor.

The lidar sensor I use can measure only on the front side. It requires 5V, GND and UART GPIO pins of Jetson Nano.

lidar test 1 Click to Watch!

After checking operation of lidar sensor at simulation. I also check real lidar sensor can measure distance of front obstacle.

lidar test 2 Click to Watch!

Robot need know it is holding holding ball now because camera can not see a lower part of robot. In order to solve these problems, I decide to add an infrared obstacle detection sensor at side of the roller that holds the ball.

The infrared sensor uses GPIO as an input, as opposed to the solenoid which used GPIO as an output. If an obstacle is detected, the sensor gives a different signal to Jetson Nano.

Infrared test Click to Watch!

After mounting the infrared sensor in real robot, I also confirm that the same operation can be performed in infrared sensor of Gazebo simulation.


It makes sense that the default motions performed in Gazebo simulations can be done in a real robot also. In order to confirm tat, the motion control function of original version of Jetbot using the gamepad is slightly modified for Jetbot soccer version.

You probably remember the gamepad UI that is in orginal teleoperation.ipynb file. Only 0, 1 scroolbal is used for controlling the left and right wheel in that file. In my case, I set it as above for controlling roller, stick, omniwheel.

from dynamixel_sdk import *                    # Uses Dynamixel SDK library

def omniwheel_move(motor_id, speed):
    # Control table address
    ADDR_MX_TORQUE_ENABLE      = 24               # Control table address is different in Dynamixel model
    ADDR_MX_MOVING_SPEED       = 32

    # Protocol version
    PROTOCOL_VERSION            = 1.0               # See which protocol version is used in the Dynamixel
    # Default setting
    DXL_ID = motor_id
    #DXL_ID                      = 1                # Dynamixel ID : 1
    BAUDRATE                    = 57600             # Dynamixel default baudrate : 57600
    DEVICENAME                  = '/dev/ttyUSB0'    # Check which port is being used on your controller
                                                    # ex) Windows: "COM1"   Linux: "/dev/ttyUSB0" Mac: "/dev/tty.usbserial-*"

    TORQUE_ENABLE               = 1                 # Value for enabling the torque
    TORQUE_DISABLE              = 0                 # Value for disabling the torque
    DXL_MINIMUM_POSITION_VALUE  = 10           # Dynamixel will rotate between this value
    DXL_MAXIMUM_POSITION_VALUE  = 4000            # and this value (note that the Dynamixel would not move when the position value is out of movable range. Check e-manual about the range of the Dynamixel you use.)
    DXL_MOVING_STATUS_THRESHOLD = 20                # Dynamixel moving status threshold

    index = 0
    dxl_goal_position = [DXL_MINIMUM_POSITION_VALUE, DXL_MAXIMUM_POSITION_VALUE]         # Goal position

    # Initialize PortHandler instance
    # Set the port path
    # Get methods and members of PortHandlerLinux or PortHandlerWindows
    portHandler = PortHandler(DEVICENAME)

    # Initialize PacketHandler instance
    # Set the protocol version
    # Get methods and members of Protocol1PacketHandler or Protocol2PacketHandler
    packetHandler = PacketHandler(PROTOCOL_VERSION)

    # Open port
    if portHandler.openPort():
        #print("Succeeded to open the port")
        print("Failed to open the port")
        print("Press any key to terminate...")

    # Set port baudrate
    if portHandler.setBaudRate(BAUDRATE):
        #print("Succeeded to change the baudrate")
        print("Failed to change the baudrate")
        print("Press any key to terminate...")

    # Enable Dynamixel Torque
    #dxl_comm_result, dxl_error = packetHandler.write1ByteTxRx(portHandler, DXL_ID, ADDR_MX_TORQUE_ENABLE, TORQUE_ENABLE)
    dxl_comm_result, dxl_error = packetHandler.write2ByteTxRx(portHandler, DXL_ID, ADDR_MX_MOVING_SPEED, speed)
    #print("dxl_comm_result: " + str(dxl_comm_result))
    if dxl_comm_result != COMM_SUCCESS:
        print("%s" % packetHandler.getTxRxResult(dxl_comm_result))
    elif dxl_error != 0:
        print("%s" % packetHandler.getRxPacketError(dxl_error))
        #print("Dynamixel has been successfully connected")
# Stop
def stop():
import traitlets
from traitlets.config.configurable import Configurable

class OmniWheelStop(Configurable):
    value = traitlets.Float()

    def __init__(self, *args, **kwargs):
        super(OmniWheelStop, self).__init__(*args, **kwargs)  # initializes traitlets
    def _observe_value(self, change):

    def _write_value(self, value):
        """Sets motor value between [-1, 1]"""

    def _release(self):
        """Stops motor by releasing control"""
import RPi.GPIO as GPIO
import time

def solenoid_active():
    # Pin Definitions
    output_pin = 18  # BCM pin 18, BOARD pin 12
    # Pin Setup:
    GPIO.setmode(GPIO.BCM)  # BCM pin-numbering scheme from Raspberry Pi
    # set pin as an output pin with optional initial state of HIGH
    GPIO.setup(output_pin, GPIO.OUT, initial=GPIO.HIGH)

    print("Starting demo now! Press CTRL+C to exit")
    curr_value = GPIO.HIGH
    print("Outputting {} to pin {}".format(curr_value, output_pin))
    GPIO.output(output_pin, curr_value)
    #curr_value ^= GPIO.HIGH

class SolenoidKick(Configurable):
    value = traitlets.Float()

    def __init__(self, *args, **kwargs):
        super(SolenoidKick, self).__init__(*args, **kwargs)  # initializes traitlets
    def _observe_value(self, change):

    def _write_value(self, value):
        """Sets motor value between [-1, 1]"""

    def _release(self):
        """Stops motor by releasing control"""
from jetbot import Robot
import traitlets

robot = Robot()

solenoid_link = traitlets.dlink((controller.buttons[4], 'value'), (solenoid_kick, 'value'), transform=lambda x: -x)

roller_link = traitlets.dlink((controller.axes[1], 'value'), (robot.right_motor, 'value'), transform=lambda x: -x)

stop_link = traitlets.dlink((controller.buttons[5], 'value'), (omniwheel_stop, 'value'), transform=lambda x: -x)
forward_link = traitlets.dlink((controller.buttons[0], 'value'), (omniwheel_forward, 'value'), transform=lambda x: -x)
left_link = traitlets.dlink((controller.buttons[1], 'value'), (omniwheel_left, 'value'), transform=lambda x: -x)
right_link = traitlets.dlink((controller.buttons[2], 'value'), (omniwheel_right, 'value'), transform=lambda x: -x)
back_link = traitlets.dlink((controller.buttons[3], 'value'), (omniwheel_back, 'value'), transform=lambda x: -x)

Like the original method, using the gamepad requires combining Traitlets package of python and control code of hardware. Above code is a sample for I use for omniwheel control. First, Dynamixel SDK is used for creating function that gives direction and speed to each omniwheel, and then use these function to create function for the robot to move forward, turn left, turn right, reverse, and stop. Then, using this function, a configurable class of traitlets is created, and the class declared at the end is connected to the gamepad UI using the dlink function of Traitlets.

Teleoperation test 1 Click to Watch!

As you can see in the video, you can see the basic movement and ball dribbling are working well. However, kicking power is weak. It is considered that voltage , current are insufficient because battery and solenoid are directly connected. To solve these problem, I decide to use the method of collecting the high voltage using a large capacitor and then releasing it to the solenoid at once.

Teleoperation test 2 Click to Watch!

I test again basic soccer function after adding a large capacitor. It seems like that the kick function is reinforced and it can kick a balls to a long distance than before.

Environment of DeepSoccer

Real and simulation environemnt

A large amount of data is required to train a robot through the Deep Learning method. However, it is not easy to obtain this in a real environment due to physical restrictions. For that reason, this study trains a robot in a virtual environment to play soccer game and then uses that skill in a real environment. However, the skill learned in the virtual environment can not be immediately used in the real environment because there are differences between the virtual and the real environment. In order to solve these problems, we apply a method of generating a virtual environment image from input image of the real environment.

Training robot on simulation environment

Most Deep Reinforcement Learning researchers are accustomed to Gym environment of OpenAI. There is package called openai_ros that allows user use a custom robot environment in the form of Gym.

DeepSoccer also provides a package for use a it as Gym format. That package is based on the my_turtlebot2_training tutorial. I recommend you first running a tutorial package before doing DeepSoccer package.

After finishing TurtleBot2 tutorial, you need to add two file to each folder named task_envs, robot_envs of openai_ros package. One file is and other file is Finally, you should add DeepSoccer environment in like below.

    elif task_env == 'MyDeepSoccerSingleTest-v0':

        # import our training environment
        from openai_ros.task_envs.deepsoccer import deepsoccer_single_test

You can download my own openai_ros package including that files.

Second, download a my_deepsoccer_training pacakge. After that, copy it to the src folder under ROS workspace like a Jetbot package and build it.

Deep Learning of DeepSoccer

Deep Reinforcement Learning

After installing the my_deepsoccer_training package, you can use DeepSoccer with the following Gym shape. The basic actions and observations are the same as described in the Jetbot soccer section. Action is an integer from 0 to 6, indicating STOP, FORWARD, LEFT, RIGHT, BACKWARD, HOLD, and KICK, respectively. Observations are image frame from camera, robot coordinates, and lidar sensor value.

After making DeepSoccer in Openai Gym format, you can use it for training robot by Deep Reinforcement Learning. Currently, the most commonly used Deep Reinforcement Learning algorithms like PPO are good when the action of the agent is relatively simple. However, DeepSoccer agent has to deal with soccer ball very delicately. Thus, I assume that PPO alorithm do not work well in this project. For that reason, I decide to use a one of Deep Reinforcement Learning method “Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations”, which operates in the complex environment like a soccer, by mixing trained agent data and expert demonstration data.

The code related to this algorithm is be located at ForgER folder.

Generative Adversarial Network

Unlike humans, robots cannot respond appropriately to environment that is different from the simulation environment. Therefore, the real world information must be converted to the simulation environment. Recently, there are several ways to apply deep learning to these Sim2Real. One of method is using Neural Style Transfer and another is applying CycleGAN. I apply both of methods to DeepSoccer and check it is working properly.

I use a code of for Neural Style Transfer. The advantage of this method is that you only need one conversion target style image without collecting train images separately, but this method does not completely convert to simulation.

Result of Neural Style Transfer at DeepSoccer

You can train your own model using code of that repo and real world image. Altenatively, you can also use the pretrained model of the DeepSoccer Gazebo simulation image.

import cv2
import numpy as np
import tensorflow as tf

imported_style = tf.saved_model.load("/home/[your Jetson Nano user name]/style_model")
f_style = imported_style.signatures["serving_default"]
style_test_input = np.zeros([1,256,256,3])
style_test_tensor = tf.convert_to_tensor(style_test_input, dtype=tf.float32)

def gstreamer_pipeline(
    return (
        "nvarguscamerasrc ! "
        "video/x-raw(memory:NVMM), "
        "width=(int)%d, height=(int)%d, "
        "format=(string)NV12, framerate=(fraction)%d/1 ! "
        "nvvidconv flip-method=%d ! "
        "video/x-raw, width=(int)%d, height=(int)%d, format=(string)BGRx ! "
        "videoconvert ! "
        "video/x-raw, format=(string)BGR ! appsink"
        % (

cap = cv2.VideoCapture(gstreamer_pipeline(flip_method=0), cv2.CAP_GSTREAMER)
if cap.isOpened() != 1:
    ret, frame =
    img = cv2.resize(frame, (256, 256), interpolation=cv2.INTER_AREA)
    img = cv2.normalize(img, None, 0, 1, cv2.NORM_MINMAX, cv2.CV_32F)
    resized = np.array([img])
    input_tensor = tf.convert_to_tensor(resized, dtype=tf.float32)
    output_style = f_style(input_tensor)['output_1'].numpy()

    cv2.imwrite("output_style.jpg", output_style)

You can save the pretrain model to your Jetson Nano and use the above code to try to run Neural Style Transfer.

The method using CycleGAN trains a model by dataset of real and simulation world. I find a code for that at method of official Tensorflow website.

DeepSoccer cyclegan test Click to Watch!

Floor Segmentation

As can be seen in the real world dataset, there are many objects in the background of the experimental site such as chair, and umbrella. If I train the CycleGAN model with the simulation world dataset without removing background objects, I am able to see the problem of the chair turning into goalpost.

Wrong generation of CycleGAN at DeepSoccer

In order to solve this problem, I first decide that it is necessary to delete all objects except the goal, goalpost, and floor that the robot should recognize to play soccer. Segmentation using classic OpenCV method do not work. On the other hand, Deep Learning model using the ADE20K dataset can segregate object well. Robot do not have to separate all the object in the dataset. Thus, I simplify the ADE20K dataset a bit like a below.

Simplified ADE20K image and mask

You can check code for segmentation. Furthermore, I confirm that the model trained using modified ADE20K dataset can distinguishe the wall and floor of camera image of real robot.

You can train your own model using code of that repo and simplified image. Altenatively, you can also use the pretrained model of mine and below code.

import cv2
import numpy as np
import tensorflow as tf

imported_seg = tf.saved_model.load("/home/[your Jetson Nano user name]/segmentation_model_official")
f_seg = imported_seg.signatures["serving_default"]
seg_test_input = np.zeros([1,256,256,3])
seg_test_tensor = tf.convert_to_tensor(seg_test_input, dtype=tf.float32)

def gstreamer_pipeline(
    return (
        "nvarguscamerasrc ! "
        "video/x-raw(memory:NVMM), "
        "width=(int)%d, height=(int)%d, "
        "format=(string)NV12, framerate=(fraction)%d/1 ! "
        "nvvidconv flip-method=%d ! "
        "video/x-raw, width=(int)%d, height=(int)%d, format=(string)BGRx ! "
        "videoconvert ! "
        "video/x-raw, format=(string)BGR ! appsink"
        % (

cap = cv2.VideoCapture(gstreamer_pipeline(flip_method=0), cv2.CAP_GSTREAMER)
if cap.isOpened() != 1:
    ret, frame =
    img = cv2.resize(frame, (256, 256), interpolation=cv2.INTER_AREA)
    img = cv2.normalize(img, None, 0, 1, cv2.NORM_MINMAX, cv2.CV_32F)
    resized = np.array([img])
    input_tensor = tf.convert_to_tensor(resized, dtype=tf.float32)
    output_seg = f_seg(input_tensor)['conv2d_transpose_4'].numpy()

    cv2.imwrite("output_seg.jpg", output_seg)

The floor have to be distinguished by the Deep Learning. However, the goal, goalpost have the primary colors such as green, and red. Thus, they can be found through the classic HSV conversion of OpenCV. You can see the original video and the result of applying each method in the video below at once.

DeepSoccer segmentation test Click to Watch!

you can try to make your own segmented image set for CycleGAN or get of mine from segmented image.