Hardware acceleration can revolutionize robotics, enabling faster and more power-efficient robots. However, the diversity of acceleration options makes it difficult for roboticists to deploy accelerated systems without expertise in each specific hardware platform, especially when mixing them with ROS 2. In our recent talk at ROSCon 2022, we presented the architectural pillars and conventions required to introduce hardware acceleration in ROS 2 in a scalable and technology-agnostic manner: the ROBOTCORE Framework (paper). This article summarizes our talk and experience.


Talk outline:


Hardware acceleration can revolutionize robotics, enabling faster and more power-efficient robots. Through FPGAs, GPUs and other accelerators, ROS 2 developers can build acceleration kernels (or robot cores) that outperform the traditional CPU-centric dataflows in computational graphs by 2x, 10x or even up to 500x. However, the diversity of acceleration options makes it difficult for roboticists to deploy accelerated systems without expertise in each specific hardware platform. In this talk we’ll present ROBOTCORE Framework, an open architecture for hardware acceleration in ROS 2 that makes it easy to switch between accelerators from various vendors through simple build-time flags.

An open architecture for Hardware Acceleration in ROS 2, the ROBOTCORE Framework

Motivation for hardware acceleration in robotics

Moving faster (or with more dexterity) requires faster perception computations. Atlas uses perception to identify, navigate and jump around obstacles. Edge perception is key to navigating environments that change, as most environments in the human world do. Atlas leverages a time-of-flight depth camera to generate point clouds of the environment at 15 Hz. The point cloud is a large and dense collection of range measurements that needs to be processed at the edge (directly on the robot) for smooth behaviors, minimizing latency all the way from the sensors to the actuators. By leveraging hardware acceleration, Atlas’ perception extracts surfaces from this point cloud, which are then used to plan actions on the order of tenths of milliseconds. All of this happens on the edge.

One of the distinctive features of Boston Dynamics is the use of hardware acceleration. Hardware powers these dexterous movements, from sensing through perception and all the way to actuation.

Boston Dynamics' Atlas perception, which uses a time-of-flight depth camera to generate point clouds of the environment at 15 Hz.

Accelerators for robotics applications

Roboticists have various accelerators available to speed up their ROS computational graphs. The most common ones are depicted in the figure below using an industrial metaphor. There are others, and we'll be tackling them in future work.

There's no single compute substrate that fits all applications, which is why accommodating hardware acceleration in the ROS ecosystem in a simple manner becomes relevant.

ROS 2 Hardware Acceleration Working Group

Using hardware acceleration requires a change in the way we think about robotics software development and the way we design and architect robotic systems. That’s why we created the ROS 2 Hardware Acceleration Working Group (HAWG), to drive the creation, maintenance and testing of acceleration kernels on top of open standards for optimized ROS 2 and Gazebo interactions over different compute substrates (including FPGAs, GPUs and other accelerators).

The computational graph of the figure below depicts the objective of the working group: accelerate the robotics dataflow in both nodes and graphs.

The Hardware Acceleration Working Group meets every month and brings together hundreds of hardware experts from all around the world.

The ROS 2 Hardware Acceleration Working Group drives the creation, maintenance and testing of acceleration kernels on top of open standards for optimized ROS 2 and Gazebo interactions over different compute substrates (including FPGAs, GPUs and other accelerators).

ROBOTCORE Framework

The ROBOTCORE Framework is a hardware acceleration framework for ROS that helps build custom compute architectures for robots through acceleration kernels, or robot cores, that make robots faster, more deterministic and power-efficient. Simply put, it provides a development, build and deployment experience for creating robot hardware and hardware acceleration kernels similar to the standard, non-accelerated ROS development flow.

It's an open architecture for hardware acceleration in ROS 2.

The framework introduces four major contributions into ROS 2:

  • build system (ament) extensions
  • build tools (colcon) extensions
  • a firmware layer to support multiple accelerators
  • a methodology for tracing and benchmarking accelerators fairly without relying on third-party tools
ROBOTCORE Framework
Public community-driven and open source implementation available at https://github.com/ros-acceleration

ROBOTCORE Framework integrates with ROS 2 through a series of contributions that meet and enhance community standards. The following subsections describe the major additions:

ROBOTCORE Framework: Production-grade multi-platform ROS support with Yocto (REP-2000)

Instead of relying on common development-oriented Linux distros (such as Ubuntu), our contributions to Yocto allow you to build a customized Linux system with ROS for your use case, providing unmatched granularity, performance and security.

With ROBOTCORE Framework, we've contributed to REP-2000 by providing a port of the ROS 2 Yocto recipes for Humble (link to PR).

Production-grade multi-platform ROS support with Yocto (REP 2000 contribution ➦)

ROBOTCORE Framework: ROS 2 Hardware Acceleration Architecture and Conventions (REP-2008)

We are bringing the lessons learned while developing ROBOTCORE Framework into a REP (ROS Enhancement Proposal, community standards) that describes the architectural pillars and conventions required to introduce hardware acceleration in ROS 2 in a vendor-neutral, scalable and technology-agnostic manner. All while maintaining the common ROS development flow.

The purpose of this REP is therefore to provide standard guidelines on how to use hardware acceleration in combination with ROS 2. This REP does not aim to dictate policy on which frameworks or languages shall be used to create acceleration kernels. This decision is left to the architect building the acceleration. For example, the use of OpenCL, CUDA, Vulkan, HLS, Halide, Exo, etc. is beyond the scope of this REP.

The value for stakeholders is three-fold:

  1. Package maintainers can use these guidelines to integrate hardware acceleration capabilities in their ROS 2 packages in an accelerator-agnostic manner.
  2. Consumers (ROS developers) can use the guidelines in the REP to set expectations on the hardware acceleration capabilities, ease of use and scalability that could be obtained from each vendor's hardware acceleration solution.
  3. Silicon vendors and hardware solution manufacturers can use these guidelines to connect their accelerators and firmware (including frameworks, tools and libraries) to the ROS 2 ecosystem while maintaining ROS 2 API-compatibility of Components and Nodes. By doing so, they will obtain direct support for hardware acceleration in all ROS 2 packages that support it.
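
As a rough illustration of the accelerator-agnostic idea behind these guidelines, the Python sketch below keeps a single public function and picks the implementation from a flag. Everything in it is hypothetical: the `ACCEL_TARGET` variable, the registry and the backend names are invented for this example and are not part of REP-2008 or ROBOTCORE Framework. A run-time environment variable stands in here for the build-time flags, which in a real ROS 2 package would be handled by the ament/colcon build.

```python
import os
from typing import Callable, Dict, List

Image = List[List[int]]
Kernel = Callable[[Image, int], Image]

# Hypothetical registry mapping a target name to an implementation.
_BACKENDS: Dict[str, Kernel] = {}


def register(target: str) -> Callable[[Kernel], Kernel]:
    """Decorator that records a kernel implementation under a target name."""
    def wrap(fn: Kernel) -> Kernel:
        _BACKENDS[target] = fn
        return fn
    return wrap


@register("cpu")
def _resize_cpu(img: Image, factor: int) -> Image:
    # Reference CPU path: keep every `factor`-th pixel in each dimension.
    return [row[::factor] for row in img[::factor]]


@register("fpga")
def _resize_fpga(img: Image, factor: int) -> Image:
    # Placeholder: a real backend would offload to an accelerator kernel.
    # It reuses the CPU path here so that the sketch stays runnable.
    return _resize_cpu(img, factor)


def resize(img: Image, factor: int) -> Image:
    """Public API: callers never name the accelerator explicitly."""
    target = os.environ.get("ACCEL_TARGET", "cpu")  # hypothetical flag
    try:
        kernel = _BACKENDS[target]
    except KeyError:
        raise ValueError(f"unknown acceleration target: {target!r}") from None
    return kernel(img, factor)


if __name__ == "__main__":
    image = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(resize(image, 2))  # [[0, 2], [8, 10]]
```

The point of the design is that code calling `resize` does not change when the target does, which mirrors the API-compatibility goal described for Components and Nodes.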

ROBOTCORE Framework: Benchmarking performance in ROS 2 (REP-2014)

Benchmarking is the act of running a computer program with a known workload to assess the program's relative performance. For robotics, we adopt a grey-box and non-functional benchmarking approach for hardware acceleration with a low-overhead tracing and benchmarking framework, and select the Linux Tracing Toolkit next generation (LTTng) to implement it.

In the context of ROS 2, performance information can help roboticists design more efficient robotic systems and select the right hardware for their robotic application. It can also help understand the trade-offs between different algorithms that implement the same capability, and help them choose the best approach for their use case. With REP-2014, we're proposing a standardized approach to do performance benchmarking in ROS 2 for hardware acceleration which allows comparing accelerators fairly, as well as acceleration kernels across robotics workloads.
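
To make the grey-box idea more concrete, here is a minimal Python sketch in which probes placed inside the code emit begin/end events, and latencies are derived afterwards from the recorded trace. It is only an in-memory stand-in: it does not use LTTng or any ROS 2 tracing API, and the names `probe`, `latencies_ms`, `rectify` and `rectify_cb` are made up for this example.

```python
import time
from contextlib import contextmanager
from statistics import mean
from typing import Dict, Iterator, List, Tuple

# In-memory trace: (event name, timestamp in nanoseconds).
Trace = List[Tuple[str, int]]


@contextmanager
def probe(trace: Trace, name: str) -> Iterator[None]:
    """Emit begin/end events around a region (stand-in for tracepoints)."""
    trace.append((f"{name}:begin", time.perf_counter_ns()))
    try:
        yield
    finally:
        trace.append((f"{name}:end", time.perf_counter_ns()))


def latencies_ms(trace: Trace) -> Dict[str, List[float]]:
    """Pair begin/end events per region and return durations in ms."""
    open_at: Dict[str, int] = {}
    out: Dict[str, List[float]] = {}
    for event, stamp in trace:
        name, _, kind = event.rpartition(":")
        if kind == "begin":
            open_at[name] = stamp
        elif kind == "end" and name in open_at:
            out.setdefault(name, []).append((stamp - open_at.pop(name)) / 1e6)
    return out


def rectify(pixels: List[int]) -> List[int]:
    # Toy workload standing in for a perception node's computation.
    return [min(255, p + 1) for p in pixels]


if __name__ == "__main__":
    trace: Trace = []
    frame = list(range(256)) * 100  # known, fixed workload
    for _ in range(5):
        with probe(trace, "rectify_cb"):
            rectify(frame)
    for region, samples in latencies_ms(trace).items():
        print(f"{region}: mean {mean(samples):.3f} ms over {len(samples)} runs")
```

Separating event collection from analysis keeps the probes cheap, which is the property a low-overhead tracer is chosen for.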


Case Study: ROS 2 Perception graph

To demonstrate the capabilities of ROBOTCORE Framework, we presented an acceleration kernel built with it for robotics perception called ROBOTCORE Perception.

ROBOTCORE Perception is an optimized robotic perception stack built with ROBOTCORE Framework that leverages hardware acceleration to speed up your perception computations. API-compatible with the ROS 2 perception stack, ROBOTCORE Perception brings high performance, real-time behavior and reliability to your robots' perception.


Case Study: ROS 2 Perception Nodes

We now present measurements collected for individual ROS 2 Nodes across various accelerators. In particular, we measure the kernel runtime latency in milliseconds (ms). The measurements discard the overhead of the ROS 2 message-passing infrastructure and of host-device (GPU or FPGA) data transfers, so that the computations themselves are more comparable.
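
The following Python sketch shows one way such a comparison could be computed once each run has been split into phases. The `Sample` structure and its field names are assumptions made for this example, and the numbers are placeholders chosen for easy arithmetic, not measurements from the talk.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Sample:
    """One run of a node, split into phases (all values in ms)."""
    ros_overhead_ms: float  # message-passing infrastructure
    transfer_ms: float      # host-device copies (0 for CPU-only runs)
    kernel_ms: float        # the computation itself


def kernel_latency_ms(samples: List[Sample]) -> float:
    """Mean kernel runtime, ignoring ROS 2 and transfer overhead."""
    return mean(s.kernel_ms for s in samples)


def speedup(baseline: List[Sample], accelerated: List[Sample]) -> float:
    """Ratio of baseline to accelerated kernel runtime (>1 means faster)."""
    return kernel_latency_ms(baseline) / kernel_latency_ms(accelerated)


if __name__ == "__main__":
    # Placeholder values for illustration only, not measurements.
    cpu = [Sample(0.5, 0.0, 8.0), Sample(0.5, 0.0, 12.0)]
    fpga = [Sample(0.5, 1.0, 2.0), Sample(0.5, 1.0, 3.0)]
    print(kernel_latency_ms(cpu))  # 10.0
    print(speedup(cpu, fpga))      # 4.0
```

Keeping the discarded phases in the record makes it possible to report end-to-end latency later as well, where transfer overhead can change the picture.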

Benchmarking ROS 2 Perception Nodes.

2022 Hardware Acceleration Report in Robotics

Alongside all of this, we also presented the 2022 Hardware Acceleration Report in Robotics, which captures the state of the art of hardware acceleration in robotics through a quantitative approach and gives robotic architects a resource to consider while designing their robot computational architectures.

The report summarizes two major aspects:

  • a community survey, conducted in both the ROS and the overall robotics communities, that helped gauge the interest in using hardware acceleration in robotics
  • a hardware acceleration benchmarking effort, the second phase, driven by the input from this community survey
2022 Hardware Acceleration Report in Robotics

Ongoing work on hardware acceleration in robotics

Lastly, we gave a sneak peek of various ongoing projects related to hardware acceleration in robotics:

Future robot IP cores

ROS 2 API compatible robot Intellectual Property (IP) cores.

Tools to speed up ROS 2 graphs with the cloud, and in the cloud

ROBOTCORE Cloud, Tools to accelerate your robotic computations with/in the Cloud

Robotics MCU

A robotics microcontroller unit (MCU) powered by RISC-V and ROS 2:

Robotics MCU project ticket

ROBOTCORE, the Robotic Processing Unit specialized in ROS computations

ROBOTCORE® is a robot-specific processing unit that helps map Robot Operating System (ROS) computational graphs to its CPUs, GPU and FPGA efficiently to obtain the best performance. It empowers robots with the ability to react faster, consume less power, and deliver additional real-time capabilities.

ROBOTCORE, the Robotic Processing Unit specialized in ROS computations

Final thoughts

Our session drew hundreds of attendees. With more than 700 participants who flew into Kyoto from all over the world, and with a livelier business atmosphere than in any previous year, ROSCon 2022 was a complete success. Acceleration Robotics is a proud sponsor of ROSCon. Our sincerest thanks to Open Robotics for coordinating, organizing and leading us all to make ROSCon a success again.