ROS 2 Hardware Acceleration Working Group

ROS 2 Hardware Acceleration Working Group - meeting #7

Source: ROS Discourse, March 31, 2022 [1]
We discuss the group's progress over the last month, showing how ROS 2 perception pipelines can run on GPUs and FPGAs, contributing firmware layers for NVIDIA boards, and comparing results with other hardware accelerators. We also review the methodology for ROS 2 Hardware Acceleration and REP-2008, and then discuss a new subproject of the group: the Robotic Processing Unit (RPU). The session also includes a guest talk by Hyunjong Choi, who explains that although ROS 2 claims to enhance real-time capability, ensuring predictable end-to-end chain latency remains a challenging problem. To address it, their group proposes a new priority-driven chain-aware scheduler for ROS 2, wherein callbacks are prioritized based on the timing requirements of their corresponding chains, so that the end-to-end latency of critical chains can be improved with a predictable bound.
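The chain-aware idea can be sketched with a toy executor that pops ready callbacks in chain-priority order rather than registration order. This is a hypothetical plain-Python illustration (the executor class, priority values, and callback names are invented here), not the authors' actual ROS 2 executor implementation:

```python
import heapq

class ChainAwareExecutor:
    """Toy executor: runs ready callbacks in chain-priority order
    (lower number = more critical chain), not registration order."""

    def __init__(self):
        self._ready = []  # min-heap of (priority, seq, callback)
        self._seq = 0     # FIFO tie-breaker within one priority level

    def make_ready(self, priority, callback):
        heapq.heappush(self._ready, (priority, self._seq, callback))
        self._seq += 1

    def spin(self):
        while self._ready:
            _, _, callback = heapq.heappop(self._ready)
            callback()

trace = []
executor = ChainAwareExecutor()
# Three callbacks become ready at once; the critical control chain
# runs first regardless of registration order.
executor.make_ready(2, lambda: trace.append("logging"))
executor.make_ready(0, lambda: trace.append("control"))
executor.make_ready(1, lambda: trace.append("perception"))
executor.spin()
print(trace)  # ['control', 'perception', 'logging']
```

The key point of the talk's proposal is exactly this ordering decision: picking the next ready callback by the criticality of its chain, so critical chains get a predictable latency bound.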


Hardware accelerated ROS 2 pipelines

Source: ROS Discourse, March 29, 2022 [2][3]
The ROS 2 Hardware Acceleration WG has been driving the creation of open hardware acceleration solutions for ROS 2. We proposed an open architecture and conventions for hardware acceleration that extend the ROS 2 core layers, demonstrated 5-10x speedups for individual ROS Nodes and Components, and recently showed how to accelerate ROS 2 perception pipelines, increasing perception throughput with simple graphs.

Towards the Robotic Processing Unit (RPU)

Source: ROS Discourse, March 29, 2022 [4][3:1]
The Robotic Processing Unit (RPU) is a robot-specific processor that maps ROS computational graphs efficiently to CPUs, FPGAs, and GPUs to obtain the best performance. The vision is that RPUs will empower robots to react faster, consume less power, and deliver additional real-time capabilities.

Arm sees support path to heterogeneous compute

Source: The Register, March 29, 2022 [5]
Arm says heterogeneous compute architectures – those with a mix of CPUs, GPUs, DPUs, and other processor types – pose a challenge for software developers, and greater multi-architecture support is needed to address this.

Researchers From Tsinghua University Propose ‘Stochastic Scheduled SAM’ (SS-SAM): A Novel And Cost-Efficient Training Scheme For Deep Neural Networks

Source: MarkTechPost, March 27, 2022 [6]
Deep Neural Networks (DNNs) have excelled at solving complex real-world problems. Tsinghua University's research team proposes a novel and effective DNN training strategy. In the SS-SAM method, the optimizer runs a Bernoulli trial at each step, using a scheduling function to decide whether to perform a standard SGD update or a Sharpness-Aware Minimization (SAM) update.
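A minimal sketch of such a step, under invented assumptions: a one-dimensional toy loss, a linear schedule, and illustrative hyperparameters, none of which are the paper's actual settings:

```python
import random

def grad(w):
    # Toy loss L(w) = (w - 3)^2 with minimum at w = 3; gradient is 2*(w - 3).
    return 2.0 * (w - 3.0)

def ss_sam_step(w, t, total_steps, lr=0.1, rho=0.05):
    """One SS-SAM-style step: a Bernoulli trial with scheduled probability
    p(t) picks between a cheap SGD update and a costlier SAM update."""
    p = t / total_steps  # illustrative linear schedule: SAM grows more likely
    if random.random() < p:
        g = grad(w)
        eps = rho * g / (abs(g) + 1e-12)  # SAM perturbation toward higher loss
        g = grad(w + eps)                  # gradient at the perturbed weights
    else:
        g = grad(w)                        # plain SGD gradient
    return w - lr * g

random.seed(0)
w, steps = 0.0, 200
for t in range(steps):
    w = ss_sam_step(w, t, steps)
# w ends up close to the loss minimum at 3
```

The cost saving comes from skipping SAM's second gradient evaluation on the steps where the Bernoulli trial selects the plain SGD branch.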

Hardware Mitigation on Intel, Arm, and AMD CPUs Shown Ineffective against Spectre v2

Source: InfoQ, March 28, 2022 [7]
Security researchers from Vrije Universiteit Amsterdam showed that the hardware mitigations against Spectre v2 attacks implemented in both Intel and Arm processors have fundamental flaws that make them vulnerable to branch history injection.

A new approach to tackle optimization problems using Boltzmann machines

Source: Tech Xplore, March 28, 2022 [8]
Ising machines are promising tools for solving combinatorial optimization problems and creating artificial models of the brain. A team of researchers from the University of California, Berkeley has explored the potential of Ising machines in great depth. Their most recent paper introduces a new Ising machine composed of many restricted Boltzmann machines (RBMs). Ising machines could be used to solve a wide range of complex real-world problems more rapidly and efficiently.
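To make the combinatorial-optimization mapping concrete, here is a minimal sketch of an Ising energy and its ground-state search on a toy 4-spin ring. The couplings are invented for illustration, and an exhaustive search stands in for the RBM-based hardware sampler described in the paper:

```python
from itertools import product

# Toy antiferromagnetic Ising instance on a 4-spin ring:
# E(s) = -sum_{(i,j)} J_ij * s_i * s_j, with spins s_i in {-1, +1}.
J = {(0, 1): -1.0, (1, 2): -1.0, (2, 3): -1.0, (0, 3): -1.0}

def energy(spins):
    return -sum(Jij * spins[i] * spins[j] for (i, j), Jij in J.items())

# An Ising machine searches this energy landscape in hardware; a tiny
# exhaustive search over all 2^4 spin assignments stands in for it here.
best = min(product((-1, 1), repeat=4), key=energy)
print(best, energy(best))  # alternating spins reach the ground state, E = -4
```

The ground state here corresponds to a maximum cut of the ring graph, the textbook example of encoding a combinatorial problem as Ising energy minimization.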

New Platform Lets GPUs and FPGAs Use Intel Optane Memory Modules

Source: Tom's Hardware, March 24, 2022 [9]
Intel, Liqid, MemVerge, and others have developed a platform solution that pools system memory and storage-class memory (SCM), such as Intel Optane Persistent Memory, so it can be used by CPUs, FPGAs, and other accelerators. Composable big-memory solutions are an important part of an overall composable disaggregated infrastructure (CDI) architecture.


Mimicking the Five Senses, On Chip with Rob Telson

Source: Robohub, March 23, 2022 [10]
Machine learning at the edge is gaining steam. BrainChip is accelerating this trend with its Akida architecture, which mimics the human brain by incorporating the five human senses on a machine-learning-enabled chip.
Their chips will let roboticists and IoT developers run ML on-device, enabling low-latency, low-power, and low-cost machine-learning-enabled products. This opens up a new product category in which everyday devices can affordably become smart devices.

Previous Hardware Acceleration in Robotics Newsletters

Past ROS 2 Hardware Acceleration Working Group meetings

  1. Hardware Acceleration WG, meeting #7. ↩︎

  2. Acceleration Robotics. Hardware accelerated ROS 2 pipelines. ↩︎

  3. Mayoral, V. (2022, March 29). Hardware accelerated ROS 2 pipelines and towards the robotic processing unit (RPU). ROS Discourse. ↩︎ ↩︎

  4. Acceleration Robotics. Robotic Processing Unit (RPU). ↩︎

  5. Robinson, D. (2022, March 29). Arm says devs need better multi-architecture support. The Register: Enterprise Technology News and Analysis. ↩︎

  6. Kumari, A. (2022, March 27). Researchers from Tsinghua University propose 'Stochastic Scheduled SAM' (SS-SAM): A novel and cost-efficient training scheme for deep neural networks. MarkTechPost. ↩︎

  7. Simone, S. (2022, March 28). Hardware mitigation on Intel, Arm, and AMD CPUs shown ineffective against Spectre v2. InfoQ. ↩︎

  8. Fadelli, I. (2022, March 28). A new approach to tackle optimization problems using Boltzmann machines. Tech Xplore - Technology and Engineering news. ↩︎

  9. Shilov, A. (2022, March 24). New platform lets GPUs and FPGAs use Intel Optane memory modules. Tom's Hardware. ↩︎

  10. Mey, A. (2022, March 23). Mimicking the five senses, on chip. Robohub - Connecting the robotics community to the world. ↩︎