SLAM without a PhD

By Owen Nicholson

Industry leaders all use low-power, embedded processors to integrate SLAM technology into their AR/VR headsets. So why can't robot developers do the same?

Robots have the potential to help solve many of today's challenges, from worker shortages in key industries to tackling climate change. Today's robotics industry is evolving rapidly to meet these challenges, but it is still best described as thousands of niche vertical markets, all with different goals, technologies and commercial requirements. For the industry to truly take off, it must accelerate from siloed development to a supply-chain model in which specialists provide solutions to specific challenges such as SLAM.


(Source: SLAMcore)

Simultaneous localization and mapping (SLAM) is a fundamental requirement for autonomous robots. It is the complex processing of data from multiple sensors that allows an autonomous system to estimate its position within an environment. It is also one of the most challenging aspects of creating robots that are able to navigate themselves. Whilst the past few years have seen an explosion in mobile robots able to move around factories, warehouses, hospitals and even shopping malls and private homes, most still have limited true autonomy. There are three main reasons for this.

  • First, building the algorithms that allow a robot to estimate where it is using only on-board sensors is incredibly difficult. There are a handful of PhDs in the world’s leading research establishments and universities who have dedicated their entire careers to solving these issues. Even they are often stumped by things as simple as changes in light, or by objects such as pallets being moved to different positions.
  • Secondly, every robot is different. Each combination of sensors and processors must be calibrated, tested and optimized. Different hardware layouts, functions and use-cases all have different SLAM requirements, each of which is normally tackled as a bespoke project. Every time a new sensor is added, or a new edge-scenario encountered, it’s back to the SLAM drawing-board to recalibrate and re-test.
  • Finally, all of this intensive computation must be done within the constraints of a mobile machine – often a relatively small one. Power consumption, weight and processing efficiency are always critical. In many cases there are further constraints in the form of cost, covering both the sensors required and the system-wide bill of materials, which means SLAM must be achieved with low-cost, off-the-shelf components.

Three questions that define autonomy

To be truly autonomous, a robot must be able to answer three fundamental questions:

  1. Where am I?
  2. What is the shape of the world around me?
  3. What are the different objects in that world?

Incorrect answers to any of these questions cause the vast majority of robot failures. In most cases the robots simply freeze, and there are countless stories of humans needing to step in to reposition the ‘autonomous’ robots already working in today’s environments.


The pyramid of spatial intelligence (Source: SLAMcore)

Fully autonomous robots will answer these three questions as part of a spatial intelligence pyramid. At the base they will know their position; they will then build effective maps and finally perceive the difference between different objects in those maps.

The final stage not only builds on the previous two but helps improve them. Robots that can identify an object as a chair or table, rather than a wall, can ignore them as long-term landmarks (because they may move, they are not useful for positioning). This reduces the number of potential landmarks that need to be considered, lowering processing overheads. Of course, people and other important elements of a scene can also be identified and robots programmed to act in specific ways around them.
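As a rough illustration of this idea – not SLAMcore’s implementation – the filtering step might look like the following Python sketch, in which the class lists and the landmark data structure are entirely hypothetical:

# Hypothetical sketch: drop landmarks on movable objects before localization.
STATIC_CLASSES = {"wall", "floor", "pillar", "doorframe"}  # assumed fixed
# movable classes such as "chair", "table", "person" are simply excluded

def filter_landmarks(landmarks):
    """Keep only landmarks attached to static structure.

    `landmarks` is a list of (x, y, z, semantic_label) tuples from a
    perception stage; only static ones are useful for positioning.
    """
    return [lm for lm in landmarks if lm[3] in STATIC_CLASSES]

landmarks = [
    (1.0, 0.2, 2.4, "wall"),
    (0.5, 0.1, 1.1, "chair"),   # ignored: chairs move
    (3.2, 0.0, 0.9, "pillar"),
]
print(filter_landmarks(landmarks))  # -> wall and pillar landmarks only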

Different sensors – different data

The complexity of SLAM starts with the need to analyze data from numerous different sensors which provide information about the space the robot is in. There are many types of sensor, each with its own pros and cons, and designers rarely rely on just one. For example, LiDAR is a popular type of sensor for robots – perhaps the most commonly used currently. It uses lasers to calculate the distance between a robot and other objects. It is fast, accurate and reliable. However, most LiDARs used today provide only a thin ‘slice’ map of the environment, so several might be used or augmented with other sensors including odometers, gyroscopes and accelerometers.
With each new sensor added comes extra cost and extra complexity. Every new data feed must be integrated and calibrated. More data means more work to process. That in turn means more powerful processors consuming more energy, and therefore bigger batteries – and suddenly the whole robot design has changed again.

Learn from nature

A different approach is needed. One that delivers the benefits of robust and accurate SLAM even to those designers that don’t have access to world-class experts with PhDs in spatial intelligence.


(Source: SLAMcore)

The key to this approach is a visual SLAM solution optimized for the most commonly used sensors and processors. Most animals constantly calculate their position, map the world around them and understand what objects are using just two types of sensor – their eyes and inner ears. Visual inertial SLAM systems do the same. With two simple cameras – like those found in most smartphones – and an IMU (an inertial measurement unit that tracks orientation and acceleration), these systems provide cost-effective yet accurate positioning, mapping and perception for robots.

Industry leaders including Microsoft, Google and Facebook have all integrated this technology into their AR/VR headsets for consumer and business use. Using low-power, embedded processors, they deliver lightweight, cost-effective wearable solutions that use SLAM capabilities to create accurate, immersive worlds for their users. So why can’t robot developers do the same?

The optimization challenge

The answer lies in the way these systems are designed and the economics of their production. The hardware, sensors and algorithms are all heavily optimized to work together. Absolute accuracy of timing is essential for SLAM estimations to be precise. Data feeds from the cameras and inertial measurements from the IMU are tightly integrated and time-stamped to the millisecond, fusing the sensor data into a consistent data stream for the algorithms to process.
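A minimal sketch of what such time alignment involves, assuming simple linear interpolation and made-up sample data (real systems rely on hardware synchronization and more careful filtering):

import numpy as np

# Hypothetical example: align IMU samples (arriving at ~200 Hz) with camera
# frames (at ~30 Hz) by interpolating the IMU readings at each frame timestamp.
imu_t = np.array([0.000, 0.005, 0.010, 0.015, 0.020])   # seconds
imu_gyro_z = np.array([0.01, 0.02, 0.02, 0.03, 0.05])   # rad/s

frame_t = np.array([0.0033, 0.0167])  # camera frame timestamps, seconds

# Linear interpolation gives the angular rate "as seen" at each frame, so the
# camera and inertial data refer to the same instant before fusion.
gyro_at_frames = np.interp(frame_t, imu_t, imu_gyro_z)
print(gyro_at_frames)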

This high level of optimization means that the SLAM software will only work with that exact combination of hardware. Porting software from one of these products to work on another would not just yield substandard performance – it would not work at all. Whilst this is fine when you plan to manufacture millions of relatively low-cost devices all doing the same thing – it makes no sense for a typical robot developer who has specific hardware requirements but is only looking to sell in the low thousands at best.

But Visual Inertial SLAM does still have a bright future in the robotics industry. As mentioned, the sensors are low-cost and easy to source. Cameras also provide masses of useful spatial information. In fact, you can get more data for SLAM from a single 1-megapixel VGA camera costing less than one dollar than from a top of the range LiDAR costing thousands of dollars. The challenge is to process this data reliably in real-time without using excessive computing power or energy.

The solution is to deploy those expert PhDs to create cutting-edge visual inertial SLAM software that can be seamlessly plugged into the autonomy stacks of a wide variety of robots. By developing highly efficient algorithms that deliver highly accurate results using standard sensors and low-cost, low-power, embedded processors, the whole industry’s journey towards effective SLAM can be accelerated.

But, if as noted above, the optimization of software and hardware is essential for good results, how can you create algorithms useful for all types of robot with different sensors? One solution is to optimize algorithms for a selection of popular and easily available hardware options – for example, x86 processors, the Jetson range from NVIDIA, and Intel RealSense depth cameras such as the D435i and D455. Optimizing for these most commonly used and highly regarded components in the industry will allow the majority of developers and designers to quickly integrate effective SLAM into their robot prototypes. This range can then be extended over time as more hardware options become commonly available.
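As an example of how accessible this hardware is, grabbing stereo and inertial data from a D435i with Intel’s pyrealsense2 library looks roughly like the sketch below (simplified; production code would add error handling and use the hardware timestamps for synchronization):

import pyrealsense2 as rs

# Open the D435i's stereo infrared pair plus its accelerometer and gyroscope.
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.infrared, 1, 640, 480, rs.format.y8, 30)  # left
config.enable_stream(rs.stream.infrared, 2, 640, 480, rs.format.y8, 30)  # right
config.enable_stream(rs.stream.accel)
config.enable_stream(rs.stream.gyro)

pipeline.start(config)
try:
    frames = pipeline.wait_for_frames()
    left = frames.get_infrared_frame(1)
    right = frames.get_infrared_frame(2)
    print("frame timestamp (ms):", left.get_timestamp())
finally:
    pipeline.stop()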

Coding the three levels

Visual SLAM algorithms work by selecting many natural features visible in the scene captured by the camera to create a probabilistic model of the environment. The positions of these features relative to each other and to the robot can be calculated even with a single camera. As the robot and its camera move, the same identified features are seen from new angles. Using the principle of parallax, the difference between the two views can be used to calculate distance. This single-camera SLAM principle was first demonstrated in a real-time system, MonoSLAM, by SLAMcore co-founder Professor Andrew Davison in 2003.
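The underlying geometry fits in a few lines of Python. Assuming a calibrated camera that has moved a known baseline distance between two views (in monocular SLAM the baseline itself has to be estimated, which is part of what makes the problem hard), depth follows directly from the disparity between the two observations of the same feature:

# Worked parallax example with illustrative numbers.
focal_px = 600.0      # focal length in pixels (from camera calibration)
baseline_m = 0.10     # distance the camera moved between views, meters
disparity_px = 12.0   # shift of the same feature between views, pixels

# Classic triangulation relation: depth = focal * baseline / disparity.
depth_m = focal_px * baseline_m / disparity_px
print(f"Estimated feature depth: {depth_m:.2f} m")  # -> 5.00 m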


Architecture of a feature-based visual-inertial SLAM algorithm (Source: SLAMcore)

Detecting specific features that are suitable for calculating position is at the heart of effective SLAM. That means transforming the rich and heavy data streams from the cameras into something which can be quickly processed even with low-end processors. ‘Feature detection’ is an exercise in dimensionality reduction, going from millions of pixels to just hundreds of points useful for locating the robot. Feature detection is performed on every pixel in a scene, but the computation can be parallelized. Two of the world’s leaders in the area – SLAMcore co-founder Dr Leutenegger and CTO Dr Alcantarilla, authors of two of the most popular open-source feature detectors, BRISK and AKAZE – have developed SLAMcore’s capabilities in this area.
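Both detectors are available in OpenCV, so the dimensionality-reduction step is easy to try for yourself. The snippet below is a generic illustration rather than SLAMcore’s pipeline, and assumes a grayscale frame saved as frame.png:

import cv2

# Reduce millions of pixels to a few hundred keypoints plus binary
# descriptors, ready for matching across frames.
image = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

akaze = cv2.AKAZE_create()
keypoints, descriptors = akaze.detectAndCompute(image, None)
print(f"AKAZE found {len(keypoints)} keypoints")

brisk = cv2.BRISK_create()
keypoints, descriptors = brisk.detectAndCompute(image, None)
print(f"BRISK found {len(keypoints)} keypoints")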

Detecting the right features and positioning them accurately allows the creation of a sparse, point-cloud map of a robot’s surroundings. The highly efficient process means that maps of spaces from living-room to warehouse scale can be created with cm-level accuracy. These sparse maps are the foundation of the spatial intelligence pyramid allowing robots to accurately estimate their position in real-time. The efficiency of these algorithms means that with two cameras and an IMU these SLAM position maps can be processed on a Raspberry Pi.

Deploying SLAM in a robot should also be quick and easy. For example, developers can launch SLAMcore’s core positioning algorithms from a simple library started with a single command (below). With a compatible sensor plugged in, they simply need to open a terminal and type the following:

user@ubuntu:~$ slamcore_visualiser


SLAMcore Visualiser – 3D View Features (Source: SLAMcore)

There are thousands of hyper-parameters associated with a SLAM system, from the number of features that should be detected in each frame to the distance at which they start to be rejected. Each parameter can be adjusted to tune the performance of this complex system. Instead of forcing developers to wade through lines of source code and adjust parameters through trial and error, simple presets corresponding to specific planned use-cases can be used – for example, warehouse, office, drone, wheeled robot, indoor, outdoor, high accuracy, high speed. These presets can be selected just by adding an extra argument to the command:

user@ubuntu:~$ slamcore_visualiser -c ~/preset_file.json
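A preset file of this kind might contain entries like the following. This is an invented example for illustration only; the actual parameter names and schema of SLAMcore’s preset files are not shown here:

{
  "preset": "warehouse",
  "max_features_per_frame": 400,
  "feature_rejection_distance_m": 25.0,
  "high_accuracy_mode": false
}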


Depth image creation and 3D mapping (Source: SLAMcore)

Building on these maps, using the data from the same sensors, a depth image is created. This is similar to a regular image, but instead of representing colour, each pixel represents distance from the camera. Algorithms then combine this information with the position estimate previously described to create rich dense maps.
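Conceptually, each depth pixel can be back-projected into a 3D point using the pinhole camera model, which is how a depth image becomes a fragment of a dense map. A simplified sketch with assumed camera intrinsics:

import numpy as np

# Hypothetical pinhole intrinsics: focal lengths and principal point, in pixels.
fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0

# A tiny made-up depth image (meters); real frames are e.g. 640x480.
depth = np.array([[2.0, 2.1],
                  [2.0, 2.2]])

# Back-project every pixel (u, v, depth) to a 3D point in camera coordinates.
v, u = np.indices(depth.shape)
x = (u - cx) * depth / fx
y = (v - cy) * depth / fy
points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
print(points)  # one XYZ point per pixel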

As with positioning algorithms, mapping algorithms are also provided as a simple library launched by adding a simple text flag to the launch command. With a compatible sensor plugged in, open the terminal and type the following:

user@ubuntu:~$ slamcore_visualiser -m 1

SLAMcore: Map 2.5D (Source: SLAMcore)

SLAMcore: Map 3D (Source: SLAMcore)

2.5D maps show free space as well as heights, building towards more detailed maps of the world around the robot. They indicate where the robot can move, and which space is occupied up to what height. Depending on the time and processing power available, 3D maps with rich levels of detail can also be generated. Detailed 3D maps can be created, saved and uploaded to robots, which then use the faster, sparse maps to position themselves within those environments in real time.
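A 2.5D map is essentially a grid that stores, for each cell of floor space, the maximum occupied height seen there. A toy version, assuming a small point cloud in meters and a fixed grid resolution:

import numpy as np

# Toy 2.5D height map: for each ground cell, keep the tallest point observed.
points = np.array([[0.20, 0.30, 0.0],   # x, y, z in meters (z = height)
                   [0.25, 0.35, 1.4],   # e.g. the top of a shelf
                   [1.60, 0.40, 0.1]])
resolution = 0.5          # cell size in meters
grid = np.zeros((4, 4))   # covers a 2 m x 2 m area

for x, y, z in points:
    i, j = int(x / resolution), int(y / resolution)
    grid[i, j] = max(grid[i, j], z)  # maximum height per cell

print(grid)  # cells still at 0 are free for the robot to move through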

Essential for industry’s progress

Optimized algorithms that work out-of-the-box with the most popular hardware combinations will dramatically reduce the barriers to entry for those looking to integrate visual inertial SLAM into their autonomous robots. With access to these, developers will free up time and resources to focus on the applications and functions that make their robots different and useful, rather than repeating the trial-and-error process just to get them to accurately position themselves and create the maps that allow them to get from A to B. They will get their robots to market faster and provide more cost-effective solutions.

The future is bright for robotics. By helping solve the complex challenges of SLAM and democratizing access to some of the most cutting edge research, practical application and world-leading PhDs in the field, we hope to bring this future closer for everyone.

This article was originally published on Embedded.

Owen Nicholson is the founder and CEO of SLAMcore. Nicholson’s early career saw him managing research and development projects for government and commercial organisations, which ultimately led him to lead commercialisation at the Robotic Vision Lab, Imperial College London. Working alongside genuine world-leaders in the field of machine vision, Nicholson helped to transition leading edge academic research into applications that could quickly deliver benefits to the wider world. Seeing the potential for SLAM in autonomous robots, he founded SLAMcore in 2016 with the goal of democratising access to cutting edge, visual SLAM technologies.
