This paper gives an overview of an Autonomous Mobile Robot’s (AMR) vision system. It explains how an AMR captures an image through vision sensors and how the captured image is represented in various color spaces. These sensors are charge-coupled device (CCD) cameras and complementary metal-oxide-semiconductor (CMOS) cameras.
After an image is acquired through these cameras, it is represented in a color space. These include Red Green Blue (RGB), Hue Saturation Value (HSV), and Luminance, Chrominance blue, Chrominance red (YCbCr). Once the image has been represented in the chosen color space, objects are detected through various algorithms, depending on the application.
During object detection, the object’s edges or corners are detected to separate it from the rest of the image. Canny’s edge detection and Harris corner detection algorithms are generally used for autonomous mobile robots. Features from Accelerated Segment Test (FAST) is also a corner detection algorithm.
FAST is computationally faster than the other algorithms and is therefore used for real-time applications of AMRs.
As edges and corners are sensitive to rotation and scale changes, respectively, local features that are invariant to these changes are chosen and then detected. Object recognition is done based on these local features using the Scale Invariant Feature Transform (SIFT).
An Autonomous Mobile Robot is a robot that is able to navigate through its environment autonomously while performing goal-oriented tasks (Berns, 2019). Nowadays, these robots are gaining importance in every field, from industrial to personal use.
For AMRs, as the degree of autonomy in task execution increases, the degree of unstructuredness of the environment also increases.
As the environment becomes more unstructured, it becomes more challenging for robots to navigate safely without harming people, the surroundings, or the robot itself.
Figure 1. Classification of AMR systems (Berns, 2019)
To overcome this challenge, various sensor systems and corresponding signal processing systems are installed on robots. These sensor systems collect the internal state of the robot and the external state of its environment, and help the robot perform its designated tasks without damaging the surroundings, humans, or itself. These sensors can be classified as distance, position, vision, and laser sensors. Of these, vision sensors are inexpensive and efficient to use.
As mobile robots also work in aware environments, where human interaction is unavoidable, an AMR’s vision system must be precise and accurate. The system’s accuracy depends on the performance of every unit of the vision system.
Consider the example of autonomous guided vehicles, which are part of autonomous transportation in industry. These robots transport goods from one place to another through an aware environment. This environment consists of the robot’s trajectories, humans, machinery, and other guided vehicles.
In this scenario, the robot should follow its trajectory and be capable of identifying the starting point, the destination, and the goods to transport. Along with these, it should recognize objects in the moving environment, such as humans, machinery, and other guided vehicles. After recognizing these objects, it should act accordingly. For example, if a human is recognized in a frame, it should stop or take another trajectory; after identifying the destination, it should drop the goods; and if another guided vehicle comes close, it should stop or send a stop signal to the other vehicle.
To achieve this goal, the robot’s vision system should be efficient, accurate, and cost-effective. Depending on the requirements of the application, the components, detection methods (edge or corner), and respective algorithms must be chosen such that the above conditions are fulfilled.
The future of robot vision is vast and challenging. Before moving towards that future, understanding the basic concepts of a robot’s vision system will help to build the pillars for future advancements.
The vision system of an AMR consists of vision sensors and a digital image processing unit. Vision sensors are cameras, which capture the scene in the form of an image. This image is provided as input to the digital image processing unit. Based on its output, actuators receive signals and act according to the situation.
Vision sensors can be considered the eyes of an AMR. They are cameras installed on the AMR, and their main task is to continually capture images as the AMR moves through the environment. This is a time-dependent environment, in other words an aware environment.
In the aware environment, where human intervention is present, this task needs to be carried out error-free to ensure safety. These sensors are exteroceptive and passive in nature: they analyze the external environment surrounding the robot, and no input is supplied to the environment to obtain the output. Two types of vision sensors are used for AMRs: CCD cameras and CMOS cameras.
Charge-coupled device (CCD) camera. A CCD is a monolithic device made from a semiconductor material, silicon. An area of a few square centimeters on a CCD element can represent 576 × 487 pixels (Silva, 2016).
The CCD camera is built from an array of photodiodes. When the photodiodes are exposed to photons (light), a charge is induced in them.
This charge depends on the number of incident photons, that is, on the light intensity. Electrical signals are generated corresponding to the amount of charge. These signals are read out sequentially, i.e., the charge is transferred from the photodiodes to the analog-to-digital converter one after another. Each photodiode can represent one pixel or a combination of neighboring pixels.
Figure 2. Working Principle of CCD Camera (Berns, 2019)
CMOS camera. In a CMOS camera, each photodiode is connected in series with a resistor, and the photon beam is continuously converted into an electrical signal (voltage). Each photodiode is individually accessed, so charge-to-voltage conversion is done in parallel, which in turn increases the processing speed (Berns, 2019).
Figure 3. Working Principle of CMOS Camera (Berns, 2019)
The image obtained through the camera serves as input to the object detection algorithms. Before further processing, images are preprocessed. Color images can be represented in various forms depending upon the application. These representations are called color spaces. They include the RGB, HSV, YCbCr, and Lab color spaces, as well as grayscale.
RGB color space. The RGB color space represents each color as a combination of red, green, and blue. The human visual system works similarly to this color model, which is why it is the most widely used color space in computer vision. Its channels are strongly correlated, and it is non-perceptual in nature.
Figure 4. Original Image with RGB Filters (Berns, 2019).
HSV color space. The HSV color model appears as a cone of colors. Its components are hue, saturation, and value. Hue is the color portion of the model, expressed as an angle from 0 to 360 degrees. It starts at red (0 degrees) and runs through yellow, green, cyan, blue, and magenta back to red at 360 degrees.
Saturation represents the amount of gray in a particular color, from 0 to 100 percent. Reducing the saturation toward 0 introduces more gray and produces a faded color. Saturation is sometimes also expressed on a scale from 0 to 1, where 0 is pure gray and 1 is the pure color.
Value indicates the brightness. Together with saturation, it describes the intensity of the color. Value ranges from 0 to 100 percent, where 0 is black and 100 percent is the brightest, fully revealed color (Bear, 2019).
Figure 5. HSV Color Model (Berns, 2019)
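As an illustration of the HSV components described above, the following is a minimal sketch that converts an 8-bit RGB pixel to hue in degrees and saturation and value in percent. It relies on Python's standard `colorsys` module; the helper name `rgb8_to_hsv` is hypothetical and not taken from the cited sources.

```python
import colorsys


def rgb8_to_hsv(r, g, b):
    """Convert 8-bit RGB values to (hue in degrees, saturation %, value %).

    Hypothetical helper for illustration only.
    """
    # colorsys expects and returns floats in the range [0, 1].
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s * 100.0, v * 100.0


if __name__ == "__main__":
    samples = {
        "red": (255, 0, 0),
        "magenta": (255, 0, 255),
        "mid gray": (128, 128, 128),
    }
    for name, rgb in samples.items():
        h, s, v = rgb8_to_hsv(*rgb)
        print(f"{name}: H={h:.0f} deg, S={s:.0f}%, V={v:.0f}%")
```

A gray pixel comes out with zero saturation regardless of its brightness, which matches the description of saturation as the amount of gray in a color.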
YCbCr color space. The components of this model are luminance (Y), chrominance blue (Cb), and chrominance red (Cr). This color model is fast to compute and compresses well, and it is therefore used in television.
The human eye is sensitive to luminance but much less sensitive to chrominance blue and chrominance red. During compression, chrominance detail can therefore be reduced, as it contributes little to human perception.
Figure 6. YCbCr Color Space (Berns, 2019)
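The separation of luminance from chrominance can be sketched in a few lines. The example below assumes the full-range BT.601 coefficients used by JPEG/JFIF, which the text does not specify, and the function names `rgb_to_ycbcr` and `subsample_chroma` are hypothetical. The second function shows one simple way chrominance detail might be reduced, by averaging each 2x2 block of a chroma plane.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to YCbCr.

    Assumption: full-range BT.601 coefficients (as in JPEG/JFIF).
    Results are returned as unclamped floats.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_chroma(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks.

    Assumes the plane (a list of rows) has even height and width.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[i][j] + plane[i][j + 1]
             + plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
            for j in range(0, width, 2)
        ]
        for i in range(0, height, 2)
    ]


if __name__ == "__main__":
    print(rgb_to_ycbcr(128, 128, 128))         # gray: chroma sits at the neutral value
    print(subsample_chroma([[1, 3], [5, 7]]))  # one averaged sample per 2x2 block
```

Only the Cb and Cr planes would be subsampled in this way; the Y plane keeps its full resolution because the eye is most sensitive to luminance.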
Lab color space. This color space is device-independent. The Lab model is a three-axis system, and the color of an object in this model is measured with a spectrophotometer. The lightness (L) axis is vertical and runs from black (L = 0) to white (L = 100). The a axis runs horizontally from green (-a) to red (+a). The b axis also runs horizontally, from blue (-b) to yellow (+b).
Figure 7. Lab color model. (Berns, 2019)
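To make the three Lab axes concrete, the sketch below converts an sRGB pixel to Lab values. It assumes sRGB input and the D65 reference white, neither of which is stated in the text, and the name `srgb_to_lab` is hypothetical. The conversion goes through linear RGB and CIE XYZ before applying the standard CIELAB formulas.

```python
def _srgb_to_linear(c):
    """Undo the sRGB transfer curve for one 8-bit channel."""
    c = c / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """CIELAB helper: cube root with a linear segment near zero."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3.0 * delta ** 2) + 4.0 / 29.0


def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB to (L, a, b). Assumes a D65 reference white."""
    rl, gl, bl = (_srgb_to_linear(c) for c in (r, g, b))
    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    # Normalize by the D65 white point, then apply the Lab formulas.
    fx, fy, fz = _f(x / 0.95047), _f(y / 1.0), _f(z / 1.08883)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


if __name__ == "__main__":
    for name, rgb in [("white", (255, 255, 255)), ("red", (255, 0, 0)),
                      ("green", (0, 255, 0)), ("blue", (0, 0, 255))]:
        L, a, b = srgb_to_lab(*rgb)
        print(f"{name}: L={L:.1f}, a={a:.1f}, b={b:.1f}")
```

Under these assumptions, pure red yields a positive a value, pure green a negative a value, and pure blue a negative b value, in line with the axis directions described above.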
After an image is acquired, it is analyzed. In this analysis, the objects present in the scene are detected and recognized. Detecting an object means separating the object of interest from the rest of the image. For example, an industrial carrier robot captures a scene that includes a person, its trajectory, its destination, and machinery.
In this situation, the robot needs to separate the person from the machinery, its trajectory from other routes, and its destination point from other robots’ destination points. The objects can thus be divided into person, machinery, trajectory, and destination point.
In the recognition phase, these objects are then assigned to their respective classes: person, machinery, route, and destination point.
Objects can be detected based on various features. These include geometric (size, shape), visual (color, texture, image features), physical (weight, temperature, motion), acoustic (noise, acoustic pattern), and chemical (emission) features (Berns, 2019).
Visual features are detected using images captured by cameras (CCD and CMOS).
Visual features. Visual features can be extracted from images. Extracting them is fast and cheap. As these features contain much information, a large amount of data is available to process. These visual features include color, texture, and image features.
An object can be detected by detecting its edges, its corners, or a special invariant visual feature of the object.
Object detection is based on two fundamental properties of the image: discontinuity of intensity values and similarity of intensity values (Gonzalez & Woods, 2002).
Edge detection of object. Detecting the edges of an object is based on the discontinuity of intensity values in an image. An edge can be defined as a boundary between two regions, formed by a set of connected pixels. Edge detection provides the shape information of an object.
Edges can be caused by various factors, such as surface normal discontinuity, depth discontinuity, surface color discontinuity, and illumination discontinuity. These discontinuities are illustrated in Figure 8.
Figure 8. Causes of discontinuities leading to Edge (Berns, 2019)
Edges can be detected with various methods, such as thresholding the gradient of a smoothed image, the Marr-Hildreth algorithm, or the Canny edge detection algorithm. Of these methods, Canny’s method is the most commonly used for AMRs.
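The simplest of these methods, thresholding the gradient of a smoothed image, can be sketched as follows. This is an illustrative outline only: the 3x3 Gaussian and Sobel kernels, the border handling (edge pixels are replicated), and the names `filter3x3` and `edge_map` are assumptions, and a grayscale image is represented as a plain list of rows.

```python
import math

# Assumed kernels: a 3x3 Gaussian approximation and the Sobel operators.
GAUSS = [[1 / 16, 2 / 16, 1 / 16],
         [2 / 16, 4 / 16, 2 / 16],
         [1 / 16, 2 / 16, 1 / 16]]
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def filter3x3(img, kernel):
    """Slide a 3x3 kernel over the image, replicating border pixels."""
    h, w = len(img), len(img[0])

    def px(y, x):
        # Clamp coordinates so that border pixels are repeated outward.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [
        [
            sum(kernel[j][i] * px(y + j - 1, x + i - 1)
                for j in range(3) for i in range(3))
            for x in range(w)
        ]
        for y in range(h)
    ]


def edge_map(img, threshold):
    """Return a binary map: 1 where the smoothed gradient exceeds threshold."""
    smooth = filter3x3(img, GAUSS)    # step 1: suppress noise
    gx = filter3x3(smooth, SOBEL_X)   # step 2: horizontal intensity change
    gy = filter3x3(smooth, SOBEL_Y)   #         vertical intensity change
    return [
        [1 if math.hypot(gx[y][x], gy[y][x]) > threshold else 0
         for x in range(len(img[0]))]
        for y in range(len(img))
    ]


if __name__ == "__main__":
    # A vertical step from dark (0) to bright (100) in the middle of the image.
    step = [[0, 0, 0, 100, 100, 100] for _ in range(6)]
    for row in edge_map(step, threshold=200):
        print(row)
```

The threshold value controls how strong an intensity discontinuity must be before it is marked as an edge, so in practice it would have to be tuned to the camera and lighting conditions.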
Canny edge detector algorithm. The performance of this algorithm is superior to that of other edge detection methods. This method is more complex, but the results are worth the added complexity. The objectives of Canny’s method can be summarized as follows: