Vision-based Obstacle Detection and Path Planning for the iRobot Create using an Android-powered Smartphone


Recent advances in mobile communication devices and cloud based technologies have expanded the realm of possibilities for personal robotic applications. This project integrates some of these technologies along with key components in robot navigation - object detection and path planning. The goal of this project is to plan optimal (or near-optimal) paths for the iRobot Create in an environment cluttered with obstacles using a smartphone. In this scope, optimal is defined as the shortest collision-free path between two points. The standard A-Star search algorithm is used to plan these paths. RGB thresholding and connected components labeling are used to detect and segment obstacles, the boundaries of the environment and the door posts. Finally a path following controller which runs on the smartphone drives the robot from its initial to final position.

Final Project Video


Obstacle detection and path planning are critical in many robotic applications. They are fundamental in pick and place tasks in industrial settings, in surveillance of indoor and outdoor environments, in robot soccer, in search and rescue operations and in several other related tasks. Embedded computing systems are typically used in autonomous robots for processing and control and usually involve integration of a variety of sensors such as cameras, IMUs, laser range finders, IR detectors etc. for the robotic system to be fully operational. This project proposes the use of an Android-powered smartphone, the HTC Evo, for control of a two wheel differential drive robot, the iRobot Create. The Evo is an ideal, low cost alternative as it integrates a linux-based operating system, a 1 GHz microprocessor (Snapdragon Scorpion), front-facing and rear-facing cameras, a loudspeaker and microphone, a 4.3 inch capacitive touch screen, Bluetooth and WiFi capabilities and a rich environment for developing applications in Java or python. One objective of this project is to create a foundation for using the smartphone in relevant robotics projects and to build on this foundation gradually until the full capabilities of the phone can be utilized in other robotics projects in the near future.

Related Work

An extensive amount of research has been carried out in obstacle detection and path planning as they such critical components of many robotics applications. The particular vision-based approach taken in this project is similar to the work done at the Jet Propulsion Laboratory by Snorrason et al on planetary rovers [1]. In this project, the Mars rovers take several stereo images of the surroundings, use histogram-based thresholding to detect obstacles, project the 3D data into an overhead 2D view obstacle map and then use a customized version of the A* search algorithm to produce optimal paths through the obstacle map. While we employ similar techniques in our respective projects and our goals are similar, the use of smartphones to carry out some or most of the operation is still very novel.


Image Processing: Obstacle Map Generation

The block diagram below illustrates the steps involved in generating the obstacle map needed for path planning.

Using an Android application called IP Webcam, the HTC Evo streams images of the different environment configurations over its WiFi connection to a remote server. A program captures the images and downloads them for processing.

 Config1    Config2      Config3      Config4      Config5

There are three main types of objects of interest that are segmented in the images above: the obstacles (the pink objects), the boundary markers (orange cones) and the door posts (green cylinders).

An algorithm was developed to search through the image for colored obstacles within a specific RGB range (RGB thresholding). The algorithm then creates a binary mask for each image where the detected obstacles are displayed as white and everything else masked as black (see figure below).

Figure 1: Input image: Config 1 --> RGB Thresholding to detect obstacles --> Image Mask.

A similar procedure was also carried out to detect the door posts and boundary markers in the scene.

Connected Components Labeling is then used to identify the major blobs within the image. The objects of interest are counted and their areas and centroids are computed. This technique is used to provide the location of each object in the image.

Figure 2: Plot showing location of obstacles in an image after applying connected components labeling.

This along with Canny Edge Detection is used to detect and compute the location of the door posts as well.

An orthographic projection is then performed on each mask to convert the third person view of the scene to a top-down view. With this technique, each pixel in the mask is converted to an x,y coordinate in the robot's world reference frame whose units are in centimeters. Using the location of the boundary markers, the top-down view map produced only consists of the region within the markers. The size and scale of this map is determined by specifying, in pixels, the size of the final obstacle map and also by identifying, in centimeters, the size of the area in the final obstacle map.

Path Planning using A*

Once an obstacle map is constructed, the configuration space is decomposed into uniform grid cells to obtain a discrete representation of the world.

This grid is then modeled as a four-connected graph where every node represents a location in the configuration space and each edge represents the distance between two locations.

Obstacles in the map are inflated by the radius of the Create so the Create is treated as a point robot. If an obstacle occupies any region of the cell, the cell is demarcated as occupied otherwise the cell is noted as “not occupied”.

The standard A* search algorithm is then used to search the graph for the shortest path from the initial robot position to a designated goal location. The Manhattan distance is used as the heuristic that estimates the cost to travel from the robot’s current location to the goal position.

Kinematics and Control

The iRobot Create is a standard two wheel differential drive robot. That is, its two drive wheels are mounted on a common axis and each wheel can be independently driven forward or backward.

Figure 3. Kinematics of a differential-drive robot, the iRobot Create. Each wheel is controlled by a different motor and can therefore be driven at different speeds.

The robot’s state is specified by its position p = (x,y) which is located along the axle of the differential drive and its orientation θ. L is the distance between the two wheels’ centers. vr and vl are the velocities of the left and right wheel. The state equations can therefore be written as [2,3]:

The velocities of the wheels are bounded such that:

The robot will move forward when vl = vr > 0, left when vr > vl > 0 and will spin in place when vl = -vr ≠ 0. Using the length of the robot’s axel and wheel radii, we can further derive 2 equations that provide the velocity of the robot with respect to its wheel velocities (see [3] for derivation):

r is the radius of the wheel, Øi is the radius of the ith wheel and d=L (the distance between the two wheels’ centers).

Path Following Controller for the iRobot Create

A path following controller is a control law that ensures that a reference point follows a designated trajectory at a desired speed [3]. In this project, the controller was designed to take the robot’s current position as input and produce the velocities needed to drive the robot along the desired trajectories.

Figure 4: Desired trajectory for the reference point P.

The controller is designed to control the position of the reference point P along a desired trajectory (xdes(t), ydes(t)). The point is a distance l from the axle (to prevent singularities) as shown in figure 2 and lies along the longitudinal axis.

The state equations are given by equations 1, 2, and 3 and the coordinates of the reference point P are:

The path is modeled as an implicit function γ(x,y) and the desired speed is computed from the robot’s position on the trajectory. The path error is approximated as the distance between the reference point and the curve and the speed error at any point (x,y) is given as a function of the velocity (see [3] for details). Given the desired path and using the path and speeds errors along with the integral of the speed error and the orientation of the robot as feedback, the controller computes the desired velocity needed to drive the Create along the planned path [3].

Results and Analysis

The following 5 figures are maps of the different configurations of the environment and the path planned by the A* search algorithm from start to finish. A four connected graph was searched by A* in configurations 1-2 and 4-5. An eight connected graph was used in configuration 3.

Configuration 1

Configuration 2

Configuration 3

Configuration 4

Configuration 5

In all five cases, an obstacle-free path was planned using A* and the robot was able to traverse each path successfully from start to finish in 30 seconds on average. Using an eight-connected graph provided shorter routes from the robot's initial position to the door posts as 45 degree connections between nodes were permitted in this type of graph. When used in other configurations, however, greater path/localization errors were introduced which caused the robot to veer off course to a larger extent. As such a four-connected graph was used in most of the configurations. The errors between the expected and actual trajectories were due to localization errors as integration of the robot's position over time resulted in incorrect estimates of the robot's location in the world. The controller was also not the smoothest and caused small jerks in the robot's movements.


One of the challenges I encountered during image processing was detecting the obstacles in the scene using RGB Thresholding. RGB values are very sensitive to lighting conditions and also vary with the distance or angle at which an image is taken. I tried to alleviate this problem by capture images of the different environments under similar lighting conditions and from the same viewpoint.

Obtaining precise localization estimates and smooth control of the robot as it traversed the different paths were also challenging. This was due to small errors in the computations of the robot's location relative to the obstacles in the environment initially as well as inaccuracies in the transitions from one way point to the next along the trajectories.

Future Work

A lot of improvements can be made to make path planning and path traversal more accurate and optimal. One approach that could be taken is to integrate local and global path planning. This would involve planning a path initially with similar techniques as outlined in this project and also include methods for local obstacle detection and path planning as the robot traverses the initially planned path. Using markers for localization during path traversal would also increase accuracy in the robot's estimates of its position which could prevent collisions as well.

With more accurate localization, paths could also be optimized by using eight or even sixteen connected graphs while running the A* search algorithm. Other variations of A* that would eliminate unnecessary way points and compute shorter paths in the graph could also be investigated.

Another improvement would be to add online vision processing (on the HTC Evo). Depending on the height of the camera and whether it is placed on the robot itself, this could mean taking a local planning approach but this would take advantage of the phone's rich application development environment and advanced image processing packages that are available for android-based phones such as OpenCV. This would make the system itself more versatile and robust.


In this project, I implemented a color-based algorithm for detecting obstacles, markers and door posts in a scene. After segmenting objects of interest in the environment, I then ran an A* search algorithm to find sub-optimal paths from start to finish. An android-powered phone was used to run the path following controller that drove the iRobot Create along the paths produced. These techniques proved to be reliable and successful in five different configurations of the environment. Additional research and improvements to the existing system should take advantage of the phone's image processing and enable all if not most of the processes to run on the phone successfully.


[1] M. Snorrason, J. Norris, P. Backes. Vision based obstacle detection and path planning for planetary rovers. Proceedings of SPIE, Vol. 3693. Presented at 13th Annual AeroSense, Orlando, FL, April 1999.

[2] J. Snape, J. Berg, S. Guy, D. Manocha. The Hybrid Reciprocal Velocity Obstacle. IEEE Transations on Robotics. Available at

[3] V. Kumar. Kinematics and Control of Wheeled Robots. Available at

Comments on Final Project Presentations

Anirudha Majumdar

I liked the fact that you explored underarticulated robots for your project since I hadn't considered them before. Highlighting the advantages and disadvantages of using these robots and comparing them to articulated robots like the Asimo gave me insight into the relevant applications for these robots. The fact that these 2DOF robots could play soccer was really cool. I would want to learn more about the dynamics of underarticulated robots with more degrees of freedom and see examples of practical applications.

Avik De

I appreciated the fact that you took a fresh approach to finding locally optimal paths. Your explanation of the theory was clear and sufficiently detailed.

Ben Charrow

Your demo was really cool and the application itself (using the PR2 to open doors) is very practical and useful. Great job on improving Willow Garage's code base on autonomous door opening and making it more general. I look forward to seeing door handle identification integrated into the project.

Brian MacAllister

Your simulation was really cool. I appreciated learning about another form of the A* algorithm that works well in higher dimensions.

Caitlin Powers

I liked the fact that you could estimate trajectories during control of the manipulator instead of before usage thus saving time. It would be great to see your implementation in action. How accurate do the estimates have to be in order for the manipulator to perform its tasks?

Cem Karan

I like the practical implications/applications of your project. I look forward to learning more about this probabilistic model. Why did you choose this method over others?

Ceridwen Magee

I think its really cool that you can apply the planning algorithms we've used in macroscopic applications on a microscopic level. I look forward to seeing how this is actually used in the biological systems you're studying.

Chinwendu Enyioha

I liked the concept of your project and its application to communications and robotic networks. I didn't get a clear understanding of all the details so I am looking forward to learning more from the final report.

Fei Mao

Thanks for the insight you gave into saving search time and space in matlab through the use of structs. I also appreciated learning more about the RRT Algorithm and how it compares to A*.

Ian McMahon

Your project has a lot of practical applications as the PR2 and similar robots have to do a lot of planning and coordination when manipulating objects in their environments. I'd like to see more details about the planning in the final report.

Jason Owens

I like the practical application of the project especially since perception/feature detection is so vital to robotics. I'd like to see how your approach compares to an approach where learning is involved and how the results would change when you train a network to recognize the backpack from various angles

Kartik Mohta

It was cool to see an implementation of voronoi tesselation in your project. I would like to see how this approach compares to others for such an application.

Menglong Zhu

Enabling the PR2 to read is very cool and practical. Awesome vids. I am curious to learn more about the various algorithms used in OCR and the details of your implementation.

Monica Lui

Your project has useful applications. I want to learn more about your implementation.

Shuai Li

Cool project. A demo/simulation would be great. Looking forward to final report.

Steve McGill

Very cool project! Getting robots to perform synchronized tasks such as lifting a table together is very useful and extensible to other areas of robotics. It would be interesting to see the robots handle objects of different shapes, weights and configurations.

Yida Zhang

Cool demo. Introducing the dynamic aspect through the use of busy man was not only funny but also practical. Looking forward to the final report/vidz.