Real-Time Obstacle Detection for Humanoid Robot with Single Camera


The goal of this project is to detect obstacles for a humanoid robot using only a single RGB camera. Given the limited computational capacity of the onboard Fit-PC, I first apply a pre-made colormap to label the raw images and then use a connected-component algorithm to segment obstacle patches. For each obstacle in view, pixel counting is applied to determine geometric information, such as the centroid and axes, which is then used for size estimation. Finally, the real-time obstacle detection algorithm is demonstrated on the DARwIn robot.

Introduction and Related Work

Obstacle detection is essential for localization and motion planning on a humanoid robot. In this project, we consider the DARwIn humanoid robot applied to robot soccer. In the RoboCup Humanoid League, three robots form a team and attack the opposing goal, which is defended by three robots from the other team. Pushing is not allowed, so our robot needs to know where the opponents are and find a clear path to reach the ball and shoot at the target.

However, the DARwIn robot has only one RGB camera as its sensing device, and objects in the video stream are distorted by walking motion and perspective projection. In this case, color labeling seems to be the only method for distinguishing obstacles from targets and the field. In this project, we only consider rectangular obstacles of identical size and color.

Obstacle detection, a significant branch of robotics, has been researched for several years using different algorithms [1] [2]. However, this research is based on laser sensors or the Kinect, which are not allowed in the RoboCup Humanoid League. Both algorithms also require a considerable amount of computation time and cannot run online on the DARwIn robot.

In the RoboCup Humanoid League, German teams have dominated the game for several years. Their obstacle detection algorithm [3] is based on grid segmentation of the field of view. After labeling each video frame, they divide the frame into single-color squares and then estimate the size and position of obstacles according to the head angles. Their work is an excellent reference for my project, but it does not consider the situation with crowded obstacles.


1. Video Frames Capture
Since all detection approaches in this project are color-based, a consistent camera input is required before all other steps. Among all camera parameters, exposure and white balance temperature turned out to be the most significant. By observing the average luminance (Y) value, manual exposure and white balance temperature values of 1200 and 2000, respectively, were chosen for the best color recognition.

Figure.1 Typical Project Environment

2. Color Labeling
Every object in this project has a single color. I first took several video logs of all objects and then manually selected every object in the log frames using Matlab. In this step, gloss on the obstacles (pink patch) must be taken into consideration; otherwise, the obstacle cannot be detected completely. In this project, green was assigned to the carpet, blue to obstacles and yellow to the target (goal post). After that, one complete colormap was generated and transferred to the robot for segmentation. This part of the work used the colortable tools already available in the UPennalizers code base.

In the robot vision system, colors were labeled in real time using the locally generated colormap file, as shown in Figure.2. The frame size was reduced from the original 640x480 to 320x240 for processing speed.
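The labeling step can be illustrated with a minimal sketch, assuming the colormap is a lookup table keyed by coarsely quantized RGB values. The label codes, the 4-bit quantization and the function names below are hypothetical choices for illustration and do not reflect the UPennalizers colortable format.

```python
# Hypothetical label codes; the text only states green = carpet,
# blue = obstacle, yellow = target.
UNKNOWN, FIELD, OBSTACLE, TARGET = 0, 1, 2, 3

BITS = 4  # assumed quantization: keep the top 4 bits of each channel


def quantize(r, g, b, bits=BITS):
    """Map an 8-bit RGB triple to a coarse lookup-table key."""
    shift = 8 - bits
    return (r >> shift, g >> shift, b >> shift)


def build_colormap(samples, bits=BITS):
    """Build a lookup table from hand-labeled (r, g, b, label) samples."""
    table = {}
    for r, g, b, label in samples:
        table[quantize(r, g, b, bits)] = label
    return table


def label_frame(frame, table, step=2, bits=BITS):
    """Label every `step`-th pixel of a frame (a list of rows of RGB tuples).

    With step=2, a 640x480 frame is subsampled to 320x240.
    """
    labeled = []
    for row in frame[::step]:
        labeled.append([table.get(quantize(r, g, b, bits), UNKNOWN)
                        for (r, g, b) in row[::step]])
    return labeled
```

Because the table is built offline from hand-selected pixels, the per-frame cost is one dictionary lookup per sampled pixel, which suits a limited onboard computer.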

Figure.2 : RGB Video Frame

*Captured From Modified UPenn Robot Monitor. Size reduced to 80x60 for Monitoring

3. Segmentation
A connected-component algorithm was applied here to segment obstacles and targets from free space. This part of the work was also based on the image processing library available in the code base, but a modified connected-component routine was created to handle multiple segments. To simplify computation, only the 5 largest segments in a frame were considered as obstacles. Since the labeled image contains only a few colors at this stage, the centers of all obstacles could be obtained in a straightforward way.
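A minimal sketch of this step, assuming 4-connectivity and a breadth-first flood fill over the labeled image. The function name and the returned dictionary layout are hypothetical; the modified routine in the code base may differ.

```python
from collections import deque


def largest_segments(labels, target, max_segments=5):
    """Return up to `max_segments` 4-connected regions of `target` pixels.

    `labels` is a list of rows of integer labels. Each region is reported as
    a dict with its pixel count and centroid (row, col), largest first.
    """
    height, width = len(labels), len(labels[0])
    seen = [[False] * width for _ in range(height)]
    segments = []
    for r0 in range(height):
        for c0 in range(width):
            if seen[r0][c0] or labels[r0][c0] != target:
                continue
            # Breadth-first flood fill from this unvisited seed pixel.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            area = sum_r = sum_c = 0
            while queue:
                r, c = queue.popleft()
                area += 1
                sum_r += r
                sum_c += c
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and not seen[nr][nc] and labels[nr][nc] == target):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            segments.append({"area": area,
                             "centroid": (sum_r / area, sum_c / area)})
    # Keep only the largest regions, as described in the text.
    segments.sort(key=lambda s: s["area"], reverse=True)
    return segments[:max_segments]
```

Accumulating the row and column sums during the fill gives each centroid without a second pass over the image.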

4. Color Pixel Counting
For every segment, since the center position was already known, the four vertices were found by counting pixels. Starting from the center of the obstacle segment, we scanned every row and column to find the extreme values on all sides. At the same time, the two diagonal lines could be obtained, along with rotation and distortion information.
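The scan can be sketched as follows, assuming a roughly convex patch and a center pixel that lies inside it. The helper names are hypothetical, and the sketch walks rows outward from the center while tracking the run of obstacle pixels in each row.

```python
import math


def run_through(row, col, target):
    """Return (left, right) of the run of `target` pixels containing `col`."""
    if row[col] != target:
        return None
    left = right = col
    while left > 0 and row[left - 1] == target:
        left -= 1
    while right < len(row) - 1 and row[right + 1] == target:
        right += 1
    return left, right


def find_extremes(labels, center, target):
    """Scan rows outward from `center` and return the extreme pixels.

    Returns a dict of (row, col) points for "top", "bottom", "left" and
    "right", or None if the center pixel is not a `target` pixel.
    """
    r0, c0 = center
    start = run_through(labels[r0], c0, target)
    if start is None:
        return None
    ext = {"top": (r0, c0), "bottom": (r0, c0),
           "left": (r0, start[0]), "right": (r0, start[1])}
    for step, key in ((-1, "top"), (1, "bottom")):
        left, right = start
        r = r0 + step
        while 0 <= r < len(labels):
            # Look for a target pixel directly adjacent to the previous run.
            seed = next((c for c in range(left, right + 1)
                         if labels[r][c] == target), None)
            if seed is None:
                break
            left, right = run_through(labels[r], seed, target)
            ext[key] = (r, (left + right) // 2)
            if left < ext["left"][1]:
                ext["left"] = (r, left)
            if right > ext["right"][1]:
                ext["right"] = (r, right)
            r += step
    return ext


def diagonal_lengths(ext):
    """Lengths of the top-bottom and left-right diagonals in pixels."""
    (tr, tc), (br, bc) = ext["top"], ext["bottom"]
    (lr, lc), (rr, rc) = ext["left"], ext["right"]
    return math.hypot(br - tr, bc - tc), math.hypot(rr - lr, rc - lc)
```

For a rotated patch the four extreme points approximate its vertices; for an axis-aligned patch they fall on the edge midlines, so the bounding box would be the more useful output.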

5. Obstacle Projection
At this point, all obstacle information was expressed in the camera coordinate frame. From simple geometry, what the camera sees is a sector of the actual environment. As the head pitch angle approaches 90 degrees, the sector approximates a cone.

Since the robot geometry and joint angles were known from sensors and previous experiments, obstacles were projected into the robot coordinate frame using simple geometry. The distances from the obstacles to the robot were estimated by comparing the obstacle size in pixels to its actual size. One problem I met here was that the humanoid robot shakes a lot while walking, so the estimated Z value of obstacles would frequently fall below zero. To handle this, the height check for obstacles was modified so that 'flying' obstacles would be ignored.
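A minimal sketch of the size-based range estimate and the projection, assuming an ideal pinhole camera and a single head pitch joint. The focal length, frame convention, camera height and height thresholds are all hypothetical values chosen for illustration, not the parameters used on DARwIn.

```python
import math


def estimate_distance(real_height_m, pixel_height, focal_px):
    """Pinhole estimate of range along the optical axis, in meters."""
    return focal_px * real_height_m / pixel_height


def camera_to_robot(u, v, distance, focal_px, cx, cy, head_pitch, cam_height):
    """Project image point (u, v) at the given range into a robot frame.

    Assumed frame: x forward, y left, z up, origin on the ground below the
    camera. `head_pitch` is in radians, positive when looking down.
    """
    # Point in the camera frame (x forward, y left, z up).
    xc = distance
    yc = -(u - cx) / focal_px * distance
    zc = -(v - cy) / focal_px * distance
    # Rotate about the y axis by the head pitch, then lift by camera height.
    x = xc * math.cos(head_pitch) + zc * math.sin(head_pitch)
    z = -xc * math.sin(head_pitch) + zc * math.cos(head_pitch) + cam_height
    return x, yc, z


def is_plausible_height(z, min_z=-0.05, max_z=0.5):
    """Accept small negative z caused by shaking; reject 'flying' points."""
    return min_z <= z <= max_z
```

The tolerance band in `is_plausible_height` is one possible reading of the modified height check: slightly negative heights are accepted, while points well above the expected obstacle height are dropped.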

6. Free Space Estimation
After Step 5, a general obstacle map was obtained in the robot coordinate frame. However, due to robot shaking and computation error, obstacles in the real-time map overlapped so heavily that no path could be found. In this project, I therefore divided every video frame into three columns and calculated the percentage of free space in each of them. The center column decided whether a path was available ahead. If an obstacle was too close, the robot would stop and look around to find an available path. The left and right columns were used to choose a direction: the robot would always turn to the side that had more free space. Obstacle information was also used to verify the quality of the free space. For instance, a side with small parts of several obstacles spread evenly across it was considered worse than a side with a single large obstacle at the top of the column, even if the former had a larger free-space percentage.
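The three-column rule can be sketched as below, assuming the labeled frame is a list of rows and that a single threshold on the center column decides whether to go forward. The threshold value and the action names are hypothetical, and the obstacle-count quality check described above is left out for brevity.

```python
def free_space_by_column(labels, free_label, columns=3):
    """Return the fraction of `free_label` pixels in each vertical strip."""
    height, width = len(labels), len(labels[0])
    fractions = []
    for i in range(columns):
        start = i * width // columns
        stop = (i + 1) * width // columns
        free = sum(row[c] == free_label
                   for row in labels for c in range(start, stop))
        fractions.append(free / (height * (stop - start)))
    return fractions


def choose_action(fractions, min_center_free=0.6):
    """Pick a coarse action from (left, center, right) free-space fractions."""
    left, center, right = fractions
    if center >= min_center_free:
        return "forward"
    # Center is blocked: turn toward the side with more free space.
    return "turn_left" if left > right else "turn_right"
```

Working on per-column fractions rather than on individual obstacle positions makes the decision less sensitive to frame-to-frame jitter from walking.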

7. Motion Planning and Robot Control
Based on the obstacle, free space and direction information from the previous steps, the robot would always walk toward the target through an available path. The DARwIn humanoid robot is controlled entirely by velocity commands, but its current velocity can hardly be obtained from its sensors. Vision is therefore the only feedback available for closed-loop control. The robot also needs the vision system to locate the current target, since the target is assumed to be able to move at any time. By trial and error, I set the robot to reconfirm the target every 500 iterations, so that it would not get lost in the labyrinth of obstacles while seeking the target.

Result Analysis

MEAM620 Final Project Video Channel
1. Real Time Image Segmentation

2. Failed Situation : Turned back and Walk through Obstacles
In the video, the DARwIn robot first turns back. In this test, I had not set DARwIn to walk toward the target direction. Based on the free space estimation result, DARwIn selected the 'best' way, which led back home.

Near the end of this video, the DARwIn robot walks through obstacles. This was due to a failure in free space estimation. In theory, the Minkowski sum of the obstacles and the robot footprint is needed to avoid collisions. In practice, DARwIn would need to know its own size to do that.
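On an occupancy grid, the Minkowski sum mentioned here amounts to inflating every obstacle cell by the robot footprint. The following is a minimal sketch, assuming a square footprint whose half-width is a hypothetical `radius` in grid cells; it was not part of the demonstrated system.

```python
def inflate_obstacles(grid, radius):
    """Grow each occupied cell by `radius` cells in every direction.

    `grid` is a list of rows of booleans (True = obstacle). The result marks
    every cell where the center of a square robot of half-width `radius`
    would overlap an obstacle.
    """
    height, width = len(grid), len(grid[0])
    inflated = [[False] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if not grid[r][c]:
                continue
            for nr in range(max(0, r - radius), min(height, r + radius + 1)):
                for nc in range(max(0, c - radius), min(width, c + radius + 1)):
                    inflated[nr][nc] = True
    return inflated
```

After inflation, the robot can be treated as a point, so any cell left free is safe for its center to occupy.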

3. Failed Situation : Lost in Labyrinth
This video also shows the consequence of lacking a target. DARwIn could not find an available path for a long time, which I eventually traced to a problem in the direction decision. As mentioned above, a large free-space percentage is not always good.

4. Final Presentation Video :

Future Work

The current obstacle detection is based only on certain color patches; it could be extended to obstacles of any color. For a single-camera robot, general obstacle detection would rely more on free space detection and on filtering out shapes that cannot be targets.

The video also shows that the robot kept moving to seek an available path; as a next step, this could be done by head rotation alone. By looking around, the robot would obtain precise positions of targets and obstacles, which could be used to generate a more stable path and direction.


In this project, I implemented color labeling, connected-component segmentation and pixel counting on a reduced-size video stream to separate obstacles from the background in real time. I then projected obstacle key points into the robot coordinate frame to build obstacle maps, which were used for motion planning along with the direction information from free space estimation. Finally, I successfully applied these approaches on the DARwIn humanoid robot and made it reach the target after walking through changing obstacles.


[1] Marder-Eppstein, E., Berger, E., Foote, T., Gerkey, B.P., Konolige, K.: The office marathon: Robust navigation in an indoor office environment. In: ICRA, IEEE (2010) 300-307
[2] Seekircher, A., Laue, T., Rofer, T.: Entropy-based Active Vision for a Humanoid Soccer Robot. RoboCup International Symposium 2010
[3] Grid-Based Occupancy Mapping and Automatic Gaze Control for Soccer Playing Humanoid Robots, unpublished.

Final Presentation

Anirudha Majumdar

Nice simple demo video! But is the walker control related to the position of the soccer ball? If so, might the walker sometimes lose its regular pace and fall?

Avik De

The 3D configuration space plots are vivid for analysis. I like the fresh look at finding the optimal path. Clear explanation; I like the demo of the manipulation operator, very clear. Very impressive. You also considered a weight lifting constraint. We might implement this on a humanoid robot.

Ben Charrow

Very impressive. Good idea to simplify a big problem into small step-by-step tasks, and an interesting video. I am wondering how the PR2 could figure out in which direction the door should open. Push or pull?

Brian MacAllister

Just relax, you have done a good job!

Caitlin Powers

How about the robustness of this adaptive controller?

Fei Miao

Good example of implementing convex optimization. You fully understand the A* and RRT algorithms. For the slides, the text could have more contrast against the background. It would be better to have some figures to explain the basic ideas, which would be more vivid.

Jason Owens

Interesting topic. But could the matching ideas work on various kinds of backpacks? I like the surface normals of the backpack. Very impressive.

Kartik Motha

A little bit too mathematical. I can hardly understand all the ideas of the project.

Menglong Zhu

I like the theme of the slides, but they need a typo check. Interesting video. How about developing some kind of language or knowledge database for the PR2?

Steve McGill

I like the keyword highlighting, just like vim. It is a really good idea to bring Linus and table to the presentation, very impressive.