Monocular Visual Odometry

For a while now I have been looking for ways to use (computer) vision to get odometry information. This is by no means a new concept. Here are some examples (by no means a comprehensive list):

  • NASA used visual odometry on Mars: Two Years of Visual Odometry on the Mars Exploration Rovers (pdf)
  • Optical computer mice employ it and optical mouse sensors have been used in robotics; e.g. Outdoor Downward-facing Optical Flow Odometry with Commodity Sensors (pdf)
  • Insects leverage it. E.g., see the references in Robust Models for Optic Flow Coding in Natural Scenes Inspired by Insect Biology (pdf)

However, these solutions typically require complex and expensive setups. A little more than a month ago I stumbled over a paper by Jason Campbell, Rahul Sukthankar, Illah Nourbakhsh, and Aroon Pahwa explaining how a single regular webcam can be used to achieve robust visual odometry: A Robust Visual Odometry and Precipice Detection System Using Consumer-grade Monocular Vision (pdf). Naturally this got me hooked. To make sure that I fully understood the presented algorithm, I reimplemented it in C# and paired it with a fairly comprehensive WinForms-based UI. Here is a screenshot of the result:

[Image: VisualOdometryScreenshot]

The source code is available from http://code.google.com/p/drh-visual-odometry/ and is covered by the GPL 3.0 license.

More details can be found in the slides (pdf) that I presented during last Saturday’s meeting at the Robotics Society of Southern California.

To facilitate testing the code without an actual setup I uploaded my test video (61MB!).

20 thoughts on “Monocular Visual Odometry”

  1. Dear Dr. Rainer Hessmer,

    I am thinking about porting your program to C and integrating it with ROS: http://www.ros.org/, but I am not quite familiar with OpenCV, the algorithm itself, or the architecture of your program. Can you help by answering my questions when they arise?

    Thanks,

  2. Dear Anh Nguyen,

    This is an exciting plan. I am very aware of the powerful ROS framework but unfortunately have had only enough time to play with the tutorials so far. Back to your plan … Rather than porting my C# source code I suggest you start with the original C code that the authors of the paper http://www.cs.cmu.edu/~rahuls/pub/icra2005-rahuls.pdf published. The link to the source code in the document no longer works but one of the authors pointed me to the new location: http://www.cs.cmu.edu/~vo/vo-0-2.tgz

    I am happy to help out by answering questions.

    Regards,
    Rainer

  3. Thanks a lot for this project!!
    I don’t understand the process for calibrating my own camera.
    Could you explain?
    Thanks again

  4. Hi Dr Rainer!

    I would like some explanations about BirdsEyeView.UI.
    Where should I insert the distance from the chessboard to the camera?
    I would appreciate it if you could write a few words on how to execute this calibration and what indicates that the transformations have been done correctly.

    Thank you very much
    Hananya Segal

  5. Hi Hananya Segal,

    For the bird’s eye view the distance of the nearest edge of the grid from the camera is specified in line 84 of the source file http://code.google.com/p/drh-visual-odometry/source/browse/trunk/src/BirdsEyeView.UI/MainForm.cs:


    bottom = 310.0f;

    You execute the calibration simply by running the BirdsEyeView.UI application after changing the line bottom = … appropriately. The calibration is saved to the file “BirdsEyeViewTransformationForCalculation.txt” by the call:

    HomographyMatrixSupport.Save(m_BirdsEyeViewTransformationForCalculation, "BirdsEyeViewTransformationForCalculation.txt");

    A good indication for a correct projection is that the grid in the projection is rectangular rather than trapezoidal.
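
    The rectangular-versus-trapezoidal check can also be made numerically. The following is only a rough sketch in Python and is not part of the project's code: the function name, the tolerance, and the sample corner coordinates are all hypothetical. It takes the four outer grid corners as they appear in the projected image and tests whether every interior angle is close to 90 degrees.

```python
import math

def is_roughly_rectangular(corners, angle_tolerance_deg=2.0):
    """Return True if all four interior angles are within the tolerance of 90 degrees.

    corners: four (x, y) points listed in order around the quadrilateral.
    The function name and the default tolerance are illustrative assumptions.
    """
    for i in range(4):
        prev_pt = corners[i - 1]
        cur = corners[i]
        next_pt = corners[(i + 1) % 4]
        # Vectors from the current corner to its two neighbours.
        ax, ay = prev_pt[0] - cur[0], prev_pt[1] - cur[1]
        bx, by = next_pt[0] - cur[0], next_pt[1] - cur[1]
        cos_angle = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
        # Clamp to guard against rounding slightly outside [-1, 1].
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
        if abs(angle - 90.0) > angle_tolerance_deg:
            return False
    return True

# Hypothetical corner positions (pixels), listed clockwise from the top left.
projected_grid = [(0.0, 0.0), (150.0, 0.0), (150.0, 200.0), (0.0, 200.0)]
raw_camera_grid = [(250.0, 180.0), (390.0, 180.0), (440.0, 300.0), (200.0, 300.0)]

projection_looks_ok = is_roughly_rectangular(projected_grid)
raw_view_looks_ok = is_roughly_rectangular(raw_camera_grid)
```

    With these made-up points the projected grid passes the check, while the grid as seen in the untransformed camera image (a trapezoid) does not.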

    Regards,
    Rainer

  6. Hi Dr Rainer!

    Thank you very much for your great answer.
    I want to ask one more thing: what is the meaning of the following variables?

    side = 8f;
    bottom = 700.0f;
    centerX = (float)m_CameraParameters.Intrinsic.Cx;

    Thank you very much
    Hananya Segal

  7. Recently I got very similar questions via email. I am sharing them with the corresponding answers here. They cover your question as well:

    1. What is the procedure for creating the BirdsEyeViewTransformationForCalculation/UI.txt files? Are they needed for the odometry calculation or only for “birds eye view” display purposes?
    2. I tried creating these files via the BirdsEyeView.UI project, however there are some constants that I don’t understand in the MainForm.cs file:
    side = 25.0f; – I guess it’s the size in mm of each grid side
    side = 8f; – the number of grid cells? (and if so, why not use the this.ChessBoard.PatternSize.Height variable?)
    -3, +3 (appears multiple times as -3*side, +3*side, etc.) – ?
    bottom = 310.0f – ?
    bottom = 700.0f – ?
    3. How should the chessboard image be taken for this project to work? Must it be flat on the floor? Should it be a certain distance from the camera?
    4. Does the camera height above the ground matter? If so, where should I insert it?

    Answers:

    1) BirdsEyeViewTransformationForUI.txt is only needed for display purposes. The projection is set up so that the left and right sides are symmetrically arranged around the center of the camera image (m_CameraParameters.Intrinsic.Cx). See lines 97 to 103 in http://code.google.com/p/drh-visual-odometry/source/browse/trunk/src/BirdsEyeView.UI/MainForm.cs starting with:

    centerX = (float)m_CameraParameters.Intrinsic.Cx;

    The bottom is selected so that the resulting transformed image displays nicely.

    2) My grid has 8 times 10 squares and hence 7 times 9 inner corners. The ‘side’ variable is the side of one square of the grid. In my case I placed the grid in front of the robot so that the first line of inner corners was 310 mm away from the camera’s projected location on the ground plane.

    side = 25.0f;
    bottom = 310.0f;

    PointF[] physicalPointsForCalculation = new PointF[4];
    physicalPointsForCalculation[0] = new PointF(-3 * side, bottom + 8 * side);
    physicalPointsForCalculation[1] = new PointF(+3 * side, bottom + 8 * side);
    physicalPointsForCalculation[2] = new PointF(-3 * side, bottom);
    physicalPointsForCalculation[3] = new PointF(+3 * side, bottom);

    Essentially you tell the transform logic where the four outermost inner grid points are located in the physical coordinate system. You are right, I should have used this.ChessBoard.PatternSize.Height instead.

    3) To measure the translation of the robot the bird’s eye view is used. For the calibration of this view the camera needs to be mounted on the robot in its final position. The chessboard needs to lie flat on the floor in front of the robot. Ideally the surface of the chessboard should be level with the ground. The distance from the robot should be such that the edge of the chessboard that is closest to the robot is still fully contained in the camera view.

    4) The height of the camera above ground does not matter. It is implicitly calculated by the perspective transform.
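
    To illustrate the idea behind answers 2) and 4), here is a minimal sketch in plain Python of how four point correspondences determine a perspective transform from image pixels to ground-plane coordinates. It is not the project's code: the helper names and the pixel positions of the four grid corners are hypothetical, and only side = 25 mm, bottom = 310 mm, and the four physical points are taken from the snippet above. Note that the camera height never appears; it is absorbed into the eight unknowns of the transform.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def homography_from_4_points(src, dst):
    """Return the 3x3 matrix H (with H[2][2] fixed to 1) that maps src[i] to dst[i]."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_homography(h, point):
    """Map one (x, y) point through H, including the perspective division."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

side = 25.0      # mm, side of one grid square (from the snippet above)
bottom = 310.0   # mm, distance of the nearest row of inner corners

# Physical positions (mm) of the four outermost inner corners, as in the snippet above.
physical_points = [(-3 * side, bottom + 8 * side), (+3 * side, bottom + 8 * side),
                   (-3 * side, bottom), (+3 * side, bottom)]

# Hypothetical pixel positions of the same four corners in the camera image.
image_points = [(250.0, 180.0), (390.0, 180.0), (200.0, 300.0), (440.0, 300.0)]

image_to_ground = homography_from_4_points(image_points, physical_points)

# Any other pixel on the ground plane can now be converted to millimetres.
ground_x, ground_y = apply_homography(image_to_ground, (320.0, 300.0))
```

    In this made-up example the pixel halfway between the two nearest corners lands at roughly x = 0 mm, y = 310 mm, i.e. straight ahead of the camera at the distance given by bottom.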

  8. Hi Dr Rainer!

    I still don’t understand the meaning of the second bottom = 700.0f.
    I understand that the first bottom = 310.0f is the distance from the camera to the chessboard.

    Thank you very much.

    Hananya Segal

  9. The second bottom=700.0f and the following lines set up the perspective transform m_BirdsEyeViewTransformationForUI purely for display purposes. This transform is not used in the odometry calculations.

  10. Hi Rainer,
    I want to do the localization you implemented using a Kinect camera.
    I used the Lucas-Kanade algorithm to find points to track and obtained the distance of those pixels from the Kinect depth data.
    Then I used an affine transform to convert the points to xy coordinates, took three circles, and calculated their intersection point to
    get my current position.
    I know I will get some deviation, but it is a way to get an approximate location.
    Do you think this idea can work?

  11. Dear Rainer,
    I compiled all the projects successfully. But when I execute any *.exe, it doesn’t work:
    the applications start and terminate immediately without errors or warnings (I can’t see any window on screen or any message on the console).
    I’ve also added a basic instruction inside the Main() function of BirdsEyeView.UI/Program.cs:
    System.Console.WriteLine("Starting BirdsEyeView!");
    I’ve rebuilt and restarted BirdsEyeView.UI.exe but it doesn’t show anything…

    Is there an execution order for launching the executables?

  12. Hello dear Dr. Rainer Hessmer,
    I saw your project slides and the project seems useful to me, but unfortunately I can’t download it.
    Please let me know how I can download it.
    Thanks a billion

  13. Hi Dr. Rainer,
    I am trying to implement monocular visual odometry in OpenCV Python. I calculated optical flow using cv2.goodFeaturesToTrack and cv2.calcOpticalFlowPyrLK. I am uncertain what my next step should be. I need to calculate the distance moved (in the real world). How can I calculate that from optical flow? Can you point me in the right direction? Any help would be appreciated. Thank you.

  14. Hi.
    Is it possible to use I/O from 3D FPS game instead of live feed from webcam?
    I want to beat 3D maze in my 2004 game.
    🙂

  15. AOA, Dr. Rainer. Thanks a lot for this project. I am new to OpenCV, so I would be really grateful if you could kindly guide me on how to implement this code in OpenCV.

  16. Dr. Rainer, thanks a lot for this project. I am working on visual odometry and really wanted to try your application, so I downloaded it, but I have some problems building and/or executing it. I was wondering if you could guide me in setting it up properly, or if you have another version of the program that can be downloaded other than the SVN version.

    Thank you very much, greetings.
