Visual perception

One of Bosch's key activities is the development of self-driving cars. Our test cars are equipped with many sensors, such as radars, lidars and cameras. Your task is to process the output of a built-in camera and calculate the distance between the camera and the perceived objects.

You will be given a couple of raw images from the camera. In these images you have to estimate the distance between the camera and the objects. You may select the objects manually in the image by any method you prefer; you do not need to detect them automatically.


We collected some facts that might be relevant for you:

  • You can use the pinhole camera model.

  • The ground is totally flat and horizontal.

  • The intrinsic camera calibration parameters of the camera are:

    • focal length (in pixels): FocalLengthX, FocalLengthY

    • principal point offset (in pixels): PrincipalPointX, PrincipalPointY

    • axis skew: Skew

  • The extrinsic camera calibration parameters of the camera are:

    • Yaw (Z), Pitch (Y), Roll (X) (in radians): Yaw, Pitch, Roll

    • Height (Z) (in meters): Height

  • The extrinsic camera rotation parameters and camera height are given in the World coordinate system as seen in the images.

  • The average widths of different vehicles are the following:

    • motorcycle: 0.8m

    • car: 1.8m

    • van / small truck: 2.0m

    • bus / full-sized truck: 2.4m
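One way the average widths could be used is the similar-triangles relation of the pinhole model: an object of real width W that spans w pixels in the image lies at a depth of roughly FocalLengthX * W / w. The sketch below illustrates this under several assumptions that are not stated in the task: the vehicle is seen from directly behind or in front, the skew is zero, and the left and right edges were picked manually. The names `AVERAGE_WIDTH_M` and `distance_from_width` are hypothetical.

```python
# Average vehicle widths in meters, copied from the list above.
AVERAGE_WIDTH_M = {
    "motorcycle": 0.8,
    "car": 1.8,
    "van": 2.0,   # van / small truck
    "bus": 2.4,   # bus / full-sized truck
}


def distance_from_width(vehicle_class, left_px, right_px, focal_length_x):
    """Estimate the depth (meters) of a vehicle from its width in pixels.

    Assumes a pinhole camera with zero skew and a vehicle whose visible
    width is roughly perpendicular to the optical axis.
    """
    width_px = abs(right_px - left_px)
    if width_px == 0:
        raise ValueError("left and right edge must differ")
    # Similar triangles: width_px / focal_length_x = real_width / depth
    return focal_length_x * AVERAGE_WIDTH_M[vehicle_class] / width_px


# Example: a car whose edges were clicked at columns 600 and 700.
distance = distance_from_width("car", 600.0, 700.0, 1373.67)
```

With these example numbers the estimate comes out at roughly 24.7 m. The result is only as good as the assumed average width, so it is best treated as a rough estimate.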

You can find the images here in .pgm format. The camera parameters are listed in the following table.

Yaw               0.009196    -0.01181
Pitch            -0.001926    -0.019066
Roll             -0.00047      0.008226
Height            1.209        1.209
FocalLengthX      1373.67      1385.29
FocalLengthY      1386.62      1398.29
PrincipalPointX   642.062      634.743
PrincipalPointY   361.836      360.52
Skew              0            0
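Because the ground is flat and the camera height is known, a second possible method is to pick the pixel where an object touches the ground and intersect its viewing ray with the ground plane. The sketch below is a simplified version that uses only the image row. It assumes that yaw and roll are small enough to ignore (both are below 0.02 rad in the table) and that a positive Pitch tilts the camera downward; the sign convention depends on the world coordinate system shown in the images, so it should be checked. The function name `ground_distance` is hypothetical.

```python
import math


def ground_distance(v_px, focal_length_y, principal_point_y, pitch, height):
    """Estimate the forward distance (meters) to a ground contact point.

    v_px is the image row of the point where the object touches the road.
    Assumption: positive pitch means the camera looks downward.
    Yaw and roll are ignored in this simplified version.
    """
    # Angle of the viewing ray below the optical axis (rows grow downward).
    ray_angle = math.atan2(v_px - principal_point_y, focal_length_y)
    # Angle of the viewing ray below the horizontal plane.
    depression = ray_angle + pitch
    if depression <= 0:
        raise ValueError("pixel is at or above the horizon")
    # Right triangle formed by camera height and distance along the ground.
    return height / math.tan(depression)


# Example with the first value column of the table and a hand-picked row.
distance = ground_distance(461.836, 1386.62, 361.836, -0.001926, 1.209)
```

Rows close to the horizon give very large and very noisy distances, because a one-pixel change there corresponds to many meters on the ground.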


  • Distance evaluation with one method: 4 points
  • Distance evaluation with one more method (no aggregation needed): 1 point
  • Write functions for geometric transformations that convert image/pixel coordinates (2D) to world coordinate vectors representing directions (3D), and verify them with unit tests: 5 points
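The pixel-to-direction conversion could be approached along the following lines. This is a minimal sketch, not a reference solution, and it rests on assumptions the task text does not confirm: world axes X forward, Y left, Z up; camera axes x right, y down, z forward; rotations applied in yaw-pitch-roll (Z-Y-X) order. All function names are hypothetical, and the tests are plain assert-style functions that a runner such as pytest would collect.

```python
import math


def pixel_to_camera_ray(u, v, fx, fy, cx, cy, skew=0.0):
    """Invert the intrinsic matrix: pixel (u, v) -> ray (x, y, 1) in camera axes."""
    y = (v - cy) / fy
    x = (u - cx - skew * y) / fx
    return (x, y, 1.0)


def rotation_matrix(yaw, pitch, roll):
    """Rz(yaw) * Ry(pitch) * Rx(roll), assuming Z-Y-X rotation order."""
    cz, sz = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return (
        (cz * cp, cz * sp * sr - sz * cr, cz * sp * cr + sz * sr),
        (sz * cp, sz * sp * sr + cz * cr, sz * sp * cr - cz * sr),
        (-sp, cp * sr, cp * cr),
    )


def pixel_to_world_direction(u, v, fx, fy, cx, cy, skew, yaw, pitch, roll):
    """Unit direction vector in world axes for the ray through pixel (u, v)."""
    x, y, z = pixel_to_camera_ray(u, v, fx, fy, cx, cy, skew)
    # Assumed axis swap: camera z -> world X, camera -x -> world Y, camera -y -> world Z.
    body = (z, -x, -y)
    rot = rotation_matrix(yaw, pitch, roll)
    world = [sum(rot[i][j] * body[j] for j in range(3)) for i in range(3)]
    norm = math.sqrt(sum(c * c for c in world))
    return tuple(c / norm for c in world)


def _close(a, b, tol=1e-9):
    return all(abs(p - q) < tol for p, q in zip(a, b))


FX, FY, CX, CY = 1373.67, 1386.62, 642.062, 361.836  # first table column


def test_principal_point_looks_forward():
    d = pixel_to_world_direction(CX, CY, FX, FY, CX, CY, 0.0, 0.0, 0.0, 0.0)
    assert _close(d, (1.0, 0.0, 0.0))


def test_pixel_right_of_center_points_right():
    d = pixel_to_world_direction(CX + FX, CY, FX, FY, CX, CY, 0.0, 0.0, 0.0, 0.0)
    s = 1.0 / math.sqrt(2.0)
    assert _close(d, (s, -s, 0.0))


def test_pixel_below_center_points_down():
    d = pixel_to_world_direction(CX, CY + FY, FX, FY, CX, CY, 0.0, 0.0, 0.0, 0.0)
    assert d[2] < 0.0


def test_quarter_turn_yaw_points_left():
    d = pixel_to_world_direction(CX, CY, FX, FY, CX, CY, 0.0, math.pi / 2, 0.0, 0.0)
    assert _close(d, (0.0, 1.0, 0.0))
```

Under the same assumptions, a direction with a negative Z component can be scaled by Height / (-Z) to reach the flat ground plane, which links these functions back to the distance evaluation.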


To present your solution, find the Bosch mentors.