[Gao Xiang] Fourteen Lectures on Visual SLAM (視覺SLAM十四講)


I ran the experiment again using a public dataset, and it succeeded.
Dataset URL: https://vision.in.tum.de/data/datasets/rgbd-dataset/download#freiburg1_desk
The data I used is part of the sequence described below:
Sequence 'freiburg1_desk'
This sequence contains several sweeps over four desks in a typical office environment (there is also a second sequence available called desk2 from the same four desks).
These are the parameters:
Color images and depth maps
We provide the time-stamped color and depth images as a gzipped tar file (TGZ).
The color images are stored as 640x480 8-bit RGB images in PNG format.
The depth maps are stored as 640x480 16-bit monochrome images in PNG format.
The color and depth images are already pre-registered using the OpenNI driver from PrimeSense, i.e., the pixels in the color and depth images correspond already 1:1.
The depth images are scaled by a factor of 5000, i.e., a pixel value of 5000 in the depth image corresponds to a distance of 1 meter from the camera, 10000 to a distance of 2 meters, etc. A pixel value of 0 means missing value/no data.
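As a quick sanity check, the scaling can be applied directly to a loaded depth map. A minimal sketch (the file name is a placeholder for any 16-bit depth PNG from the sequence):

import numpy as np
from PIL import Image

depth_raw = np.asarray(Image.open('depth.png'))  # placeholder path, 16-bit PNG
depth_m = depth_raw / 5000.0  # pixel value 5000 -> 1 meter
valid = depth_raw > 0         # pixel value 0 -> missing value / no data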
Ground-truth trajectories
We provide the groundtruth trajectory as a text file containing the translation and orientation of the camera in a fixed coordinate frame. Note that our automatic evaluation tool also expects both the groundtruth and the estimated trajectory to be in this format.
Each line in the text file contains a single pose.
The format of each line is 'timestamp tx ty tz qx qy qz qw'
timestamp (float) gives the number of seconds since the Unix epoch.
tx ty tz (3 floats) give the position of the optical center of the color camera with respect to the world origin as defined by the motion capture system.
qx qy qz qw (4 floats) give the orientation of the optical center of the color camera in form of a unit quaternion with respect to the world origin as defined by the motion capture system.
The file may contain comments that have to start with "#".
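Since each line encodes a full camera pose, a short parser makes the format concrete. The sketch below is my own illustration (load_trajectory is a hypothetical helper, not part of the TUM tools); it skips '#' comments and converts each unit quaternion into a 4x4 homogeneous transform:

import numpy as np

def load_trajectory(path):
    # Hypothetical helper: parse a 'timestamp tx ty tz qx qy qz qw' file
    # into a list of (timestamp, 4x4 homogeneous pose) pairs.
    poses = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue  # skip comment lines and blanks
            t, tx, ty, tz, qx, qy, qz, qw = map(float, line.split())
            # unit quaternion (qx, qy, qz, qw) -> 3x3 rotation matrix
            R = np.array([
                [1 - 2*(qy*qy + qz*qz), 2*(qx*qy - qz*qw),     2*(qx*qz + qy*qw)],
                [2*(qx*qy + qz*qw),     1 - 2*(qx*qx + qz*qz), 2*(qy*qz - qx*qw)],
                [2*(qx*qz - qy*qw),     2*(qy*qz + qx*qw),     1 - 2*(qx*qx + qy*qy)]])
            T = np.eye(4)
            T[:3, :3] = R
            T[:3, 3] = [tx, ty, tz]
            poses.append((t, T))
    return poses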
Intrinsic Camera Calibration of the Kinect
The Kinect has a factory calibration stored onboard, based on a high level polynomial warping function. The OpenNI driver uses this calibration for undistorting the images, and for registering the depth images (taken by the IR camera) to the RGB images. Therefore, the depth images in our datasets are reprojected into the frame of the color camera, which means that there is a 1:1 correspondence between pixels in the depth map and the color image.
The conversion from the 2D images to 3D point clouds works as follows. Note that the focal lengths (fx/fy), the optical center (cx/cy), the distortion parameters (d0-d4) and the depth correction factor are different for each camera. The Python code below illustrates how the 3D point can be computed from the pixel coordinates and the depth value:
import numpy as np
from PIL import Image

fx = 525.0  # focal length x
fy = 525.0  # focal length y
cx = 319.5  # optical center x
cy = 239.5  # optical center y
factor = 5000.0  # for the 16-bit PNG files

depth_image = np.asarray(Image.open('depth.png'))  # placeholder path: any depth PNG from the sequence
for v in range(depth_image.shape[0]):
    for u in range(depth_image.shape[1]):
        Z = depth_image[v, u] / factor  # depth in meters
        X = (u - cx) * Z / fx
        Y = (v - cy) * Z / fy
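For whole images, the same back-projection is usually vectorized with NumPy rather than looped pixel by pixel. A sketch under the same intrinsics (depth_to_points is a hypothetical helper of my own, not from the dataset page):

import numpy as np

def depth_to_points(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5, factor=5000.0):
    # Vectorized back-projection of a 16-bit depth map to an Nx3 point array.
    v, u = np.indices(depth.shape)         # per-pixel row/column grids
    Z = depth / factor                     # depth in meters
    X = (u - cx) * Z / fx
    Y = (v - cy) * Z / fy
    points = np.stack([X, Y, Z], axis=-1)  # HxWx3
    return points[depth > 0]              # drop missing-depth pixels (value 0)

The mask depth > 0 also discards the missing-value pixels described above, so the result contains only valid 3D points.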