Professional users who need software that can handle very large amounts of data and reconstruct city-scale projects in high quality can come to us for help. Altizure provides this guideline as a resource for preparing large projects to be reconstructed on altizure.com. Please read through the guideline to learn how to prepare the input data for large-scale projects.
1. Who should read this article?
- Professional users who will run a project with over 20,000 images.
- Suppliers of oblique photography capture devices and professional mapping drones. These users should read this article carefully and follow the rules. You can then send your project data directly to Altizure, and we will make sure the project is processed efficiently.
2. Necessary Input Images
The input images must have different file names.
- The file name extension is not case sensitive, so file names that differ only in the case of the extension (for example, variants of DSC0000.JPG) are regarded as the same file by Altizure.
- There should be at least 3 valid image files.
- We support JPG, PNG and TIFF, but we strongly suggest using JPG files whose exif records the camera information.
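The naming rules above can be checked before upload. The following is a minimal sketch, not Altizure's own validation: the function names are hypothetical, and it assumes that only the extension is compared case-insensitively while the rest of the name stays case sensitive.

```python
import os
from collections import defaultdict


def find_name_collisions(file_names):
    """Return groups of names that become identical once the extension is lower-cased.

    Assumption: only the extension is case-insensitive; the stem is compared as-is.
    """
    groups = defaultdict(list)
    for name in file_names:
        stem, ext = os.path.splitext(name)
        groups[stem + ext.lower()].append(name)
    return [names for names in groups.values() if len(names) > 1]


def check_image_set(file_names):
    """Collect human-readable problems for a list of input image names."""
    problems = []
    if len(file_names) < 3:
        problems.append("fewer than 3 image files")
    for names in find_name_collisions(file_names):
        problems.append("names regarded as the same file: " + ", ".join(names))
    return problems
```

An empty list from `check_image_set` means the names pass these two checks; it says nothing about whether the files themselves are valid images.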
3. Optional Input Files
pose.txt is the file containing the camera extrinsic parameters (camera positions and rotations). This file is only beneficial and necessary for large-scale datasets. The camera pose information is only used to accelerate the initial block splitting of the dataset.
camera.txt is the file containing the camera intrinsic parameters (focal length, principal point and distortion).
group.txt is the file containing the camera groups (cameras in the same group share the same intrinsic parameters).
mask.txt is the file specifying a list of mask images and their corresponding original images.
A set of jpg or png files is used as masks in the 3D reconstruction. In each mask image, the whiter a pixel is, the more important it is.
Attention: If possible, we strongly suggest writing the camera extrinsic and intrinsic parameters into the exif of each image.
3.1 Format of pose.txt
File Name: pose.txt
coordinatesystem local
<image name> <GPS> <Pose>
<GPS> = <Latitude> <Longitude> <Altitude>
<Pose> = <Roll> <Pitch> <Yaw>
The first line, coordinatesystem local, means that the coordinate system of the camera poses is defined by the user rather than being the GPS coordinate system (WGS84). If the provided camera poses are in the GPS coordinate system (WGS84), simply omit this line.
<image name> is the file name of the image. Please make sure that different images have different file names. If a name contains spaces, enclose it in double quotes, like
"An image name.jpg" <GPS>
<GPS> is the latitude, longitude and altitude of the camera, and is required in this file.
<Pose> is the optional camera direction (roll, pitch, yaw) in degrees, and can be omitted from this file.
For example, the following three pose.txt files are valid:
File 1:
A0001.JPG 23.160948060 113.4292872428 161.6660766602 95.63634782 23.47147433 17.40054218
File 2:
A0001.JPG 23.160948060 113.4292872428 161.6660766602
File 3:
coordinatesystem local
A0001.JPG 1.0 1.0 1.0
A0001.JPG is the image file name.
23.160948060 113.4292872428 161.6660766602 are the GPS latitude, longitude and altitude.
95.63634782 23.47147433 17.40054218 is the camera direction (roll, pitch, yaw). The third file uses a local coordinate system defined by the user.
The two files below are invalid:
File 1:
A0001.JPG 0.9563634782 0.2347147433 0.1740054218
File 2:
A0001.JPG 23.1609480602 113.4292872428 161.6660766602
A0001.JPG 23.1604395330 113.4293176755 161.5956573486
The first file only has the camera direction without the GPS coordinate. The second file includes two images with the same file name.
For the definition of the camera direction, please refer to https://sidvind.com/wiki/Yaw,_pitch,_roll_camera
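To make the format rules of this section concrete, here is a minimal parsing sketch in Python. It is an illustration under stated assumptions, not the parser Altizure runs: the function name and the returned structure are hypothetical, and quoted names are handled with the standard `shlex` module. Note that a line with exactly three numbers is always read as a position, so a file that lists only roll, pitch and yaw cannot be detected as invalid by syntax alone.

```python
import shlex


def parse_pose_txt(text):
    """Parse the content of a pose.txt file.

    Returns (is_local, poses), where poses maps an image name to a dict with
    a 'position' tuple and a 'rotation' tuple (or None when omitted).
    Raises ValueError on malformed lines or duplicate image names.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    is_local = False
    # Optional header: user-defined coordinates instead of WGS84.
    if lines and lines[0].split() == ["coordinatesystem", "local"]:
        is_local = True
        lines = lines[1:]

    poses = {}
    for ln in lines:
        fields = shlex.split(ln)  # keeps "An image name.jpg" as one field
        name, values = fields[0], fields[1:]
        if len(values) not in (3, 6):
            raise ValueError("expected 3 or 6 numbers after the name: " + ln)
        numbers = [float(v) for v in values]  # float() raises ValueError on junk
        if name in poses:
            raise ValueError("duplicate image name: " + name)
        poses[name] = {
            "position": tuple(numbers[:3]),
            "rotation": tuple(numbers[3:]) or None,
        }
    return is_local, poses
```

When `is_local` is false, the three position values are latitude, longitude and altitude; otherwise they are coordinates in the user-defined system.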
3.2 Format of camera.txt
File name: camera.txt
Each line is either
<image name> <focal length>
or
<image name> <focal length> <image center>
where
<image center> = <X0> <Y0>
<image name> is the image file name. Please make sure that different images have different file names.
<focal length> is the focal length of the camera in pixels.
<image center> is the principal point of the camera in pixels. The origin of the coordinates is the top-left corner of the image, with the x axis pointing right and the y axis pointing down.
Below is an example file for images with resolution 5616*3744:
A0001.JPG 6845.25 2806.245 1976.354
A0002.JPG 6845.25 2806.245 1976.354
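A camera.txt file can be read along the same lines. The sketch below is only an illustration: the function names are hypothetical, and it assumes that names with spaces are quoted in the same way as in pose.txt, which this section does not state. The second helper is an optional sanity check, also an assumption, that the principal point lies inside the image.

```python
import shlex


def parse_camera_txt(text):
    """Map each image name to its focal length and optional principal point (pixels)."""
    cameras = {}
    for line in text.splitlines():
        fields = shlex.split(line)
        if not fields:
            continue  # skip blank lines
        name, values = fields[0], [float(v) for v in fields[1:]]
        if len(values) not in (1, 3):
            raise ValueError("expected <focal length> [<X0> <Y0>]: " + line)
        if name in cameras:
            raise ValueError("duplicate image name: " + name)
        cameras[name] = {
            "focal_px": values[0],
            "center_px": tuple(values[1:]) or None,
        }
    return cameras


def center_inside_image(center_px, width, height):
    """Check that (X0, Y0) falls inside an image of the given size.

    The origin is the top-left corner, x points right and y points down.
    """
    x0, y0 = center_px
    return 0 <= x0 <= width and 0 <= y0 <= height
```

For the example file above, the principal point (2806.245, 1976.354) lies close to the middle of a 5616*3744 image, which is what one would usually expect.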
3.3 Format of group.txt
File name: group.txt
<image name> <group ID>
<group ID> is the index of the group an image belongs to, which is an integer no less than 0. Images in the same group are captured by the same camera (and so have the same intrinsic parameters). Images captured by different cameras of the same model should be placed in different groups. If group.txt is uploaded, the group of every input image must be listed in group.txt. Otherwise, the 3D reconstruction of Altizure will fail.
This example shows six images divided into three groups:
A0001.JPG 1
A0002.JPG 1
B0001.JPG 2
B0002.JPG 2
C0001.JPG 3
C0002.JPG 3
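Because a reconstruction fails when an input image is missing from group.txt, it can be worth checking coverage before upload. The following is a minimal sketch with hypothetical function names; the use of `shlex` for quoted names is an assumption carried over from pose.txt.

```python
import shlex


def parse_group_txt(text):
    """Map each image name to its integer group ID (must be >= 0)."""
    groups = {}
    for line in text.splitlines():
        fields = shlex.split(line)
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError("expected <image name> <group ID>: " + line)
        name, group_id = fields[0], int(fields[1])  # int() rejects values like 1.5
        if group_id < 0:
            raise ValueError("group ID must be no less than 0: " + line)
        groups[name] = group_id
    return groups


def missing_from_groups(image_names, groups):
    """Return the input images that group.txt does not list, sorted by name."""
    return sorted(set(image_names) - set(groups))
```

An empty result from `missing_from_groups` means every input image has a group assigned.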
3.4 Format of mask.txt
File name: mask.txt
<image name> <mask name with extension>
<mask name with extension> is the file name of the mask image, including its extension. It is case sensitive and must exactly match the file name of the mask image. Any image in the input folder that does not match a mask name in mask.txt will be used as an input image. The mask image should be in jpg or png format. A pixel value of 0 marks a pixel as useless; any other value marks it as useful. We suggest that the mask has exactly the same dimensions as the original image. Otherwise, the mask is resized to match the original image dimensions.
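The rule that decides which files in the input folder count as masks can be sketched as follows. This is an illustration only: the function names and the mask file names in the usage note are hypothetical, and the comparison is an exact, case-sensitive string match as described above.

```python
import shlex


def parse_mask_txt(text):
    """Map each original image name to its mask file name."""
    mask_of = {}
    for line in text.splitlines():
        fields = shlex.split(line)
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError("expected <image name> <mask name with extension>: " + line)
        mask_of[fields[0]] = fields[1]
    return mask_of


def split_inputs_and_masks(folder_files, mask_of):
    """Split folder contents into input images and mask images.

    A file is a mask only if its name exactly matches a mask name in mask.txt;
    every other file is treated as an input image.
    """
    mask_names = set(mask_of.values())
    inputs = [f for f in folder_files if f not in mask_names]
    masks = [f for f in folder_files if f in mask_names]
    return inputs, masks
```

With a mask.txt line `A0001.JPG A0001_mask.png`, a file named `a0001_MASK.png` would not match and would therefore be used as an input image, which is usually not what was intended.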
4. Supported EXIF/XMP
As recommended above, if all the camera information is written into the exif/xmp data of the jpg images, Altizure can detect and extract all parameters from it.
- Exif Tag
- Altizure Specific XMP Tag
- altizure:CalibratedFocalLength (double in pixels)
- altizure:CalibratedOpticalCenterX (double in pixels)
- altizure:CalibratedOpticalCenterY (double in pixels)
- altizure:CalibratedRadialDistortionK1 (ref to opencv radial distortion)
- Exif Tag
GPSAltitude is the absolute altitude above sea level in meters, which is typically measured by a barometer and can be quite inaccurate in practical use.
GPSRelativeAltitude is the altitude relative to the point where a UAV takes off. The measurement is generally much more accurate than GPSAltitude, since vehicles like DJI UAVs usually fuse different sensors, such as an IMU and a visual tracking system, to estimate it. Therefore, we use this as the altitude measurement in our pipeline. The absolute altitude of the take-off point needs to be added to the altitude of the output model to obtain absolute altitude measurements for the model.
- DJI Specific XMP Tag:
According to the documentation of DJI Phantom4 RTK, RtkFlag marks the type of RTK solution: 16 means point positioning (of around 3cm accuracy), 34 means float solution (of 10~40 cm accuracy) and 50 means fixed solution (of around 5cm accuracy). We also found that M210 RTK V2 may have RtkFlag=1. For now, we take the GPS values as accurate RTK measurements if RtkFlag=1, 16 or 50.
RtkStdHgt are the standard deviations of the GPS measurement.
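The RtkFlag rule stated above amounts to a small lookup. The sketch below only restates that rule in Python; the constant and function names are hypothetical, and how the flag value is read from the XMP data is left out.

```python
# RtkFlag values that are treated as accurate RTK measurements, as listed above.
ACCURATE_RTK_FLAGS = {1, 16, 50}


def is_accurate_rtk(rtk_flag):
    """Return True if the GPS values should be treated as accurate RTK measurements.

    rtk_flag is the integer RtkFlag value, or None when the tag is absent.
    """
    return rtk_flag in ACCURATE_RTK_FLAGS
```

A float solution (RtkFlag=34) and a missing tag both yield `False` here.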
- Altizure Specific XMP tag:
altizure:LocalX (double in meters)
altizure:LocalY (double in meters)
altizure:LocalZ (double in meters)
altizure:StdX (double in meters)
altizure:StdY (double in meters)
altizure:StdZ (double in meters)
altizure:GravityX (double in meters/second^2)
altizure:GravityY (double in meters/second^2)
altizure:GravityZ (double in meters/second^2)
LocalX|Y|Z specify the spatial coordinates of an image in a local right-handed (RHS) Cartesian coordinate system. An SRS (spatial reference system) can be used to define more specifically how this local Cartesian coordinate system relates to the globe.
StdX|Y|Z are the standard deviations of the estimates of LocalX|Y|Z. If they are provided and are small enough, e.g. less than 0.05 but greater than 0, our engine will use LocalX|Y|Z to constrain the 3D reconstruction. Otherwise, only a similarity transformation is estimated to align the up-to-scale 3D reconstruction to the local coordinate system.
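The condition on the standard deviations can be written as a small predicate. This is a sketch under assumptions: the function name is hypothetical, the 0.05 default is the example value from the text rather than a confirmed engine threshold, and the sketch requires all three components to pass.

```python
def use_local_as_constraint(std_xyz, max_std=0.05):
    """Decide whether the local coordinates would act as hard constraints.

    std_xyz is a (StdX, StdY, StdZ) tuple in meters, or None when not provided.
    Each value must be greater than 0 and less than max_std.
    """
    if std_xyz is None or len(std_xyz) != 3:
        return False
    return all(0 < s < max_std for s in std_xyz)
```

When the predicate is false, the text above implies that the coordinates are still used, but only to estimate a similarity transformation.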
GravityX|Y|Z can help our engine align the reconstructed model to the gravity direction, but cannot help recover the absolute scale of the reconstructed 3D model. Such a gravity direction can usually be obtained from an IMU or from visual-inertial algorithms. Multiple measurements of GravityX|Y|Z from the xmp information of a few images will be averaged to get a dominant gravity direction. If LocalX|Y|Z and GravityX|Y|Z are provided at the same time, GravityX|Y|Z will be ignored, because LocalX|Y|Z can provide both the gravity (up) direction and the absolute scale.
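One plausible reading of "averaged to get a dominant gravity direction" is a component-wise mean followed by normalization. The sketch below implements that reading; it is an assumption, since the exact averaging used by the engine is not described here, and the function name is hypothetical.

```python
import math


def dominant_gravity_direction(measurements):
    """Average (GravityX, GravityY, GravityZ) tuples and return a unit vector.

    Only the direction is kept: as noted above, gravity cannot recover the
    absolute scale of the model.
    """
    count = len(measurements)
    if count == 0:
        raise ValueError("no gravity measurements given")
    mean = [sum(m[i] for m in measurements) / count for i in range(3)]
    norm = math.sqrt(sum(c * c for c in mean))
    if norm == 0:
        raise ValueError("gravity measurements cancel out")
    return tuple(c / norm for c in mean)
```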
Extrinsic is the extrinsic matrix of each image, given as a string encoding the 4x4 matrix in row-major order. For example, "1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1" is a valid Extrinsic input. Note that this string always ends with "0 0 0 1" or something similar (double-precision floating-point numbers). Any string with fewer than 16 numbers will be considered an invalid extrinsic matrix.
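Decoding such a string into a matrix is straightforward. The following minimal sketch uses a hypothetical function name; rejecting strings with more than 16 numbers is an extra assumption of the sketch, since the text above only says that fewer than 16 is invalid.

```python
def parse_extrinsic(value):
    """Decode a row-major Extrinsic string into a 4x4 list of rows."""
    numbers = [float(tok) for tok in value.split()]
    if len(numbers) != 16:
        raise ValueError("expected 16 numbers, got %d" % len(numbers))
    # Row-major: the first four numbers form the first row, and so on.
    return [numbers[row * 4:(row + 1) * 4] for row in range(4)]
```

With this layout, the last row of the result corresponds to the trailing "0 0 0 1" of the string.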
We have four ways of specifying the camera pose (extrinsics) using either GPS information or user-generated pose data, namely
LocalX|Y|Z. Typically only one kind of pose information is given by users. In some cases, however, if more than one field is provided and valid at the same time, the priority for using them is
Fill in this camera meta information as completely as possible. It is very helpful for our engine when dividing the input images into camera groups, each of which shares the same set of camera intrinsics in the 3D reconstruction. Our engine will automatically create a new group for an image if any of these fields,
Lens and focal length, are different. The focal length is an estimated focal length in pixels from
drone-dji:CalibratedFocalLength, or from