File splitting and compression¶
As explained in the gateway abstraction section, DataSync can use the file system to store the replicated data in files and copy them to other sites using any tool.
Keep in mind that these files can be large, which increases the time needed to transport them. For this reason, DataSync prepares them for transport, and restores them on arrival, with the following operations:
Compression¶
Reduces the size of the data to transport by compressing the files into ZIP archives with the "DSPackage" extension.
Decompression¶
The reverse process: a "DSPackage" file is decompressed to obtain the original files, which can then be used in an import process.
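The compression and decompression round trip can be sketched with Python's standard `zipfile` module. This is only an illustration of the idea, not DataSync's implementation: the function names `compress` and `decompress`, the file name `export.dat`, and the exact spelling of the `.DSPackage` suffix are assumptions made for the example.

```python
import tempfile
import zipfile
from pathlib import Path


def compress(files, package_path):
    """Pack the given files into one ZIP archive stored at package_path."""
    with zipfile.ZipFile(package_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            # Store only the file name so the archive has no absolute paths.
            zf.write(f, arcname=Path(f).name)
    return Path(package_path)


def decompress(package_path, target_dir):
    """Extract the archive into target_dir and return the extracted paths."""
    with zipfile.ZipFile(package_path) as zf:
        zf.extractall(target_dir)
        return [Path(target_dir) / name for name in zf.namelist()]


def demo():
    """Round-trip a repetitive file and report (original size, packed size, identical?)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "export.dat"  # hypothetical replicated-data file
        src.write_bytes(b"replicated row\n" * 10_000)
        pkg = compress([src], Path(tmp) / "export.DSPackage")  # assumed suffix
        restored = decompress(pkg, Path(tmp) / "restored")
        same = restored[0].read_bytes() == src.read_bytes()
        return src.stat().st_size, pkg.stat().st_size, same
```

Calling `demo()` returns the original size, the package size, and whether the restored file matches the source byte for byte. Repetitive data such as the sample above compresses well; data that is already compressed will shrink much less.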
Splitting¶
Divides large files into smaller "DSPart" files, along with a "DSIndex" file that holds metadata about all of the parts. The size of each part can be configured in the gateway options.
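A minimal sketch of the splitting step follows. The actual layout of a "DSIndex" file is not described here, so the JSON structure, the part naming scheme, and the `split_file` function are all hypothetical; `part_size` stands in for the size configured in the gateway options.

```python
import json
import tempfile
from pathlib import Path


def split_file(source, part_size):
    """Split source into fixed-size parts and write an index describing them.

    The index format (JSON with a list of part names and sizes) is an
    assumption for this example, not the real DSIndex format.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    source = Path(source)
    parts = []
    with source.open("rb") as fh:
        number = 0
        # Read the file one part at a time so large files never sit fully in memory.
        while chunk := fh.read(part_size):
            part = source.with_name(f"{source.name}.{number:04d}.DSPart")
            part.write_bytes(chunk)
            parts.append({"name": part.name, "size": len(chunk)})
            number += 1
    index = source.with_name(f"{source.name}.DSIndex")
    index.write_text(json.dumps({"original": source.name, "parts": parts}))
    return index
```

For a 2500-byte file and a `part_size` of 1000, this writes three parts of 1000, 1000, and 500 bytes, plus one index file listing them in order.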
Combining¶
Reads the "DSIndex" to validate that all the required "DSPart" files are present, then combines them into the original file.
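The combining step can be sketched in the same spirit. The example assumes a hypothetical JSON index that lists the part file names in order; the real "DSIndex" format may differ, and `combine` is an illustrative name rather than a DataSync API.

```python
import json
import tempfile
from pathlib import Path


def combine(index_path, target):
    """Rebuild the original file from the parts listed in the index.

    Assumes a JSON index of the form {"parts": [{"name": ...}, ...]},
    with the parts stored next to the index file.
    """
    index_path = Path(index_path)
    index = json.loads(index_path.read_text())
    parts = [index_path.parent / entry["name"] for entry in index["parts"]]

    # Validate first: refuse to produce a partial file if any part is absent.
    missing = [p.name for p in parts if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing parts: {', '.join(missing)}")

    # Concatenate the parts in index order.
    with Path(target).open("wb") as out:
        for part in parts:
            out.write(part.read_bytes())
    return Path(target)
```

Checking for every part before writing anything means a missing part is reported up front, instead of leaving a truncated file at the target path.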
Integrity check¶
This process checks whether the files have been damaged. To do so, DataSync calculates a CRC32 code from the bytes of the file. This code is generated during the splitting process and written into the file it belongs to.
Then, during the combining process, the code is calculated again from the file and compared against the one written in it. If the two codes do not match, the file was altered or damaged.
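The check can be illustrated with `zlib.crc32` from the Python standard library. Where DataSync stores the code inside the file is not specified here, so this sketch assumes a 4-byte big-endian trailer appended to the payload; `seal` and `verify` are hypothetical names.

```python
import struct
import zlib


def seal(payload: bytes) -> bytes:
    """Append the CRC32 of payload as a 4-byte big-endian trailer (assumed layout)."""
    return payload + struct.pack(">I", zlib.crc32(payload))


def verify(sealed: bytes) -> bytes:
    """Recompute the CRC32 and return the payload, or raise if it does not match."""
    if len(sealed) < 4:
        raise ValueError("data too short to contain a CRC32 trailer")
    payload, trailer = sealed[:-4], sealed[-4:]
    (stored,) = struct.unpack(">I", trailer)
    if zlib.crc32(payload) != stored:
        raise ValueError("CRC32 mismatch: data was altered or damaged")
    return payload
```

`verify(seal(data))` returns `data` unchanged, while changing any single byte of the sealed data makes `verify` raise. CRC32 is designed to catch accidental corruption; it is not a defense against deliberate tampering, because a matching code is easy to recompute.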